ICP & Point Cloud Registration - Part 2: Unknown Data Association (Cyrill Stachniss, 2021)

Captions
So welcome to the second part of the lecture on the iterative closest point algorithm, where we look into the registration of point clouds. Today we address the realistic aspect of the problem, namely unknown data association: we do not know our correspondences precisely. In the first part of the lecture we looked into point cloud registration with given correspondences; now we look at what happens when the correspondences are not given, and this leads us to a point cloud registration method that allows us to align two different point clouds. Let me illustrate that with an example. You have two point clouds of the bunny, and the approach tries to find a transformation so that these point clouds overlap as closely as possible; as you can see, the blue and the red point cloud perfectly overlay each other, and that is the result we want to obtain. We can do this not only for such artificial objects but also for surfaces, for example of landscapes, and find an alignment so that those surfaces overlay. We do this by finding a transformation, typically a rigid body transformation, that tells us in this case how the yellow surface should be rotated and translated so that it best aligns with the gray point cloud. So the goal is to find the parameters of a rigid body transformation that overlays the two point clouds or surfaces, and we can do this with an optimization procedure. This can either be an approach based on the singular value decomposition, which is what the ICP approach uses and what we derived in the first part of the course, or a robust least squares approach performing iterative error minimization, which is also very popular in today's registration techniques. Again, part one dealt with known correspondences; today we look into how we can estimate those correspondences and still get a good registration.

As a one-minute reminder: in the first part of the lecture, for known data association, we derived an algorithm that is efficient to compute, direct (it does not require an initial guess), and optimal (there is no better alignment). That task can be considered solved, since we have all the desired properties, but it only holds for known data association. So assuming we know which point in point cloud number one corresponds to which point in point cloud number two, we can compute this desirable solution. How does it work? As a very short repetition, our formal problem definition said: we have two sets of points, the y_n and the x_n, with N points each, and they correspond to each other, so point 1 in the set Y corresponds to point 1 in the set X. Optionally we also have weights, which allow us to say that one correspondence is more valuable for us than another. We then want to find the rotation matrix and the translation parameters of a rigid body transformation that maps the point set X onto new points x_bar, such that the discrepancy between the transformed point set and the other point set is minimal.
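As a compact reference, the objective described here can be written as follows; this is a sketch in the notation used above (weights w_n, corresponding points x_n and y_n), not a formula taken verbatim from the slides:

```latex
% Weighted alignment objective with known correspondences:
% find the rotation R and translation t that minimize the weighted
% squared distances between the targets y_n and the transformed points R x_n + t.
E(R, t) = \sum_{n=1}^{N} w_n \,\bigl\| \, y_n - (R\,x_n + t) \,\bigr\|^2
```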
For this problem the previous lecture provided an approach that computes the rotation matrix and the translation vector. Very briefly, just reporting the results: for the rotation matrix we perform a singular value decomposition (SVD) of a matrix H, which gives us three matrices, and from U and V we can construct our rotation matrix. The matrix H on which we compute the SVD is the cross-covariance matrix of our data points; to compute it we need the (weighted) mean of the points in point cloud number one and the mean of the points in point cloud number two. For the translation vector we then use those means x_0 and y_0 together with the rotation matrix. This result is the optimal solution. Visually it boils down to the following: we have these two point clouds, the red circle and the blue circle; we shift them onto each other so that the locations of the weighted means are identical, and then we rotate one cloud around this center of mass in order to find the orientation alignment, which provides us with an aligned point set. We have shown that this is the optimal solution in the last lecture, so check out the previous lecture if you are interested in why this is actually optimal.
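A minimal numpy sketch of this SVD-based solution, assuming the weighted formulation above; the function name and the determinant correction against reflections are my additions, not part of the lecture slides:

```python
import numpy as np

def align_svd(x, y, w=None):
    """Rigid alignment with known correspondences (SVD-based recap of part one).

    x, y : (N, 3) arrays of corresponding points, y[n] corresponds to x[n].
    w    : optional (N,) weights for the correspondences.
    Returns R (3x3) and t (3,) such that R @ x[n] + t approximates y[n].
    """
    w = np.ones(len(x)) if w is None else np.asarray(w, dtype=float)
    w = w / w.sum()

    # Weighted means of the two point sets.
    x0 = (w[:, None] * x).sum(axis=0)
    y0 = (w[:, None] * y).sum(axis=0)

    # Cross-covariance matrix H of the centered point sets.
    H = (w[:, None] * (x - x0)).T @ (y - y0)

    # SVD of H; the rotation is assembled from U and V.
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])  # guard against reflections
    R = Vt.T @ D @ U.T

    # The translation maps the mean of X onto the mean of Y.
    t = y0 - R @ x0
    return R, t
```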
Today we dive into the second part: what do we do if we do not know our data association? Unfortunately, there is no direct or optimal solution for this type of problem, which is unsatisfying, but point cloud registration is an important problem in practice, so we still need a way to solve it. It typically leads to an iterative solution, a technique that needs an initial guess, and optimality is debatable, because in the end we do not know the data association and in the general case cannot come up with an optimal solution. The approach that tries to estimate the data association and then performs the alignment using the techniques from the previous lecture is the so-called iterative closest point algorithm, or ICP. It has been around for roughly 30 years and is an absolute standard technique used not just in robotics or computer vision but also in computer graphics and a large number of other disciplines for aligning point clouds.

The problem is: if we do not know which point in point cloud number one corresponds to which point in point cloud number two, we need to estimate those correspondences. Without them we cannot find the optimal parameters of the transformation in a single step, so we cannot directly align the two surfaces. What we can do, however, is guess the correspondences and then use our previous approach to compute an alignment based on those guesses. If the guessed data associations, indicated here by the dark blue lines, are not too bad, we may get an alignment that is not perfect but much closer than the configuration we started with. The idea is to iterate this procedure: guess the data association again and continue this process multiple times until the surfaces are aligned. That is already the core idea of the ICP algorithm, which is a fairly simple but effective algorithm for this type of problem. So we iteratively estimate or guess the data association, assume it to be correct, compute the transformation, for example with the SVD-based approach from the previous lecture, and repeat: given the hopefully better alignment, we re-evaluate the data association, and if it has changed given the current estimate, we compute the transformation parameters again, iterating this over and over. This is an effective approach, but it is iterative, not direct anymore, and the initial guess matters, so it is suboptimal in that sense; it is, however, the best we can do if we do not know the data association.

The underlying assumption is typically that we have some form of initial guess, some idea of how the point clouds are aligned. It does not have to be perfect, but if we are very far off, the probability that we will not converge is high. We assume an initial guess either about the point locations in a common reference frame, possibly with some uncertainty, for example because a moving platform drives through the environment and, based on wheel encoders or wheel ticks, we can estimate where a scan was taken; or we have some idea about potential correspondences, for example because there are distinct points or features in the scene from which correspondences can be guessed. One of these initial guesses is typically needed, at least if the point clouds cover complex structures. So the key idea is: in every iteration, pick a data association, in the simplest form by saying, for every point in point cloud number one, give me the closest point in point cloud number two. That is what gave the ICP algorithm its name, the iterative closest point algorithm. Then we compute the rigid body transformation, perform the alignment, and repeat this process until the data association does not change anymore, because then we have converged to a solution. This works, but only if the initial configuration, or the initial guess of correspondences, is close enough, and this "close enough" is hard to specify in the first place, so it is hard to say in advance whether the algorithm will converge. As a small illustration in pictures: we first select points in one of the meshes or point clouds, depending on whether we use a point cloud or a surface representation, then look for correspondences in the other mesh or point cloud, illustrated here by those lines, and then we compute the alignment based on those correspondences.
This step assumes that the correspondences are correct; afterwards we are typically in a state where we can select points in a better way, and we basically iterate this process over and over again until the data association does not change anymore. That is the ICP algorithm illustrated in images. We can also look at what the algorithm looks like in pseudocode. Given the notation from before, we start by defining our point set x_bar, which initially is the original point set, and an error term E set to infinity. Then we have a while loop describing the iterations: while E has decreased and is still larger than a certain threshold at which we want to abort the process, we guess or determine our correspondences based on the current estimate of where the point cloud X is, compute the transformation parameters based on those correspondences (this is basically running part one of the lecture), apply the alignment, and recompute the error term with our objective function so that we get a new value for E. While E is still decreasing and larger than the threshold, we repeat this: we guess the correspondences again, now using the point set updated in the previous iteration, hopefully finding better correspondences, and iterate until we are done. As soon as we do not find a better data association, the error does not decrease anymore and we have found a solution. Back to the bunny example: this is our initial configuration, we simply look for the closest point in Euclidean distance between the blue and the red point set, perform the alignment, and align the bunny. This was generated with a small project called simpleICP, which you can find online, implemented in different programming languages; it is a very simple form of the ICP algorithm for educational purposes, so you can easily play around with it.
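The loop just described can be sketched as follows; this is not the simpleICP code, just a compact illustration that reuses align_svd from the earlier sketch and a scipy kd-tree for the closest-point queries (the threshold and iteration count are arbitrary choices):

```python
import numpy as np
from scipy.spatial import cKDTree

def vanilla_icp(x, y, max_iter=50, tol=1e-6):
    """Vanilla ICP sketch: closest-point association + SVD-based alignment.

    x, y : (N, 3) and (M, 3) point clouds; x is moved onto y.
    Assumes a reasonable initial alignment of the two clouds.
    """
    tree = cKDTree(y)              # nearest-neighbour structure over the target cloud
    x_bar = x.copy()               # current estimate of the transformed point set
    prev_error = np.inf

    for _ in range(max_iter):
        # Guess the data association: closest point in y for every point in x_bar.
        _, idx = tree.query(x_bar)

        # Compute the transformation assuming those correspondences are correct.
        R, t = align_svd(x_bar, y[idx])
        x_bar = x_bar @ R.T + t    # apply the alignment

        # Recompute the error of the objective and stop once it no longer decreases.
        error = np.mean(np.sum((x_bar - y[idx]) ** 2, axis=1))
        if prev_error - error < tol:
            break
        prev_error = error

    return x_bar
```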
This is what we call the vanilla ICP algorithm, the standard approach, which is fairly easy to implement and works well if you have a good initial guess for the point sets. Depending on your initial configuration, however, you may need a large number of iterations until you converge; you saw this in the bunny example, where the result quickly gets close but takes a while to fully converge. The second disadvantage is that bad correspondences can seriously impact the result, so you may end up with wrongly aligned point clouds. To overcome these downsides of the vanilla ICP algorithm, a large variety of ICP variants has been proposed. They vary different parts of the algorithm, for example how to look for correspondences or which points to use for the alignment, in order to obtain faster algorithms or algorithms that are more robust with respect to the initial guess and still converge to a good result. The goal for the remaining part of the lecture is to introduce at least some of the popular variants of the ICP algorithm and to show which knobs you can turn to come up with a robust registration technique.

We can break this down into different choices. The first question you may ask yourself is which points to consider. Should you use the whole point cloud? For very large point clouds, for example those recorded with a terrestrial laser scanner, this may be too time-consuming, and you may want to restrict yourself to a small subset of the points; how you choose that subset can substantially impact the result. Then you may want to consider different data association strategies: is the closest-point strategy the best one? No, it is not; there are better ways. You may also want to weight your correspondences, saying that for one correspondence you are quite certain it is correct, so it gets a higher weight, while another is likely to be wrong, so you reduce its weight or even set it to zero and fully reject it. Steps three and four, weighting and rejecting potential outliers, can thus be handled via those weights, eliminating correspondences that are potentially wrong and performing the optimization only on the rest. What people try to optimize with these variants is speed (faster algorithms), stability (higher tolerance with respect to sensor noise or outliers), and the basin of convergence, that is, how far the initial configuration can be from the final one while still converging to the optimal parameters; this maximum initial misalignment matters for real-world applications. Different variants optimize different aspects of these performance criteria. So let us go through those four steps one after the other and see which variants are out there and what their impact on the result is.

Let us start with considering different subsets of points. The naive approach uses all points, so no information is thrown away; that is the standard approach if you want to take all available information into account. If you have too many points, you may consider subsampling. This can be done in a uniform way, where uniform typically refers to a uniform spatial density of points over the space. That means you do not just take, say, every tenth point; you constrain the sampling spatially so that nearby regions end up with a similar density to regions further away. Given the measurement principle of your sensor, the point density is typically higher close to the sensor than far away from it, so you want to sample differently depending on how far the object that caused the reflection was from the sensor. Alternatively, you can simply perform random subsampling of the points.
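One common way to approximate such a spatially uniform subsampling is a voxel grid that keeps one point per cell; the voxel-grid choice and the function below are my illustration, not something prescribed in the lecture:

```python
import numpy as np

def voxel_subsample(points, voxel_size):
    """Keep one point per voxel, which roughly equalises the point density
    in space regardless of the distance to the sensor."""
    keys = np.floor(points / voxel_size).astype(np.int64)   # voxel index of each point
    _, first = np.unique(keys, axis=0, return_index=True)   # first point in each voxel
    return points[np.sort(first)]
```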
You may also want to take features into account. You know this from working with camera images, where you can extract keypoints and feature descriptors; you may want to select points based on certain descriptors or properties, maybe only points that are good for registration, corners for example, and constrain your sampling to points that are potentially good for the alignment. Another strategy is so-called normal-space sampling: especially if you have large smooth surfaces with only small variations of the normals, you want to sample more densely in areas where the normals change in order to get a good alignment. So there are different ways to reduce the number of points you consider while still getting the best possible alignment.

One example of roughly uniform sampling is work by Andreas Nüchter, where 3D scans from an underground environment are aligned incrementally: the blue point cloud is always the current observation, which is registered into the existing point cloud, and in this way accurate maps of the environment can be built while using only a subset of the points for the registration. You can compare uniform or random sampling with normal-space sampling: instead of sampling the candidate points roughly uniformly over the space, you vary the sampling depending on the normal, or on the curvature, so that you effectively sample in the space of normals. This is advantageous in situations with large flat surfaces, large planes with only small areas where the normal changes because objects stick out or there are carvings, for example. On the flat surface itself the two scans can slide against each other quite easily, so you want to give a higher weight to the structured areas; this has been known for a long time. In the example, random subsampling spreads points over what is essentially a plane with very small structures, whereas sampling in normal space puts more points into the curved regions, and as a result you get a better alignment; in the random-sampling result the bar is clearly misaligned. So taking the normal space into account for sampling helps in scenes with many flat surfaces and only small structured areas. It is a rather specialized technique, though; in more general scan matching you typically do not perform normal-space sampling anymore.
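A rough sketch of the normal-space sampling idea: bucket the points by normal direction and draw about the same number of samples from every bucket, so that rare normal directions (the small structures on large planes) are kept. The bucket resolution and the round-robin draw are my own illustrative choices:

```python
import numpy as np

def normal_space_sample(points, normals, n_samples, n_bins=8):
    """Sample points approximately uniformly in the space of normals."""
    # Bucket the unit normals by their spherical angles.
    theta = np.arccos(np.clip(normals[:, 2], -1.0, 1.0))
    phi = np.arctan2(normals[:, 1], normals[:, 0])
    bucket_id = (np.digitize(theta, np.linspace(0.0, np.pi, n_bins)) * n_bins
                 + np.digitize(phi, np.linspace(-np.pi, np.pi, n_bins)))

    buckets = [np.flatnonzero(bucket_id == b) for b in np.unique(bucket_id)]
    rng = np.random.default_rng(0)
    picked = []
    # Draw round-robin from the buckets until enough points are selected.
    while len(picked) < n_samples:
        for bucket in buckets:
            if len(picked) >= n_samples:
                break
            picked.append(rng.choice(bucket))
    return points[np.array(picked)]
```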
Depending on your environment, if you know something about your scene, you may want to do feature-based subsampling, meaning you only take a small subset of points into account and compute an initial alignment from them. This is an example of a 3D scene where features are extracted and some classification is performed on what is seen, whether it is a building surface, vegetation, or poles, and points are then sampled based on how interesting they are in that feature space, so to say, in order to get a small set of highly distinct points for the registration. That simplifies the search for correspondences and can also lead to better accuracy, because the likelihood of making a mistake is smaller, under the assumption that your feature extractor is expressive; if the feature extractor is suboptimal, there is no gain. So there are different ways to subsample your points. In robotics applications, where the point clouds are often not too dense but arrive at a high frequency, for example a vehicle generating point clouds at 10 or 20 Hz, we most of the time take all points into account, or at least all points that are likely to have a corresponding partner. This may be different in applications that generate very dense point clouds, such as terrestrial laser scanning, where the alignment is performed on a subset.

Now let us look into the second part, the data association strategies. ICP itself computes the data association based on a nearest neighbor criterion, and that can work well, but it does not have to. I would say that finding the right data association is one of the trickiest parts here, so it is worth spending a lot of brain power on doing it well, because remember: once we have the right association, the optimal solution is easy to compute. The data association strategy, your way of estimating correspondences, has a substantial impact on the result, both in terms of convergence and speed (if you need to re-estimate your data association all the time, you need a large number of iterations until you converge) and in terms of quality: the better your data association, the better your final result. There are many strategies. Closest point is the standard ICP approach. Closest compatible point can be used if you know more about your points; say you also have a camera on board and see the color of a point, then you can define compatibility based on that additional information and only search for correspondences among compatible points, for example matching red with red but not red with blue. You can take normal information into account, with strategies such as normal shooting or point-to-plane metrics, where we do not map a point to a point but a point to a plane. There are also projective approaches, which became popular again with RGB-D cameras, cameras that also provide depth information. So there is a large variety of techniques, and I want to go through them and explain why they are useful and what their advantages and disadvantages are, at least to some degree.

Let us start with standard ICP, closest point matching. We have two point sets or surfaces, the blue one and the red one. We start, for example, with the blue point cloud, select its points, and for each one look for the corresponding partner in the red point cloud, always taking the point that is closest to our current location.
This is something we can do efficiently, for example using search structures such as kd-trees, which let us find nearby points considerably faster than a naive search; the naive search would have quadratic runtime in the number of points, and with a kd-tree we can do this faster. So say we select this blue point and look for the point on the red surface that is closest to it, maybe this one; then we have found a corresponding partner in the other point cloud and declare this our correspondence. This is the standard approach: it is generally applicable and very easy to do. The only problem is that it is not always really good, and you typically need a lot of iterations until the approach converges; we saw this in the bunny example at the beginning of the lecture, where even when you are already quite close to the result, it still takes a while until all the data associations are correct and you converge to the right solution. What this requires, however, is an initial alignment of the point clouds. If you have no idea where the point clouds were taken, looking for the closest point in Euclidean space is tricky, and you need a different strategy; the better your initial alignment, the more successful you will be. This is the standard approach you could try first when building your ICP algorithm, and if you propose a new metric, it is the baseline to compare against; it is the metric of the vanilla ICP algorithm.

If you do not have an initial guess, what can you do? If you really have no information at hand, one thing you can do is compute an initial guess based on the centers of mass: you compute the center of mass of both point sets and shift them on top of each other. You may even compute a principal component analysis to get a first guess of the orientation. Typically, shifting the centers of mass onto each other is the starting point. So say you have these two point clouds and no idea where they are: you shift them on top of each other, aligning the centers of mass, and then perform your nearest neighbor search, trying to find corresponding partners, here illustrated by the gray lines, which are your estimated correspondences based on a nearest neighbor approach, ignoring distances that are too large. Then you perform your alignment, obtain the next configuration, again look for correspondences, and repeat this process, maybe 20, 30, or 50 times, until the blue point cloud perfectly overlaps with the orange point cloud and you have found the correct alignment, based on nothing but looking for the closest point.
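A heuristic sketch of such an initial guess from the centers of mass plus a PCA of both clouds; note that the signs and ordering of the principal axes are ambiguous in general, so this is only a rough illustration of the idea, not a robust recipe:

```python
import numpy as np

def initial_guess(x, y):
    """Rough initial alignment without any prior: match the centers of mass
    and use the principal axes of the two clouds as an orientation guess."""
    x0, y0 = x.mean(axis=0), y.mean(axis=0)

    # Principal axes from the eigenvectors of the covariance matrices.
    _, Vx = np.linalg.eigh(np.cov((x - x0).T))
    _, Vy = np.linalg.eigh(np.cov((y - y0).T))

    R = Vy @ Vx.T                     # rotate x's principal axes onto y's axes
    if np.linalg.det(R) < 0:          # keep it a proper rotation, not a reflection
        Vy[:, 0] *= -1.0
        R = Vy @ Vx.T
    t = y0 - R @ x0                   # then shift the centers of mass onto each other
    return R, t
```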
Again, this is what we call the vanilla ICP algorithm. Closest point works well, but if you have more information about your points, you may want to match only points that are compatible with each other. For example, if you have a camera together with your range scanner, as in the example I used before, you can take the color information into account to decide whether two points are compatible. You can use other cues as well, such as normal information, curvature, or any other information you have about your point cloud or that you get from other sensing modalities, and then match only compatible points. This was the image I showed before for the feature-based approach; you can do similar tricks here with certain projections, matching only compatible points and aligning only those, at least in the first couple of iterations, in order to get a good initial alignment, and then take further points or the whole point set into account. This strongly depends on the information you actually have available.

Another approach is called normal shooting, where you take the surface normal of your points into account: if you are at a point on the surface, you approximate the surface locally, look in the direction of the normal, shoot out a ray along the normal until it intersects the other surface, and take that intersection as the corresponding point. It has been shown that this gives slightly better convergence results for smooth surfaces, but for very complex scenes or with a lot of noise in the range measurements it does not perform well, it actually gets worse, so normal shooting is not that popular anymore today. What is out there and frequently used, especially in RGB-D based mapping systems, KinectFusion for example made it popular again, is projective data association. The key idea is that you take a point from the source surface and project it into the sensor, given the initial guess of the sensor location; you follow the ray through the projection center of the camera and see where it intersects the other surface, and take that as the corresponding point for the alignment. This projective data association is also something you find commonly.

Most of the time, however, point-to-plane metrics are used today. The idea of the point-to-plane metric is still to find the closest point, but not to use the Euclidean distance directly; instead, the error vector is projected onto the normal of the target surface. Compared to the point-to-point metric, given that your surfaces are subsampled into point clouds, the error vector to the closest point is projected onto the normal, so the effective error shrinks and the normal information is taken into account for the alignment.
This is a technique that is fairly popular and used most of the time in ICP algorithms today, because it typically provides better convergence properties. What you are doing is: you still compute the nearest neighbor, so these two points would give the point-to-point error, but then you compute the normal information and project that distance onto the normal, which gives you the values you use as your error terms. In other words, you take the error vector from the point-to-point metric, compute the dot product with the corresponding normal vector, and use that as your cost function. As a comparison, point to point versus point to plane: with the point-to-point metric you minimize the squared Euclidean distances directly; with the point-to-plane metric you use one of the normal vectors to define a plane and compute the distance of the point to that plane, which is the same error vector as before but dotted with the corresponding normal. This is a fairly standard technique that probably most ICP registration algorithms use in practice today, and it converges faster. Here is the bunny example again: on one side the point-to-point iterations we saw before, on the other side the same procedure with the point-to-plane metric. Both start out the same, but the point-to-plane metric very quickly produces the final result and does not change anymore, while the point-to-point variant is still iterating. So the number of iterations needed to converge is larger for point-to-point and smaller for point-to-plane; you converge faster to your result, which is why the point-to-plane metric is frequently used, and I would say most techniques for 3D registration today use it.
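In code, the point-to-plane residual is just the dot product between the point-to-point error vector and the normal of the corresponding target point; a small sketch, with the array shapes as my assumption:

```python
import numpy as np

def point_to_plane_error(x_transformed, y, y_normals):
    """Point-to-plane residuals and the resulting sum of squared errors.

    x_transformed : (N, 3) source points after applying the current R, t.
    y, y_normals  : (N, 3) corresponding target points and their unit normals.
    """
    diff = x_transformed - y                             # point-to-point error vectors
    residuals = np.einsum('ij,ij->i', diff, y_normals)   # projection onto the normals
    return residuals, np.sum(residuals ** 2)
```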
Let me give a few examples of how this is used. Say a vehicle drives through the environment equipped with a scanner; it takes a 3D scan of the scene, which gives you a point cloud, perhaps with an initial guess from the odometry. The vehicle moves on and at some point takes another scan. Besides the odometry estimate of how the platform has moved, you can use the 3D data to compute the full point clouds and register them with an ICP algorithm, so that you get rigid body transformations between the locations where the scans were taken; you can then chain this information and build maps of the environment. One example is a robot mapping task in the Roman catacombs, where a depth camera is installed at the front of the platform to perceive the environment; the individual depth scans are registered with a point-to-plane metric, a 3D model of the environment is computed, and the information is chained into one consistent point cloud model. As I said before, projective data association, typically combined with a point-to-plane metric, is very popular; it is used for example in KinectFusion, one of the mapping algorithms using the Kinect camera, a camera that provides RGB plus depth information. There, a model has already been built, and the data association is computed by asking where the ray intersects the model, after which the registration is performed with a point-to-plane metric. That is very popular for these Kinect-based or range-camera-based mapping systems today, but it can also be used in the context of standard lidar-based SLAM: there are approaches where a vehicle drives through the environment obtaining range data from its lidar while moving, registering its range scans roughly ten times per second with a projective ICP approach using point-to-plane metrics, so that the individual range scans are properly aligned and you can build maps of the environment, localize in those maps, and plan in them. All of this becomes possible based on incremental scan matching alone; of course, once you re-enter a known area you may also apply other forms of corrections, such as loop closures in the context of the SLAM problem, but you can go quite far today with incremental registration.

In sum, the data association is key for making your ICP algorithm work. There are different ways to determine your correspondences, and investing in a good data association strategy, possibly exploiting additional information, is key to getting good results: the better your data association, the more likely and the faster you will converge. It is always worth exploiting an initial guess, so if you have information about the initial alignment, use it. And as a general rule of thumb, normal-based metrics such as the point-to-plane metric typically perform better than the standard point-to-point metric, so most approaches today use some form of normal information to come up with a good alignment.

You may also want to take weights into account: either all points count the same, or some points are more important than others, and you can use this weight information in the alignment. We already did this in the previous lecture, where we introduced a weight for each correspondence and used it when computing the weighted means of the points in the two point sets and the cross-covariance matrix. So if you have additional information about which points or correspondences are more valuable for your registration problem, you can feed it directly into the system. Noise information, for example, can be taken into account. This may be less critical for laser range scans, but it matters if you use data from a stereo camera and want to register stereo reconstructions, because the precision of the 3D reconstruction from a stereo setup depends on how far away a point is relative to your baseline: points that are further away have a higher uncertainty. You can exploit this in the weighting terms, giving a higher weight to correspondences where both points have a small uncertainty, because you can trust those points more and they are more valuable for the registration.
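As one concrete, hypothetical choice for such weights with stereo data: standard stereo error propagation says the depth uncertainty grows roughly quadratically with depth, so an inverse-variance weight could look like the sketch below; the parameter values are placeholders, not recommendations:

```python
import numpy as np

def stereo_depth_weights(depths, baseline, focal_px, disparity_sigma=0.5):
    """Inverse-variance weights for stereo-derived 3D points.

    Standard error propagation for z = f * b / d gives
    sigma_z ~ z^2 * sigma_d / (f * b), so far-away points get small weights.
    """
    sigma_z = depths ** 2 * disparity_sigma / (focal_px * baseline)
    return 1.0 / sigma_z ** 2
```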
You can also go a step further and ask whether certain points should be completely discarded because they are probably outliers or stem from moving objects in the environment; this is information you may want to take into account as well, and you can do so quite easily. This brings us to the last point: rejecting points that are potential outliers. This is also a strategy you have to consider in order to build a robust ICP algorithm today; you typically cannot do without some form of outlier rejection or down-weighting of points that are potential outliers. One thing you can do, if you have a good initial guess, is to simply ignore all correspondences whose points are very far apart: you say, I have no idea whether they are right or wrong, let me ignore point pairs with a large distance, start with the correspondences where the difference is small, compute the initial alignment, and the far-away ones may fall into place later. Rejecting correspondences based on a large point-to-point distance and not using them in the current iteration is the simplest thing you can do. You can also look at neighboring points: if two points are close to each other in one point cloud, are they mapped to roughly the same area on the other surface? If they are mapped to positions that are far apart, as in this example, where the points are very close on the red surface but far apart on the blue surface, they are probably not the right correspondences. So you can define different types of compatibility over pairs of points and reject the pairs that are probably wrong. There is also trimmed ICP, where you simply sort your correspondences by their error and then say, for example, I ignore the worst 10 or 20 percent of the correspondences and only use the best 80 percent for the alignment. The question is how to set this trimming parameter; it is related to the expected outlier ratio and to how much the scans actually overlap, which may be hard to know, so you need an idea of the overlap to set the parameter correctly, that is, how many points to ignore. If you go for a fixed percentage, say ignoring 50 percent of the points, but you have a high outlier ratio and a small overlap, you may throw away too much; you can also end up in situations where you are simply no longer able to find correct data associations, the alignment does not improve, and the registration fails.
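A small sketch combining the two simplest rejection heuristics mentioned here, a hard distance gate and trimming to the best fraction of correspondences; the threshold and the 80 percent are illustrative values only:

```python
import numpy as np

def reject_correspondences(dist, max_dist=1.0, keep_fraction=0.8):
    """Boolean mask over correspondences: drop pairs that are too far apart
    and additionally keep only the best `keep_fraction` (trimmed ICP idea)."""
    keep = dist < max_dist                      # hard distance gate
    order = np.argsort(dist)                    # sort by residual, best first
    n_keep = int(keep_fraction * len(dist))
    trimmed = np.zeros_like(keep)
    trimmed[order[:n_keep]] = True
    return keep & trimmed                       # True = use this pair in this iteration
```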
If you want to do this in a more formal, structured way, you can use robust kernels, kernel functions that are common in robust least squares, in order to down-weight the impact of potentially large errors in your data association. You can either commit to a fixed kernel, for example a Huber kernel or something similar. Compared to the standard quadratic function that you get when minimizing squared errors, these kernels give less weight to points with large errors, and from the kernel you obtain a corresponding weight factor, which is the weight p_n you can put directly into your error minimization approach. There are also techniques that estimate the shape of the kernel online for your problem, and this can be used for ICP as well. One famous example is the so-called KITTI highway scene: a car drives along a highway with few structures around, and another car overtakes. The overtaking car, which is actually moving, generates very good features for the registration, and if the ICP algorithm commits to registering against that car, the algorithm fails. With adaptive outlier rejection you are able to say, no, this is most likely an outlier, the other correspondences are probably better, and you get good results. In the example with a fixed kernel, tuned so that it is overall okay, you see the car driving, the second car coming, and at some point there are no other structures around except the moving car; the registration fails and the ego vehicle virtually moves backwards, because it aligns to the forward-moving car, which leads to a wrong registration. Using the adaptive kernel instead, as the other car approaches, the kernel parameter drops to a very low value, which basically means that a lot of those outliers are rejected, and the algorithm comes up with a good alignment. So a good strategy for rejecting outliers is key to good results, and therefore often multiple things are combined: people use heuristics for rejecting outliers, things they assume do not happen in their environment, and still combine this with a robust kernel; in the easiest case this is a Huber kernel, while more advanced techniques use adaptive kernels that adjust themselves to the current situation and in this way produce a better alignment. You may also ask yourself: are there things I can rely on, are there points that are most likely outliers or moving objects, or can I make assumptions about the ego motion? If I know the sensor is on a car, it is very unlikely that the car moves sideways, for example; these are additional constraints you can use to avoid wrong data associations.
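Coming back to the fixed-kernel option: a sketch of the weights induced by a Huber kernel, which can be plugged in as the per-correspondence weights p_n mentioned above; the kernel width delta is a tuning parameter chosen here only for illustration:

```python
import numpy as np

def huber_weights(residuals, delta=0.5):
    """Iteratively-reweighted-least-squares weights of the Huber kernel:
    full weight for small residuals, down-weighted influence (delta / |r|)
    for residuals larger than delta."""
    r = np.abs(residuals)
    w = np.ones_like(r, dtype=float)
    large = r > delta
    w[large] = delta / r[large]
    return w
```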
Let me give a few examples. There are approaches that try to learn what moving objects look like and how they move in the environment, in order to identify them and take them out of the registration procedure. In one recent example you have a point cloud with two cars: one car is parked and actually provides really good features for the registration, the other car is moving. If you perform what is called moving object segmentation, segmenting your scan into moving and static parts, you can use this as a mask for your algorithm, ignoring the moving points during registration, and in this way get a better alignment; these approaches work fairly well. Comparing the raw point cloud, the ground truth of the moving objects (everything red is moving), and the estimates, you can see that a few too many objects are identified as moving, but all the actually moving parts are segmented out correctly. If you then perform the data association only on the black points, ignoring the points labeled in red, you get a better data association, and it becomes unlikely that you make mistakes and commit to the moving objects.

There is also an example where you can do this without a learning-based approach, for mapping in dynamic environments. What you see in this video are the frames of the RGB-D camera, or rather the depth camera, and next to them the 3D model built from the depth information the camera provides. You can build an accurate model of the environment; what is shown is the model projected back into the camera frame, and that works well, but only as long as the world is static. Here you see people moving through the scene, and this leads to registration errors that in the end produce a highly inconsistent model of the environment: while the people move through the scene, the model gets completely corrupted, and the estimate of where the camera was and how the point clouds should be aligned drifts substantially, so the resulting model is not usable anymore. What you can do, however, is look at the residuals of those errors, similar in spirit to the kernel-based approaches, and try to identify which points are potentially dynamic and which are static, based on a simple distance criterion. If this is the model you have so far and this is the new scan, you perform a first alignment, for example using robust kernels so that you are not too sensitive to mistakes, and then you look at the residuals between your model, shown in black, and the green points of the scan. Looking at the statistics of those residuals, you typically see that most errors are small, and then there is a small secondary peak, which often corresponds to the dynamic objects in the scene. With a threshold you can then do a very simple and straightforward rejection, saying these points are probably the wrong ones; you exploit the knowledge you have about your problem to eliminate them, declare them dynamic, and do the registration only with the static information. If you do this, you can identify the moving objects: in a single frame of that scene, if you plot the residuals with respect to the previous model, you can see very clearly where the persons are (there are also some other structures that are not perfect, but the high residuals are clearly visible), and with some image processing you can generate a mask that masks out the dynamic objects, so that the registration only works in the dark areas of the mask.
The bright areas are not taken into account. If you run the same scene as before, now with this outlier rejection technique, you can see that you build a map and get a consistent model of the environment: the projected frames look similar to the real camera frames, and the persons identified by this approach are not integrated into the model, so you can register your current observation against the 3D model without corrupting it. It is not absolutely perfect, it gets a little blurry in places and the alignment is not exact, but it is much better than before, and you can use this to build an accurate model of the environment: if you simply integrate those scans, the parts of the room are fairly well aligned, given how many dynamic objects were in the scene. There is another example of the same technique where you can additionally remove this information from the model: people are moving around, and objects that have been identified as movable, like the people originally sitting on the chairs, can even be removed from the model; you then basically have an iterative approach, alternating between updating the model and improving your poses, in order to get accurate models of the environment.

In sum, this leads us to an ICP algorithm that works in the following way. First, we potentially subsample our point set to a smaller set of points, hopefully good ones; again, not every approach does this, but if you have too many points, you may want to downsample. Then you determine your correspondences, with a closest-point or point-to-plane metric or any other technique, perform some outlier rejection, and compute weights given the current knowledge you have about those points. Then you compute the rotation and translation with the approach from part one, apply them to the point cloud being registered, compute the error, and repeat the whole process as long as the error decreases, until convergence, so that you end up with a good registration. That is what the ICP algorithm does in its slightly more advanced form, and these are the ways you can compute the alignment. Thank you very much for your attention; we will continue with the third part of the lecture, looking into the least squares approach: what can we do if we do not want to rely on the SVD-based approach but prefer a least squares solution? Thank you very much for your attention.
Info
Channel: Cyrill Stachniss
Views: 5,991
Keywords: robotics, photogrammetry
Id: ktRqKxddjJk
Length: 52min 50sec (3170 seconds)
Published: Sun Mar 21 2021