Point Cloud Alignment using ICP (See the 2021 video due to audio issues in this video)

Captions
Welcome to the course and to today's lecture, the first part of our look into the iterative closest point algorithm. The iterative closest point (ICP) algorithm is a technique for aligning sensor data such as laser range scans, and therefore it plays an important role within the simultaneous localization and mapping (SLAM) or 3D reconstruction problem.

So let's have a look at why this is important and what is relevant here. The goal of these alignment algorithms is to take points, for example from the 3D world (in theory it could also be a 2D scan), and align them so that the observed data points which refer to the same position in the environment end up lying in the same position. One example: I take a scan from the position where I am standing — consider a robot observing the environment — and I take a second scan, say from over here, observing the same or a similar part of the environment. Then I have two scans taken from similar positions observing the same part of the scene, and the goal is to find the transformation so that the points perceiving the same object in the environment are aligned with each other, i.e. lie on top of each other. If I can do this, I can recover the relative transformation between the two recording positions, and this is information that we will later use as so-called virtual observations in the graph-based simultaneous localization and mapping problem.

So the ICP approach is a technique for solving this alignment problem for 3D data points. The goal is to find the transformation between the two point sets, or the two scans, so that corresponding points align. What is a corresponding point? Two points in the two scans are expected to correspond if they refer to the same point in the 3D world. There are different techniques for solving this problem: you can treat it as a least squares problem, you can do robust least squares estimation, but a very common technique is the so-called ICP or iterative closest point algorithm, and we will get an idea today of how this approach works and how you can implement it.

Why is this relevant for us? It is very important in the context of mapping or SLAM. Typically you don't have a single sensor reading, such as a single 3D lidar scan, to observe the environment; you record multiple scans from multiple locations because you want to cover the whole object or scene, and one scan is usually not enough. If you do not know the exact positions from which those scans were taken, you need to solve the problem of how to shift and rotate the individual scans so that they build a consistent global point cloud.

You can see a few examples here where the ICP algorithm has been used. At the top, these are building structures recorded by a robot equipped with a laser range scanner driving around and mapping the scene. Here you see the individual locations of a mobile robot navigating through the environment, taking local 3D scans and stitching those scans together; this stitching, or alignment, process is what the ICP algorithm does. As an illustration, you can see on this side two surfaces, one colored gray and the other yellow, and how the ICP algorithm aligns the scans by slightly shifting and rotating the two scans with respect to each other so that they end up overlapping.
In this small animation you can see how, through a sequence of shifts, one scan or point cloud is transferred into the coordinate system of the other point cloud, so that points corresponding to the same point in the environment end up in the same position; you can see this in the overlaid yellow-gray texture. The question is how we generate this process, and that is what the iterative closest point, or ICP, algorithm is about. It is an algorithm that has been around for quite a while, so it is not something extremely novel, and it is there to align local point clouds or surfaces with each other. Again, the goal is to define this iterative process (the "I" is what gives the algorithm its name) that starts from the initial configuration and ends up with the scans aligned.

So let's dive into the problem and see how it can be solved. We assume we have two point sets, a point set Q and a point set P; Q is for example taken from the first scan and P from the second scan. We assume for the moment that we also have so-called correspondence information, i.e. we know that a point with index i in the first point cloud corresponds to a point with index j in the second point cloud. That means we have a set C of pairs of indices which tells us that a certain point in scan number one corresponds to a certain point in scan number two. This correspondence information, also called the data association, is a very crucial ingredient.

What we then want to do is find the transformation that maps one point cloud onto the other. We have an error function that we want to minimize, and this error function depends on a rotation matrix R and a translation vector t:

E(R, t) = sum over all (i, j) in C of || q_i - (R p_j + t) ||^2

If we are talking about 3D, that means three rotation angles and three translational components, so a six degree of freedom transformation for the 3D alignment problem. In 2D it would only be three degrees of freedom: x, y, and one rotational component. But let's stick with the 3D formulation here. The error is a sum over all corresponding points, because we only consider points that have a corresponding partner in the other scan. We rotate each point p_j with the rotation matrix R and shift it by t so that it ends up in the coordinate frame of scan one, and we want to minimize the sum of the squared distances between each point in scan one and the corresponding transformed point of scan two.

So assume we know those correspondences and assume they are correct. Say we have two surfaces, the blue surface and the red surface over here, and the black lines indicate the correspondences: this point in the red scan corresponds to this point in the blue scan, and the same here. We want to find the transformation which minimizes the sum of the squared lengths of these black lines.
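To make the objective concrete, here is a minimal numerical sketch of this error function in Python (not from the lecture; the names Q, P, correspondences, R, t are my own, and the point sets are assumed to be NumPy arrays with one 3D point per row):

```python
import numpy as np

def alignment_error(Q, P, correspondences, R, t):
    """E(R, t) = sum over (i, j) in C of || q_i - (R p_j + t) ||^2."""
    err = 0.0
    for i, j in correspondences:        # only points that have a corresponding partner
        d = Q[i] - (R @ P[j] + t)       # residual after moving p_j into the frame of Q
        err += d @ d                    # squared Euclidean distance
    return err
```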
So, for example, by taking the red scan and shifting and rotating it upwards so that it overlaps with the blue scan, we get a solution which looks like this. In this illustration this is the minimum-error configuration, because all those black lines have length zero. In reality we never get them down to zero — there is always noise in our sensor data — but we want to minimize the sum of the squared errors.

So the question is how we actually do that: how do we obtain the translation and rotation parameters that turn the first situation into the second one? That is one key ingredient of the iterative closest point algorithm — although so far nothing is iterative; this is just an alignment step given the correspondences.

How does this alignment work? We do two things. First, we compute the center of mass of both point sets and subtract it from each of them: we compute the center of mass of scan number one, the center of mass of scan number two, and subtract each center of mass from its point set. As a result, both point sets are shifted on top of each other so that they share the same center of mass; this is the key step for obtaining the translational part of the alignment. The second part is the rotational alignment: we need to rotate the two scans onto each other so that the error between the points is minimized.

Let's first look at the center of mass; it can be computed quite easily. We take into account only those points for which correspondences exist — all other points are ignored. We sum up over all such points in Q and divide by the number of corresponding points, which gives us the mean of those points, i.e. the center of mass of Q. We do exactly the same for P: we sum all points for which a corresponding partner in the other scan exists and divide by the number of correspondences, ending up with the center of mass of P. Then we generate two new point sets, the primed ones Q' and P', which are simply the original points minus the respective mean; we subtract the mean from every data point. As a result, the new, mean-reduced point clouds Q' and P' both have their center of mass at zero, so they share the same center of mass — this corresponds to shifting the point clouds on top of each other.
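As a small sketch of this mean-reduction step (again my own helper, assuming the same NumPy conventions as above):

```python
def center_of_mass_reduction(Q, P, correspondences):
    """Compute the centers of mass over the corresponding points only and subtract them,
    which yields the mean-reduced point sets Q' and P'."""
    iq = [i for i, j in correspondences]   # indices of the used points in Q
    ip = [j for i, j in correspondences]   # indices of the used points in P
    q0 = Q[iq].mean(axis=0)                # center of mass of the used points in Q
    p0 = P[ip].mean(axis=0)                # center of mass of the used points in P
    return Q[iq] - q0, P[ip] - p0, q0, p0
```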
Given these mean-reduced point sets Q' and P', I can look at something called the orthogonal Procrustes problem, a variant of the original problem. One can show that solving the original error function on the mean-reduced point sets is equivalent to the following formulation: an error function E' that only takes the rotation into account,

E'(R) = || Q' - R P' ||^2_F

Here Q' denotes the matrix in which all the mean-reduced points of Q are stacked next to each other, and the same matrix is generated from the point set P'. The rotation matrix R rotates all the points in P' while keeping them in this matrix form; then I subtract the two matrices from each other and compute the Frobenius norm, which is the norm obtained from the element-wise differences of the two matrices. Since I use the squared norm here, I obtain the sum of the squared differences between the rotated points of one set and the points of the other set. So this is equivalent to the original problem once the means have been subtracted: minimizing this expression over R gives me the same R as the original error function. And this problem I can solve with the orthogonal Procrustes approach, an efficient solution using the singular value decomposition.

So how does it work? First I need to compute the so-called cross-covariance matrix from the two mean-reduced point sets: I sum, over all corresponding pairs, the outer products q'_i p'_j^T — note that, compared to the dot product, the transposition operator sits on the outside, not on the inside. This gives me the cross-covariance matrix W of the point sets. Then I decompose this matrix using the SVD: W = U D V^T, so the product of U, D, and V^T gives me back my original matrix W. The individual matrices have different properties. The matrix D in the middle is a diagonal matrix which contains the singular values, but that is something I am not really interested in at the moment. I am interested in the matrices U and V, which under these conditions are 3x3 orthogonal matrices, and if I multiply them with each other, R = U V^T, I obtain the rotation matrix that minimizes the error function. This holds only under the assumption that the cross-covariance matrix has full rank: for 3D points it must have rank 3, for 2D points rank 2; under this assumption the matrix R can be uniquely determined. Once I have R, I can compute the translation vector by transferring the centers of mass in exactly the same way: t = (center of mass of Q) - R (center of mass of P). This allows me, based on the orthogonal Procrustes decomposition, to obtain R and t and in this way align the point sets.

To sum up what we need to do if we want to implement this: first, compute the cross-covariance matrix of the mean-reduced point sets; second, compute its singular value decomposition, where we are only interested in the matrices U and V^T; third, multiply U and V^T, which gives the rotation matrix R. Once I have R, I update the point set P by subtracting the mean of P, applying the rotation, and adding the center of mass of Q, which gives me a new point set that lives in the coordinate system of Q — so I shift P onto Q.
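Putting these steps together, here is a sketch of the SVD-based alignment for known correspondences, reusing the helper from above. The determinant check is a practical safeguard against obtaining a reflection instead of a rotation; it is a common addition and not part of what was discussed in the lecture:

```python
def svd_alignment(Q, P, correspondences):
    """R and t minimizing the sum of squared correspondence distances (known data association)."""
    Qp, Pp, q0, p0 = center_of_mass_reduction(Q, P, correspondences)
    W = Qp.T @ Pp                       # cross-covariance matrix: sum_i of q'_i p'_i^T
    U, D, Vt = np.linalg.svd(W)         # W = U D V^T; D is not needed further
    R = U @ Vt                          # rotation (requires W to have full rank)
    if np.linalg.det(R) < 0:            # safeguard: flip one axis if we got a reflection
        U[:, -1] *= -1
        R = U @ Vt
    t = q0 - R @ p0                     # transfer the centers of mass
    return R, t
```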
As an illustration of what happens: we have two point sets here, a red one and a blue one, and we are transferring the red one onto the blue one. The first part is the translation, so that they are shifted on top of each other, and then comes the rotational part, so that they are turned onto each other. You see here that the blue and the red points are now very well aligned. You will still see some small black lines remaining; these are the residuals due to noise in the observation process — my observations are not perfect, and the points are not sampled at exactly the same locations on the original surface — so some small errors will remain. But this is the optimal solution that I get out: the optimal translation and rotation that minimize my error function.

Again, this works assuming known data association: we assume we know our correspondence set, and if we know the correspondences, this is the way we compute our solution. The problem in reality is that we don't know this data association. If we have the data association, everything is easy; if we don't, it becomes more tricky, and this is a problem you have in all real-world situations — you have to deal with unknown data association. The issue is that if the correct data association is not known, we typically cannot come up with the right solution in a single step as we did before. What we can do, however, is guess the data association and then perform a correction given this guessed data association, in the hope that the new alignment is better than the old one — provided that our guess was not too bad — and then use an iterative process to come up with the solution.

How does that work? Consider you have this blue surface and this red surface. You take the points on the blue surface and look for the closest point on the red surface: for this point the closest point on the red surface is this one, for that point it is that one, and so on and so forth. This is obviously not the right data association, but the hope is that if we start with it as an initial guess and perform our corrections, we get better and better step by step. So if I take this data association and compute the SVD solution as before, as if it were the right association, I come up with an alignment which looks like this — and you can see it is already better than what we had initially. Since this is a better configuration than the one we started from, we can restart the process, look for data associations in this new alignment, and improve again: the new data association is still not perfect, but it is clearly better than the initial one. So we take this data association and again perform exactly the same alignment step for the two point clouds, and we iterate this process over and over so that in the end we hopefully come up with a perfect alignment of the point clouds.

This is the key idea of the ICP algorithm: we know how to align two point clouds given the data association; since we don't know the correct data association, we guess it based on an initial guess and then perform an alignment assuming the data association were correct, which gives me two new point sets.
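A single round of this "guess, then correct" idea could look like the following sketch (my own code, building on svd_alignment from above; a brute-force nearest-neighbour search is used here purely for clarity):

```python
def closest_point_correspondences(Q, P_cur):
    """Guess the data association: for every point in P, pick the closest point in Q."""
    d2 = ((P_cur[:, None, :] - Q[None, :, :]) ** 2).sum(axis=2)  # all pairwise squared distances
    nn = d2.argmin(axis=1)                                       # index of the closest point in Q
    return [(int(i), j) for j, i in enumerate(nn)]

# One guess-and-correct step (assuming Q and P are (N, 3) arrays as before):
#   C_guess = closest_point_correspondences(Q, P)
#   R, t = svd_alignment(Q, P, C_guess)
#   P = (R @ P.T).T + t        # P is now (hopefully) closer to Q than before
```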
Then I simply repeat the process: assuming this alignment is better than what I had initially, I recompute my data association, then compute my transformation parameters, and iterate this process over and over again, so that after a while the two point sets are aligned as closely as possible. This is the idea of the iterative closest point algorithm, and it is also what gave it its name: it iteratively makes a closest-point data association and then uses the SVD approach to compute the alignment. The hope is that if the initial configuration — the initial guess of the data association — is good enough, i.e. if the point sets are close enough, we will converge to the right solution. But there is no guarantee for that; it is not guaranteed that we get the optimal result. It is an approach where we guess a data association, perform the alignment iteratively, and hope that we get better.

If you put this into an algorithm, it looks roughly like this: we start by initializing our error value, and while the error has decreased in the last iteration and is still larger than a certain threshold, we continue the iterative process. In each iteration we determine the corresponding points — for example by taking the closest point in the other point set given the current estimate, which is the standard approach — then we compute the rotation matrix and translation vector using the SVD-based approach I just explained, apply this rotation and translation to register, i.e. align, the point clouds, and then compute the error value, which is simply the error function we are trying to minimize: the sum of squared distances between the corresponding points. I start with a high error value, infinity in the beginning; after the first alignment the error will be smaller, so I enter the while loop again — yes, the error has decreased, from infinity to something smaller — and assuming the error is still above the threshold, I again look for new data associations given the improved alignment, compute the rotation and translation, apply them, compute the error, and continue this process until the error does not decrease anymore or is smaller than the threshold. You can also check whether the data association still changes: if the data association stays the same over consecutive steps, the error will not decrease because we will get the same solution, so as soon as the data association does not change anymore, we are done.
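A compact sketch of this loop, assuming SciPy is available for the nearest-neighbour queries (scipy.spatial.cKDTree) and reusing svd_alignment from above; the thresholds and the stopping rule are illustrative choices, not values from the lecture:

```python
from scipy.spatial import cKDTree

def icp(Q, P, max_iterations=50, min_error=1e-4, tolerance=1e-6):
    """Minimal ICP: closest-point data association + SVD alignment, repeated until
    the error is small enough or stops decreasing."""
    R_total, t_total = np.eye(3), np.zeros(3)
    P_cur, prev_error = P.copy(), np.inf
    tree = cKDTree(Q)                              # fast nearest-neighbour queries in Q
    for _ in range(max_iterations):
        dists, nn = tree.query(P_cur)              # closest point in Q for every point in P
        C = [(int(i), j) for j, i in enumerate(nn)]
        R, t = svd_alignment(Q, P_cur, C)          # align as if the association were correct
        P_cur = (R @ P_cur.T).T + t                # apply the estimated transformation
        R_total, t_total = R @ R_total, R @ t_total + t
        error = np.mean(dists ** 2)                # error of the current association
        if error < min_error or prev_error - error < tolerance:
            break                                  # small enough, or no longer decreasing
        prev_error = error
    return R_total, t_total
```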
Here is a small example of how this looks. What you see are two scans shown in different colors; it is a top-down view, these are two buildings, and these are bushes along a sidewalk. You can see how the approach iteratively aligns the two scans, taken in this case with a 2D laser range scanner: the bluish points and the yellowish points are the two different scans, and this is how the iterative closest point approach proceeds.

You should note, however, that there is a large set of variants of ICP. The variants mainly come from how the correspondences are computed and how the error function is defined. The error function shown so far was a point-to-point error function: we compute the distances between two corresponding points. There are alternatives to this. For example, assuming we are dealing with surfaces, we can use the distance from a point to a plane, treating the local surface around the corresponding point as a plane. Especially for sparsely sampled point clouds this becomes relevant, because we do not want to introduce an artificial error just because the sampling density on one of the surfaces was low; in such cases it may be better to compute the distance from a point to a plane.

There is also a large number of other variations. One of them is that I may not want to use all points. Why would reducing the number of points be a good idea — isn't the initial intuition that more measurements are better? There are several reasons. One is speed. Another is differing point densities: think of a typical laser range scanner, where surfaces close to the sensor are sampled much more densely than surfaces further away, and I may not want to give nearby surfaces a higher weight in the alignment just because there happen to be more points on them. These are reasons to consider only a subset of the points. Another thing I can take into account is a weighting of the corresponding point pairs: for example, weighting pairs higher where I am very certain about the data association or about the quality of the 3D points, and giving a lower weight to pairs where I am less certain. There are also different ways of doing the data association — point-to-point and point-to-plane are two examples — and I may want to include outlier rejection techniques. In practice we will make data association mistakes: even with a very sophisticated data association technique there will be errors. This can be due to the fact that the environment actually changes — consider taking laser range scans while people are walking through the scene; if you take the two scans at two different points in time, the person has moved, so the corresponding points refer to the same physical object, but the object has moved, and you will introduce errors if you treat everything as static. Occlusions can also be an issue. So we are likely to make data association mistakes, and the question is how we reject those outliers or gross errors that we may find in our scans.

So there are different variants of the ICP algorithm, and I want to discuss a few of them now in a little more detail, at least to give an idea of what is going on. Again, there are different reasons to consider these variants. One is speed: if I reduce the number of points I am considering, my approach is obviously faster, which becomes important for real-world applications where speed matters — for example, a 3D lidar scanner producing a large number of points ten or twenty times per second may force me to reduce the number of points used in the alignment procedure. Another is stability: local minima may be an issue, or I may give too much weight to certain local structures just because they have a higher sampling density due to the measurement principle of my laser scanner; this can be an issue.
Maybe I also have some information with respect to noise or outliers: if I can, for example, detect that some objects look like a person or a pedestrian, I may want to take them out of the registration procedure when moving through an urban environment, because these are objects which are expected to move or are most likely moving; so I can take this additional information into account and treat certain objects as potential outliers. Or we may take steps for increasing the basin of convergence: what is the maximum initial misalignment between the scans — how bad can our initial guess actually be — so that the approach still converges to the right solution?

So let's start with subsampling the points. There are different techniques I can use. I can of course use all points; this is the standard approach, and it works well if the correspondences are correct. I can perform a uniform subsampling, so that I have, for example, a uniform distribution of points over space; this is one approach if I do not want to give a higher weight to certain structures just because they happen to be closer to the sensor. I can do a random subsampling, which is basically easier to implement, so one may want to go for that. I can do something based on certain types of features — say certain objects which are more important than others for my alignment — and take that information into account. And what is often done is normal-space sampling: I use a higher density of sampled points where the normals change, i.e. on surfaces with strong curvature, because those areas are better suited for the alignment, so I sample more points there than on very flat surfaces.

There is one example over here: if I have two scans that look like this, a uniform subsampling may generate data associations which look like this, while with normal-space subsampling you can see that the density of the associated points is higher in areas of stronger change of the normals, and smaller in areas where the normals do not change. This can be used to get a better alignment, especially if you have large smooth surfaces. There are two examples over here of a very flat surface where you can only see these small carvings; red is scan number one and green is scan number two. You can see that the overall alignment does not look too bad, but in those areas the carvings are not perfectly aligned. If I take the normal-space sampling into account, so that I am considering more points along those carved lines, I actually get a better alignment of the two scans. Zooming into those two views, you can see the misalignment of the carvings compared to the proper alignment obtained with normal-space sampling. This is one example of how I can improve my data association by using the additional normal information, because areas of strong curvature typically allow me to align the scans better.
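As an illustration, here is a coarse sketch of normal-space sampling (my own simplification: the normals are bucketed only by octant, whereas a real implementation would bin the unit sphere more finely; points and normals are assumed to be NumPy arrays of the same length):

```python
def normal_space_subsample(points, normals, n_samples):
    """Draw roughly the same number of points from every normal-direction bucket,
    so strongly curved regions are not drowned out by large flat areas."""
    octant = ((normals > 0) * np.array([1, 2, 4])).sum(axis=1)   # 8 coarse direction buckets
    buckets = [np.flatnonzero(octant == b) for b in range(8)]
    buckets = [b for b in buckets if len(b) > 0]
    per_bucket = max(1, n_samples // len(buckets))
    chosen = np.concatenate([np.random.choice(b, size=min(per_bucket, len(b)), replace=False)
                             for b in buckets])
    return chosen                                                # indices of the sampled points
```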
The other idea is to use feature information and sample points on features. Here is one example of a 3D scan — actually the example shown at the beginning, a 3D scan taken on the campus of the computer science department in Freiburg. The idea in this work was to extract certain features and then try to align only those points where the features are similar: these can be poles, or flat walls, so that one only looks for correspondences of walls against walls, poles against poles, or ground surface against ground surface. You can do this to increase the likelihood of making the right data association and use it as an initial step to then align your scans in a better way. This has certain advantages: it is typically faster, because the number of points for which those features exist is much, much smaller, and you also decrease the probability of a wrong data association, because you are not searching the whole other point cloud for correspondences but only the subset with a similar feature value. Of course it makes the assumption that you can determine your feature values, or semantic classes, or whatever you are using, with high accuracy; if that is not the case, you will obviously fail or even make the results worse. In this example it allowed the processing time to be reduced dramatically by considering a much smaller number of points, at least for the initial steps of the ICP algorithm.

Here is another example of how scans can be aligned, from the work of Andreas Nüchter, scanning through a cave or underground environment. You always see a new scan coming in in blue, and this point cloud is registered against the point cloud built so far; in this way you can incrementally map the environment. This approach uses uniform sampling, again to avoid giving too high a weight to areas which are close to the scanner compared to those which are further away.

The second ICP variant I want to look into is the weighting of correspondences. We may want to give a higher weight to correspondences about which we know more — for example to points whose 3D location we can estimate more accurately. This can be the case with a stereo camera: points which are further away from the camera have a higher uncertainty with respect to their position, especially in the depth component, so you may want to give them a smaller weight, simply because you cannot determine their x, y, z position as accurately as for points which are close by. This weight can be chosen based on the sensing principle, knowing that certain points are scanned with high or low accuracy. The same holds for depth cameras such as the Kinect or similar setups, where the uncertainty, or the noise, of the points increases with distance, and you want to take that into account; this effect is typically less prominent for laser range scanners. Or you may have some other information suggesting that certain points are likely to be outliers, and then down-weight those points. Robust estimation techniques can also be used here, which can be turned into a weighted minimization problem that is closely related to this re-weighting. Typically, however, this requires you to have some additional information about your sensor or about the uncertainties of the points, for example a covariance matrix for every 3D point.
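A sketch of how such per-correspondence weights can enter the SVD-based alignment: the centers of mass and the cross-covariance matrix are simply computed as weighted sums (the weights argument is my own addition; the lecture does not specify a particular weighted formulation):

```python
def weighted_svd_alignment(Q, P, correspondences, weights):
    """Weighted variant: more reliable correspondences contribute more to R and t."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                                  # normalize the weights
    iq = [i for i, j in correspondences]
    ip = [j for i, j in correspondences]
    q0 = (w[:, None] * Q[iq]).sum(axis=0)            # weighted centers of mass
    p0 = (w[:, None] * P[ip]).sum(axis=0)
    Qp, Pp = Q[iq] - q0, P[ip] - p0
    W = (w[:, None] * Qp).T @ Pp                     # weighted cross-covariance matrix
    U, _, Vt = np.linalg.svd(W)
    R = U @ Vt
    if np.linalg.det(R) < 0:                         # reflection safeguard, as before
        U[:, -1] *= -1
        R = U @ Vt
    return R, q0 - R @ p0
```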
Data association is another key aspect I want to look into, because this is where a lot of approaches differ. How do we do the data association: given an initial alignment of the scans, i.e. an initial transformation between the recording poses or between the point clouds, how do we obtain the correspondence set C, which I have not really talked about much so far? The original approach is the closest point, which is also what gave ICP its name: for every point in the first scan, I simply take the point which is closest — for example in Euclidean space — in the other point cloud, and make the data association based on this closest point. But there is a large number of other techniques one may want to use, such as point-to-plane matching, which has become very popular recently, especially with Kinect-style sensors but also for laser range scanners, and projection-based approaches, where models are projected into the observation space to find the correspondences. So there is a large set of different techniques for this step.

Let's start with finding the closest points. This basically means that for every point in scan number one we need to find the closest point in scan number two, and for large point clouds this is something you can do, for example, using k-d trees in order to make the nearest-neighbor queries efficient rather than checking in a brute-force fashion. So, for example, for this point over here, the closest point on the red scan sits over here, because any other point on the red surface would be further away; this is one data association, and we can repeat this process for the other points. This is a generally okay approach — it is actually not too bad, though not the best thing you can do — but it typically requires a large number of iterations to perform well, and it also requires a reasonably good initial guess: if you take the scans and rotate them by, say, 180 degrees, you can quite often generate situations where the approach does not work that well.

The next thing you can do is something called normal shooting: you take a point, compute the normal based on the neighboring points, and then shoot into the direction of the normal — this depends a bit on the way the sensor data has been recorded — and the point you hit on the other surface becomes your corresponding point; that is your data association. This typically provides slightly better convergence than the closest-point rule if I have smooth structures or smooth surfaces and a good initial alignment, but it is typically not that good for noisy data points: if you have high noise in your data, you may not be able to compute the direction of the normal reliably, and this can then lead to problems.
You can also do other things, for example compute a compatibility measure between points. If you have additional information such as color, or the reflectivity value from a laser scanner, or — with an RGB-D camera — a color value for every individual point, it may make sense to only match, say, red points against red points and green points against green points, because this reduces the probability of making a wrong data association. So if you have additional information like color, or maybe information you can extract from the local surface properties such as normals, you can use it to reduce the risk of wrong data associations; we cannot fully avoid them, but we can reduce the risk.

A very popular approach is also the point-to-plane error metric, where you basically minimize the distance not from point to point but from a point to a plane. Why is this relevant? Think about two surfaces as shown over here, especially if these surfaces are not very densely sampled with data points — say a laser scanner with a low resolution. You may find correspondences between the points, but you ignore the fact that these points actually come from surfaces. If you consider one point set as a surface, say this surface over here, then the distance for this point should not go to its corresponding point but to the surface, and the same here; that is the distance you actually want to minimize. By taking into account that there is actually a surface, you can find better data associations, and you do not make a residual mistake just because the sampling density of your scanner is low: even if the points, after the alignment, lie on that surface, with a point-to-point metric you would still have residual errors pulling in directions along the surface, in a somewhat random fashion, simply because the surface has not been sampled very densely. Therefore these point-to-plane approaches are very popular and have been used quite effectively. There are different views on them: some say they are slower to compute but provide better convergence rates than the point-to-point approach, and today the point-to-plane metric is probably one of the most frequently used measures.
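For reference, here is a sketch of the point-to-plane error itself (my own code; it assumes a normal per point of Q, estimated for example from each point's neighbourhood). Note that, unlike the point-to-point case, minimizing this error has no simple SVD solution and is typically done with a linearized least-squares step:

```python
def point_to_plane_error(Q, Q_normals, P, correspondences, R, t):
    """Only the residual component along the target surface normal is penalized."""
    err = 0.0
    for i, j in correspondences:
        d = (R @ P[j] + t) - Q[i]                  # point-to-point residual
        err += (Q_normals[i] @ d) ** 2             # distance to the tangent plane at q_i
    return err
```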
A last important aspect to consider is the rejection step: how do we reject outliers in real-world data? The simplest approach, if I have a point-to-point metric, is to simply ignore corresponding pairs whose distance is larger than a certain threshold. Say we have some data associations here and some others there: those with really large distances are more likely to be wrong, those with small distances more likely to be correct, so I simply ignore the red ones. This can help, but it is actually hard to find the right threshold, because it depends on how misaligned your scans initially are, and putting in a hard threshold is always tricky — it can be a good heuristic in some situations, but it depends on how your data has been collected, so it has advantages but also disadvantages. Rejecting pairs that are further apart than a certain threshold is frequently done, but it is not necessarily the best approach.

Another thing one may take into account is neighborhood compatibility: do I have data associations where points which are nearby on the same surface in one scan end up far apart in the other scan? This is a kind of local compatibility measure; it is typically more computationally intensive to compute, but it can help to identify wrong data associations. There are other techniques such as the so-called trimmed ICP, where we compute all our correspondences and then throw away a certain percentage — say the 20% worst data associations, those with the highest initial error — and simply ignore them. That is a standard approach one can use; this percentage is somehow related to your outlier ratio or the overlap between the scans, so again it requires you to know something about the overlap, and it requires the initial data association not to be too bad. It is a heuristic, and it is not always easy to set it the right way. Later in the course we will learn techniques that are, let's say, at least slightly smarter or different for dealing with those outliers: one approach is RANSAC, a sampling-based approach which is computationally rather expensive but can deal with high outlier ratios; other techniques change the error function we are minimizing, so that we do not use a squared error function but a so-called robust kernel function which squashes the error function so that large errors do not have such a strong impact on the result. These are different so-called robust optimization techniques that one may take into account.
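A small sketch combining the two simplest rejection heuristics mentioned above, a hard distance threshold and a trimmed-ICP-style cutoff (both thresholds are user-chosen heuristics, not values from the lecture):

```python
def reject_correspondences(Q, P, correspondences, max_dist=None, trim_ratio=0.0):
    """Drop pairs farther apart than max_dist and/or the worst trim_ratio fraction of pairs."""
    d = np.array([np.linalg.norm(Q[i] - P[j]) for i, j in correspondences])
    keep = np.ones(len(correspondences), dtype=bool)
    if max_dist is not None:
        keep &= d < max_dist                        # hard distance threshold
    if trim_ratio > 0.0:
        cutoff = np.quantile(d, 1.0 - trim_ratio)   # e.g. trim_ratio=0.2 drops the worst 20%
        keep &= d <= cutoff
    return [c for c, k in zip(correspondences, keep) if k]
```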
So, to summarize the ICP algorithm with these variants: initially we potentially subsample our points to a smaller point set; then we determine the corresponding points based on this subsampled set; we may weight some of those pairs if we have additional information; we may reject pairs, i.e. correspondences, with very large errors; then we go back to the original approach, compute the rotation matrix and translation vector using the singular value decomposition, and apply this transformation to rotate one point cloud onto the other. We can then compute our error function and decide: has the error decreased, and is it still larger than a certain threshold? If so, we repeat the process. Typically I restart at determining the correspondences; I typically do not redo the subsampling, because that would change my correspondence set completely and lead to rather unstable behavior. So I basically repeat from determining the new correspondences until I obtain my final alignment.

If I use this in practice, one important use is to incrementally align scans, for example from a moving vehicle. Say I have a robot which drives around; we have some odometry information, basically counting the movement of the wheels; the robot takes a 3D scan, moves a bit further, and takes another 3D scan. Each 3D scan is a 3D point cloud at a point in time; the odometry of the movement in between is not perfect, but it gives me an initial guess, and then I can perform ICP between the two point clouds i and i+1 and align them. By aligning those point clouds using the odometry as an initial guess, I get an improved transformation which tells me how the robot actually moved from here to here, and then I can stack those transformations together and build a map of the environment. This works well for laser range scanners that measure a point cloud, for range cameras or RGB-D cameras, but I can also use a stereo camera or other data that provides 3D information.

Just a few examples: what you see here is a robot that we built to explore Roman catacombs within a European project. It used a mapping system taking into account the data from Kinect-style sensors — actually from three different Kinect-style sensors installed at the front of the platform, perceiving the environment — registering them with the help of the ICP algorithm into a consistent point cloud and then running a SLAM algorithm. Here on the side you can see, for example, part of a scan which has been assembled from multiple of those Kinect scans. The approach used here is the so-called NICP, a variant of the ICP algorithm developed by Giorgio Grisetti and his team in Rome, which especially takes normal information into account to improve the registration and make it more efficient.

As another example, I also want to show how this can work on an autonomous car. What you see here is the so-called KITTI dataset, where a vehicle drives around with a 3D lidar — in this case a Velodyne 64-beam laser — installed on the vehicle. The approach registers the incoming 3D point clouds against each other and builds a graph-like structure which is then used in a SLAM algorithm to estimate where the vehicle is going. So far we are still driving open loop — there has been no loop closure or similar information — so all the pose corrections here are basically done using the ICP registration. You can see the car drives over a bridge, over that road, and drives around; this is all basically ICP registration. It is work developed by Jens Behley, who uses surfels, i.e. surface elements, to match against the current point cloud, and uses a projection technique to project the world into a very local view, a range image of the scan, and performs the alignment using that range image. Now you can see the vehicle re-enters a known part of the environment, and then you see the SLAM system kicking in and doing some corrections — you saw the map shaking a little bit; this was an optimization due to the pose graph, something we will look into later in this course — but most of the registration you see here was done using the ICP algorithm. So you can build consistent maps over extended periods of time using this technique, and it is a key ingredient for a lot of the SLAM systems which are out there.

This brings me to the end of the lecture for today. The alignment of 2D and 3D point cloud data — data about our world, or the geometry of the world — is an important task in perception. Aligning scans is needed whenever you have multiple scans or observations of the same environment taken from different viewpoints, at different locations or at different points in time, and you want to register those observations. This allows you, first, to build a consistent point cloud of the scene, but it also allows you to estimate where the camera or the sensor was when the observations were taken.
The ICP algorithm is today kind of the gold standard approach for doing those computations, for calculating the transformation between two scans in terms of translation and rotation. Again, we have been looking here at data coming from a lidar or from a depth camera, so we ignored the scale, which can play an important role if you look at cameras where the scale information may be missing; that is something we have not considered here. The basic approach assumes that we have an idea about the data association, so it requires an initial guess of the data association, and given the data association, the SVD-based approach is a very effective direct solution for computing the transformation — the rotation matrix and the translation vector — that aligns the two point clouds, assuming the correct data association. And this is the major problem we face with the ICP algorithm: the data association is not given. We do not know the correct data association; we need to guess it, and we typically guess it based on the initial pose information that we have, for example from odometry, from a GPS sensor, or from some other source. If we have no idea at all about where we are, we basically have to do a brute-force search, which is computationally very expensive and also carries the risk of wrong alignments, especially if you have perceptual aliasing, i.e. scenes which look very similar.

So what the ICP approach does is employ an iterative scheme: it computes a data association with one of the techniques that I described, given the first estimate of where the platform is, then performs an alignment minimizing the error function, giving new translation and rotation parameters; the scans are aligned, and then I check again what the data association looks like given this improved estimate. I basically iterate back and forth — data association, optimization, data association, optimization — in iterative steps until I do not make any update to my rotation and translation parameters anymore, and then I assume the process has converged. Several variants of the ICP exist — point-to-point, point-to-plane, projection-based techniques, different outlier rejection techniques, all the things we have discussed — and there is a whole zoo of different approaches, as well as a few good overview papers which tell you which variant may be used in which situation, or what has been shown in practical examples to work well.

We also have to say that ICP does not always converge to the correct solution; that is the main problem, and the main reason for it is mistakes in the data association — or the scene has changed, or other issues, which can all be subsumed under mistakes in the data association. If we have the right data association, everything is good; if we do not, there is really no guarantee that we do a good job. In most situations ICP performs surprisingly well, but if you have to align hundreds of thousands of scans, there will be situations in which you make errors, and you need to take those errors into account in your SLAM system, for example by having ways of dealing with those outliers. There are different techniques, which we will investigate in the remainder of the course, for dealing with outliers: again, one idea is RANSAC, a sampling-based approach; another is optimizing not a squared error function but using robust minimization with robust kernels in order to perform better.
But with this we are done for today. You should now have an idea of how to align 2D or 3D data taken from the real world, from different locations but of the same scene, and how to align the scans — that means generating one consistent point cloud and, at the same time, getting an estimate of the relative transformation between the scanner locations at which the observations were taken. This information is also something we sometimes call a virtual observation in the context of pose-graph SLAM, and it will be exploited in the next lectures. With this, thank you very much for watching this video.
Info
Channel: Cyrill Stachniss
Views: 19,904
Keywords: robotics, photogrammetry
Id: djnd502836w
Length: 51min 43sec (3103 seconds)
Published: Fri Apr 03 2020