MSR Course - 07 Particle Filter (Stachniss)

Welcome everyone to the course. Today we are continuing with state estimation problems and with Bayes filtering, but now using a variant of the Bayes filter that was introduced last week. In the last lecture Naveed introduced the extended Kalman filter and extended Kalman filter based localization; that means we were concerned with estimating the position and orientation of a mobile platform given observations and controls, using an extended Kalman filter. The key properties of the Kalman filter are that it requires Gaussian distributions, so everything is assumed to be Gaussian, and that all models are linear, meaning the motion and observation model can be described by linear functions. In order to use it for real-world applications, where most things are nonlinear, one introduces the extended Kalman filter, where you perform local linearizations of your transition function and of your observation function in order to approximate them and still be able to apply the Kalman filter. The Kalman filter is a very popular state estimation framework which provides good estimates as long as your probability distributions are not too far away from Gaussians and your models are not too far away from linear models. There is actually a dependency between the uncertainty of your process and the nonlinearities involved in your system: you can tolerate somewhat stronger nonlinearities if your uncertainty is very small, for example because you have a very precise sensor or very good motion estimates; then you can deal with more nonlinear cases even though they violate quite some of the assumptions. If you have larger uncertainties in your state estimates and distributions, it is better to have models or a setup where the nonlinearities are not too high, because the whole uncertainty needs to be propagated through your linearization, and with large uncertainty you may end up far away from your linearization point, your current initial guess.

Today we will look into something different: a technique which tries to avoid the problem of restricting ourselves to a certain type of distribution. We want to relax the assumption that everything is a Gaussian distribution, and we also want to be a bit more flexible with respect to the models we are using. For that we again use a Bayes filter, but an alternative technique which doesn't use a mean and a covariance matrix to represent the underlying distributions; it uses something that we call particles, or samples. It is an alternative approach which is typically computationally more costly, because it is more complex to represent these probability distributions: it's no longer just a mean vector and a covariance matrix that perfectly represent our distribution; we use sample-based approximations of these distributions, and then we'll see how we need to manipulate those distributions in order to achieve a filtering behavior and obtain a Bayes filter.

Okay, having set the context, we can now dive a bit more into the details. The first question we want to answer is "where am I?" This is a localization problem, similar to what Naveed has shown with landmarks last week, but I'm looking into grid maps this time. There are two ways you can do this: with landmarks or with grid maps. Especially for the Kalman filter, people typically use landmarks, because they are the slightly more natural representation: if you think about an object you want to observe in the world and represent with a Gaussian distribution, it is easier to think of it as a point, which fits the landmark-based setup. For us here it doesn't really matter, so I chose the occupancy grid map.
We assume we have an underlying map of the environment, as in the first lecture: everything which is dark is occupied, everything which is bright is free. If you now want to represent that the robot is standing over here, you could represent this with a Gaussian distribution in the Kalman filter world. We now want to go for an alternative representation by saying: yes, I know the robot is somewhere over here, but I don't know exactly where, and we use samples instead. This could look like this: let's say the blue dot here is the current estimate, or maybe the true position, and those samples are points spread around in the space. Every red dot you can interpret as one hypothesis: the robot could be there. It's a guess. You can see it as me asking every one of you in the classroom where the robot is: one says the robot is here, the next one says no, it's here, the third one, no, it's here, and so on. We collect a number of guesses, of hypotheses, and these hypotheses are those red dots over here. The hope is that by using a high number of those guesses we actually get a good representation of the uncertainty about that platform.

We can also see this from the state estimation point of view: we start somewhere, and I tell all of you where the platform starts; then you're not allowed to talk to each other anymore, you only get the perceptions and the motions of the platform, and every one of you should draw a path of where the platform is on a piece of paper. Then I collect all those papers and stack them on top of each other; every one of you is an individual guess of the trajectory of the platform, and by taking all of them into account we can get a distribution of where the platform was. If there were just two people sitting in the classroom, the estimate may not be that good; with 20 people it may get better, with 200 even better, and with thousands of people it should get better still. So the more of those samples we have, typically the better we can represent these probability distributions. That's the idea of those sample-based representations: we have guesses or hypotheses, and we use the set of those hypotheses to describe probability distributions.

Again, the Kalman filter did it with a parametric form: there are no random guesses in there, just one continuous function describing the probability distribution, and the only two parameters in there are the mean and the covariance matrix, so we get a perfectly nicely shaped closed-form setup. That's something we get rid of in the particle filter, because we don't know how our distribution actually looks. Often things are Gaussian or close to Gaussian in the real world, but there are a lot of setups where they are non-Gaussian as well, and we don't want to restrict ourselves to a certain distribution; therefore we use these samples as random guesses, because it's one of the most flexible ways to do that.

Okay, so let's say we have a distribution that looks like this, and we want to represent it with samples. We have guesses, and they may even be weighted guesses: I could say some of you, I know, are pretty good estimators, others are not that good at estimating, and I trust those who are good estimators a bit more. That's one way I could introduce weighting into the sampling: if you're a very good estimator, you get a higher weight than someone sitting in the back, so your estimate counts more than the others'. Or, as we will see later on, I can relate the weight to how well your estimate actually explains the observations that we get. So I tell you the platform got a command to move a meter forward.
Where is it? Most of you will draw it somewhere at a meter, maybe a meter two, maybe a meter three, so each of you produces an estimate, a hypothesis, and I simply check how well each of these estimated poses is in line with what we observe. Those which did a good job, I say okay, I trust you a bit more and increase your weight; those which did a worse job, I don't trust that much and reduce their weight a little bit. So we can use the weighting, as we will see later on, to express how consistent the estimate from our prediction is with what we're actually observing.

Now, if I want to use samples to represent this distribution, I could put samples into the space, let's say more or less uniformly over my state space. If they are uniformly distributed over the state space, that intuitively means I can't say where it is: I'm uncertain, it could be more or less anywhere in this area, in this interval, and I don't know which region is better or worse. But this is not really what this function represents, right? This function clearly has a higher value in the middle than here at the sides. So what we can do is say: this sample is simply more important, because the function value here is higher, so let's increase its weight; and where the function value is small, we decrease the weight. We can then see this as a kind of one-dimensional representation of our probability distribution. If we go through it bit by bit: the function is basically zero in this interval, then it gets a little bit bigger, then there's a steep increase, a very high increase, then it drops, it drops dramatically, and then goes down to zero. I can reproduce this just by counting the number of samples which lie in the interval between my two fingers, or by summing the weights of those samples, and sliding this window over the area; this then actually represents the probability distribution. That's basically the idea of those particles.

So we are getting slightly closer to the mathematical definition: the samples are hypotheses, these hypotheses have weights, and what we're basically doing is defining small intervals and asking what probability mass is concentrated in each interval. We just count the number of samples which lie in the interval, or sum the weights of the particles in it, and that gives us the probability mass concentrated there. This is what I do with my sliding-window finger: it's zero, zero, zero, small value, small value, slightly bigger because now there are kind of two in here, even bigger, bigger, bigger, going to its maximum, and then going down again; this roughly represents the shape. Of course it should be clear that the more samples I have, so the denser I make this, the better I am able to estimate this probability distribution. If you have drawn only, whatever, eight or nine samples for a continuous function of that shape, this will in reality be a very bad representation, a very coarse approximation; but the higher the number of samples that I'm generating, the better this approximation gets.
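As a small illustration of this sliding-window counting, here is a minimal sketch (the function name and fixed-bin discretization are my own choices, not from the lecture): it approximates a 1D density by summing the normalized weights of the samples that fall into each small interval.

```python
import random

def histogram_from_samples(samples, weights, lo, hi, n_bins):
    """Approximate a 1D density from weighted samples: the probability
    mass of each small interval is the normalized weight of the
    samples that fall into it (the 'sliding window' idea, with bins)."""
    width = (hi - lo) / n_bins
    mass = [0.0] * n_bins
    total = sum(weights)
    for x, w in zip(samples, weights):
        if lo <= x < hi:
            mass[int((x - lo) / width)] += w / total
    return mass

# Toy example: equally weighted samples concentrated around 0.5
random.seed(0)
xs = [random.gauss(0.5, 0.1) for _ in range(10000)]
ws = [1.0] * len(xs)
mass = histogram_from_samples(xs, ws, 0.0, 1.0, 10)
# The central bins carry most of the probability mass
```

With more samples (and narrower bins), the bin masses converge toward the true probability mass per interval, exactly as described above.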
do we need to store for every sample oh wait well that's one thing okay we need to store the size of those bubbles here all right so at least one variable should be weight and this is typically a real number I use double or flow depending on the precision that you're interested in what else do we need to store any years what else do we need to store exactly the post where is it is the tear in state space here here here here otherwise if I just have weights 0.1 0.3 0.7 it doesn't tell me anything because I don't anything about the state space so what wins the store is exactly which pose it is representing or which state it is representing that I'm interested in and its weight and these are the two quantities that I need to store so if I perform localization in the plane this would be X Y and an orientation so three dimensions plus my weight so four dimensions in sum if I go for localization the 3d world this may be XY that your pitch roll plus the weight so then would have seven variables I need to store or if I want to estimate something more complex let's say I want to have a vehicle and also want to estimate the speeds of the vehicle so let's say we have a cab on Drive then we have our XY and orientation we may need to put our steering angle in there and we need to put the velocities in there then I would be 6 plus 1 7 degrees of freedom although I'm just moving the 2d plane right so need to put in the quantities you're interested in estimating into your state space into your state variable okay so in formalizes we say we have a set of them so we have multiple of those samples so the set of samples X are the state hypotheses so this in our case in the plane is XY and orientation and the weight and we have this J over here this is just an index over the samples so capital J is 10 that means we have 10 samples that means we have 10 guesses where those particles are okay so ever now say we are taking for every sample in this sliding-window fashion into account how 
Now we take into account, in this sliding-window fashion, how many samples fall into an interval and what their weight is; that means the probability mass is given by the weight of the samples at the points where the samples are. So if we want to represent one single sample as a distribution, we need a distribution which is zero everywhere, takes a peak value at that state, and then goes to zero again, and the height of that peak is the weight of that sample. This is what we intuitively described with the sliding window. If we formalize it, we can write the probability distribution represented by my sample set over the state space x as a sum over all samples: every sample makes a contribution with its weight, so every sample takes its weight and multiplies it with this function over here, which is a Dirac function: a function which is zero everywhere and just has a peak at the state of the sample, where it is infinitely large, and then goes down to zero again, with the constraint that the integral of this Dirac function is 1, so that it is a valid probability distribution. We basically concentrate all the probability mass into one state, and if you integrate over the whole state space you get 1; you can make the interval around that state as small as you want and you will always have probability mass 1. You can see it as a Gaussian distribution where you squeeze the variance, make it smaller and smaller and smaller until it approaches zero: that's basically a Dirac distribution, intuitively. So we have a sum of those Dirac distributions; every Dirac distribution has an integral of 1, so if we multiply them with the weights and make sure those weights are normalized to sum to 1, then this turns into a valid probability distribution.
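Written out, the representation described here is (using the standard particle notation, with $w^{[j]}$ the weight and $x^{[j]}$ the state of sample $j$):

```latex
p(x) \;\approx\; \sum_{j=1}^{J} w^{[j]} \,\delta\!\left(x - x^{[j]}\right),
\qquad
\sum_{j=1}^{J} w^{[j]} = 1,
\qquad
\int \delta\!\left(x - x^{[j]}\right)\, dx = 1
```

Each Dirac term integrates to 1, so with normalized weights the weighted sum is itself a valid probability distribution.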
Okay, is this idea clear, or should I repeat some of my explanations? I think it is crucial for the understanding of this filter. Is there anything which requires further explanation? It is a weighted sum of functions, and the functions are Dirac distributions. If you have problems getting this Dirac distribution into your head, think of them as Gaussians with extremely small standard deviation, basically going toward zero.

Okay, so what can we do with this? Here are just two examples of functions: this one is actually a Gaussian distribution, this one is not a Gaussian distribution, and we are drawing samples; these are now non-weighted samples representing this space. There are two ways I can represent a function with samples; maybe I should have said that before. One is with these weighted samples, as I have shown before; the other is with samples which all have the same weight, basically 1 divided by J. Then all the samples have the same importance, but I need to make sure that the samples are concentrated in areas where the probability mass is high: around the mean, the density of samples in the state space must be higher than in the outside regions over here. And if the distribution of the states x is different from that, I can use the weights to account for the difference between the two. What you see here is the simple way, where all weights are identical, so the only thing you need to look at is the position of those small lines: every line is one sample; these are, I don't know, 300 samples or so, drawn from this distribution. So I can use a Gaussian, f of x in this case, to represent my function, or I can use my samples to represent that function. Clearly the samples are an approximation: you can see that in this area and in this area no samples have been drawn at all.
Yet the probability mass there is larger than zero, because the integral of the function over such a small interval is larger than zero; so the sample set is only an approximation. Also, here we have basically two peaks, one peak here and one peak here, in how the samples are distributed; you can't really see how dense they are because of the resolution of the plot, but you can see that it's kind of more blackish here and here than in the outside areas; hopefully that is visible. Okay, so now that I have hopefully convinced you, or at least it's not too magical for you, that you can use samples or hypotheses to represent functions in an approximate manner, the main question is: how do we actually get those samples? How do we generate samples which are distributed like this? Typically we can draw random numbers uniformly; the question is whether we can use this to somehow generate samples so that they follow a certain distribution. You can imagine that it's possible, because at least most of you have already used a sampler for a Gaussian: most of the toolboxes, MATLAB, Python, or whatever you have been using in the past, provide ways to draw samples from a Gaussian distribution. We also discussed, two weeks ago when I was teaching last time, an approximate way of doing this for a Gaussian distribution: you draw twelve random numbers between minus and plus the standard deviation of the Gaussian you want to draw from, sum them up, and divide by two; this gives you approximately Gaussian-distributed samples, which look similar to those generated here. So for a few distributions there are ways to do this in an efficient manner: we can do it for the Gaussian, and we also showed, at least briefly, that we can draw from a triangular distribution. So for a few functions we are able to do this.
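The twelve-numbers trick can be sketched like this (a rough central-limit-theorem approximation; the function name is my own, and I assume the bound b is the standard deviation of the target Gaussian):

```python
import random

def sample_normal_approx(b):
    """Approximately sample from a zero-mean Gaussian with standard
    deviation b: sum twelve uniform draws from [-b, b] and halve the
    result (central limit theorem)."""
    return 0.5 * sum(random.uniform(-b, b) for _ in range(12))

# Sanity check: the empirical mean and spread of many draws
random.seed(1)
xs = [sample_normal_approx(2.0) for _ in range(100000)]
mean = sum(xs) / len(xs)
std = (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5
# mean should be close to 0 and std close to 2.0
```

The halving makes the variances work out: each uniform draw on [-b, b] has variance b²/3, twelve of them summed give 4b², and scaling by 1/2 scales the variance by 1/4, leaving b².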
But it's actually tricky to do this for general functions. How can we sample from functions other than a Gaussian? If we only dealt with Gaussians, we could stick with our Kalman filter and wouldn't need to change anything; we want to leave the Gaussian world, so the question is how to generate samples from an arbitrary function. That's a question I want to elaborate on with you a little bit more. Consider this is my x, this is p of x, and this is my function, my probability distribution, a bit nasty, and I want to generate samples from it. That's my homework assignment for you: you get this function from me, and you should generate samples from it. I don't care whether what you do is efficient or inefficient; just think about how you could generate samples from this distribution. You don't have to implement it now, of course, but I would like to discuss what an intuitive way of generating samples for this type of function would be. Any ideas? Okay, one proposal was to fit Gaussian distributions to those peaks and then sample from the Gaussians; that's actually something we could do, that's true. So let me be a little bit mean and change the function so that approximating it by Gaussians gets harder; for an arbitrary function, fitting a lot of Gaussians is also not necessarily the most trivial solution. It may be a good idea if we are close to a Gaussian, but let's consider something which looks more like this, with flat plateaus, where Gaussians are not the best thing to do. Nice try, good idea, and we will actually use something similar later on for doing it efficiently, but first I want to look into general distributions. How would you approach this problem?
Okay, so what do we want? Do we want a lot of samples in this area? Probably not. A lot of samples in this area? Oh yes; and a little bit fewer here, a little bit fewer over here. So it depends on the function value: the higher the function value, the more samples we want to generate. Do we want to generate any samples at all out here, where the function is zero? No, because if the function is zero there, there is zero chance that we end up in that state, so we never ever want to generate a sample there. That's the first thing we know: we only generate samples where the function is larger than zero; everything else I don't care about. And it should be more likely to generate samples in areas where the function value is high than where it is low. So what I do now is draw a box around my function, and I want you to see this as a 2D thing: one dimension here, the other dimension here. I want to generate samples that lie in this box, with many of them where the function value is high and very few where it is low. If you don't know how to generate samples with a preference, what you can do is randomly generate samples and then throw some of them away. So what could we do with this trick, this box, and this idea of keeping some samples and rejecting others, in order to get samples that represent this distribution? Yes, exactly: we just randomly generate dots in this 2D space, without taking care of the curve, so assume this is uniformly generated noise over my space. If I now just took the x locations of these dots and wrote them down, I would get a uniform distribution of samples, and this uniform distribution doesn't help me, because there are too many samples in areas where the function value is small and too few where it is high. So what can we do in order to get rid of those which are wrong? I like my samples, but I generated too many in certain areas, especially where the function value is small, so I want to get rid of those. What can I do? Yes, that's basically what we are doing: we look at the ratio of this distance and this distance with respect to each other. If this is the maximum number of samples I can get, I want a number of samples which is proportional to the height of my function value: the higher the function value, the more samples should survive. So all the samples which lie below the blue curve I keep, and all the samples which lie above the blue curve I throw away. I take my sponge, my rejection sponge, and wipe away everything which lies above the blue curve. Then I see: all samples here are gone, all samples here are gone, and the number of samples I keep depends on how high the function values are. If I now write down the x locations, I get a lot here, samples concentrated here, samples concentrated here, and then it goes down to zero. What you are doing here is called rejection sampling. Rejection sampling is a general form of sampling with which I can generate samples from any distribution, any function.
The only things I need are the ability to evaluate the function point-wise and a random number generator; I just check whether the drawn number is larger or smaller than the function value at that point. I also need to be able to draw a box around the function, so if the function is infinitely large, that's a bit tricky; there are some constraints, I need to bound it, in order to do this in at least a reasonably efficient way. The main problem of rejection sampling, which is otherwise general and nice to use, is that it is extremely slow. Let me draw another function, and then you may realize why. Let's say we have a function which is basically a Dirac function, or a very peaked Gaussian; not exactly the Dirac, because the box would be very small then, but a Gaussian with a very small uncertainty. If I now again draw my red box around it, covering the area where it is non-zero, draw random samples in this box, and take my great rejection sponge over it, maybe nothing is left except perhaps one sample over there. The problem is: if the area under the curve is small compared to the area I'm sampling in, I will reject the majority of the samples that I'm generating, and this makes rejection sampling very inefficient. In the earlier case it's not too bad: say 50 percent of the box is the area under the curve, so I'm throwing away half of the samples; I can probably live with that. But in a setup like this it's a disaster. That's the reason why rejection sampling doesn't really work well in practice; I mean, it does work, that's not untrue, but it is so inefficient that it renders most algorithms useless, at least for online state estimation, if you need to put rejection sampling in there. Okay, so that was the most general form of sampling.
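The box-and-sponge procedure can be sketched in a few lines (function names and the triangular test density are my own choices; only point-wise evaluation of f and a bound f_max are needed):

```python
import random

def rejection_sample(f, lo, hi, f_max, n):
    """Rejection sampling: draw (x, y) uniformly in the bounding box
    [lo, hi] x [0, f_max]; keep x if y lies below the curve f(x),
    otherwise reject and try again."""
    accepted = []
    while len(accepted) < n:
        x = random.uniform(lo, hi)
        y = random.uniform(0.0, f_max)
        if y <= f(x):               # below the curve: keep the sample
            accepted.append(x)      # (above the curve: implicitly rejected)
    return accepted

# Unnormalized triangular density on [0, 2] with its peak at x = 1
f = lambda x: 1.0 - abs(x - 1.0)
random.seed(2)
xs = rejection_sample(f, 0.0, 2.0, 1.0, 20000)
# Accepted samples concentrate near the peak at x = 1
```

Here the area under the curve fills half the box, so about half the candidates are rejected; for a very peaked function, almost all candidates would be rejected, which is exactly the inefficiency described above.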
The good thing, however, is that we are not completely lost; we can do other nice things, and one thing which is the basis for most of the algorithms is the so-called importance sampling principle. The importance sampling principle addresses this situation: I want to generate samples from a function f, that's my goal; we call this the target distribution, the distribution I want to have my samples from. The problem is that f has some weird form which I don't know how to sample from. But I know another function, call it pi, which I can define and from which I can draw samples; let's say it's a Gaussian, so I can use the formula from before with the twelve numbers added up, and I can draw samples from my Gaussian. This function pi we call the proposal distribution. The importance sampling principle then tells me: although you want samples from the target f, I allow you to draw samples from the proposal pi, draw those samples, and then account for the difference between the two distributions; and accounting for the difference is done by computing the ratio of the target divided by the proposal, evaluated at the point where I drew the sample. So let's make an example of how that looks. This is my function f; I can't draw samples from that thing, but I know how to draw from a Gaussian distribution, so let's take a Gaussian, drawn in a different color, something like this: this is my Gaussian, this is pi, and I know how to get samples from a Gaussian distribution. If I now draw samples from this Gaussian, I get samples maybe something like this: more samples where the function value of pi is high.
the report encephalic principle tells me assign a weight to all of those samples and the weight it's basically the ratio between F and PI so it's F divided by PI at that position so this means if F and PI like here have the same value color so this sample will have a weight of 1 because f divided by PI takes a value of 1 see my weight the weight of that sample f divided by pi is 1 ok let's take another well let's take this one over here maybe here what's the right of this sample so it's this value divided by this value let's say it's maybe one about 2/3 so it's 2/3 approximately because this let's say this is 1 unit 1 2 so 2/3 that's very well and I can do maybe the same here first with the opposite way and this sample over here here the blue one sorry no this was a mistake this was the opposite I was not 2/3 what 3/2 but here and here it would be let's say this is twice as big as this one so here would be approximately half okay so that's the way you go through all samples and give all samples of weight and you're basically saying here I have done too little so I've drawn from the blue but I should have taken more samples into account so I simply say I make the simply sample simply more important and here's that I did too much i drawn twice as many as i should do so simply give all of them have a weight we keep all of them we don't throw any one away but we just same they're half as important at the others they do this for all the samples later on I normalize that all samples sum up to one normalization but then I get samples and so it can be the case that samples are more dense in state space but have a low weight and therefore normalize out that they are more unit more densely populated there you can very think about this very similarly if you think about a uniform distribution if pie is a uniform distribution then you're sampling equally over the state space and then just give weights according to the function value itself and divide it by one so this is 
Since a uniform proposal assigns every sample the same density, wherever the function value of f is high you get high weights and wherever it is low you get low weights; that would also be a form of importance sampling. And if you do this in our example, where you want to draw from the blue function but only have the red one around, you draw from the red one, so the sample density is still the density of the red one; but where the red undershoots the blue you get high weights and where it overshoots you get low weights, and so the weighted sample set now represents the blue function, not the red one anymore. That is what the importance sampling principle tells us. The only thing you have to take into account is the precondition down here: there must not be any state where f is larger than zero but pi is zero. Why is this the case? If pi is zero somewhere, we will never, ever draw a sample there, and if we have no chance of drawing samples there, the probability mass of f at that state will always be represented as zero. It would also mean that if we did have a sample there, its weight would be infinitely large; but we will never draw one, so the represented probability mass stays zero, and therefore this situation must be excluded. So in all states where the target f is larger than zero, the proposal pi must also be larger than zero. If this precondition is not fulfilled, you can of course still use the importance sampling principle, but you will not correctly approximate f; you will approximate an approximation of f, namely the function obtained by taking f and setting it to zero wherever pi is zero. Think of a Gaussian: if your target f is a Gaussian and you restrict your sampling to an
interval, that basically means you take your Gaussian and set it to zero outside the interval. If you make the interval, say, plus/minus six sigma, you could say "I'm fine, I don't care what happens outside the six-sigma bounds", but you need to be aware that this is again an approximation. So if you want to be as precise as possible, you need to take this precondition into account, and you also need a large number of samples: the more samples you have, the more precise you get, and as the number of samples approaches infinity, you can approximate the function arbitrarily well. Okay, is this clear, this idea of representing functions with samples, of generating samples, and the importance sampling principle? [Question from the audience.] Yes, given that it is randomized, two runs will not give the same result, because you generate samples at different positions; even running it twice with the same distributions gives different results, because you draw different random numbers. But the more samples you draw, the closer the results come together, because the more samples you draw, the better you approximate your function f. So probabilistically it is the same thing, although the actual instances differ. Where the difference shows up in practice is that you only have a small number of samples, or you want to reduce the number of samples for efficiency and memory reasons; you seldom use millions or trillions of samples, typically more like 2,000 or 3,000. And at low sample counts it matters how close your proposal is to the real distribution: if a lot of samples are generated in areas of the state space that are irrelevant, you don't have enough samples left to represent the important areas well. So in practice you want proposal distributions that are as close to your target as possible.
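One can also see numerically what happens when the support condition is violated. In this sketch, where the target, the proposal intervals, and the sample count are all invented for illustration, the target is a standard Gaussian and the proposal is uniform on an interval; restricting the interval to [-1, 1] means the estimate can only ever reflect the truncated Gaussian:

```python
import math
import random

def normal_pdf(x):
    """Standard Gaussian density -- the target f."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def estimate_second_moment(n, lo, hi, rng):
    """Self-normalized importance sampling of E[x^2] under a standard Gaussian,
    using a uniform proposal on [lo, hi]; the constant proposal density cancels
    in the normalized weights."""
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    ws = [normal_pdf(x) for x in xs]
    return sum(w * x * x for w, x in zip(ws, xs)) / sum(ws)

rng = random.Random(0)
narrow = estimate_second_moment(20000, -1.0, 1.0, rng)  # proposal zero where f > 0
wide = estimate_second_moment(20000, -6.0, 6.0, rng)    # covers the +/- 6 sigma bounds
# 'narrow' approximates only the Gaussian truncated to [-1, 1], so it comes out
# well below the true value E[x^2] = 1, while 'wide' lands close to 1.
```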
That way you are the most effective. But this really only matters because the number of samples in a practical system is limited; if you had a supercomputer with infinitely many samples, you wouldn't care. Okay, any further questions? We're a little ahead of time, but I think we'll make a short break, and then we'll dive into state estimation with the particle filter, which uses all the ideas from here to perform state estimation. Five-minute break, and then we continue. So, in the second part of today's lecture I would like to use this idea of samples for representing probability distributions and actually turn it into a filtering algorithm. The particle filter, like the extended Kalman filter, is a variant of the recursive Bayes filter: we have an existing belief, we use our motion to make a prediction about where we are, and then we take the observations into account to make a correction of where we thought we were, which becomes our new belief. So the steps we have seen in the general derivation of the Bayes filter, and also for the Kalman filter and the extended Kalman filter, apply here as well; it is just a different realization of the same state estimation principle. The main difference to the Kalman filter is that the particle filter is a nonparametric approach. Nonparametric means we do not have a probability distribution that we can describe as a function in closed form; there are no parameters in it. The samples themselves, the hypothesized states, describe the probability distribution; it is not some artificial set of parameters that we need to put in. As we will see, the models we use also have something to do with samples, and the proposal distribution that we talked about is something we will use to define this
Bayes filter. As it will turn out, the prediction step of the particle filter, the motion prediction, turns into sampling from a proposal distribution, and the correction step, taking the observations into account, corresponds to the weighting step of the importance sampling principle. So basically we will use the observation to compute the weight and use our motion model as the proposal distribution. It then turns out that if we do it this way, we can interpret the procedure as applying the importance sampling principle to update our belief, and at the same time interpret it as a recursive Bayes filter. If we select the motion model as the proposal for our prediction, then in the derivation the observation model comes out as the weight in the correction step, and so, almost accidentally, it turns into a Bayes filter; there is an equivalence, an overlap, between what we can do with the importance sampling principle and the recursive Bayes filter. A general property is that the more samples we use, the better our estimate gets; one can actually show that as the number of samples goes to infinity, we can approximate any arbitrary distribution, which is a nice property: even if we leave the Gaussian world, with infinitely many samples the result will be as precise as it can get. In reality, of course, we will never have infinitely many samples, so we always make approximations; what we do is, in reality, always an approximation. But we should also not forget that in the Kalman filter we make approximations as well: we use a linearization of our observation model and of our motion model, say "okay, we did the approximation", then forget about it and claim we have an optimal estimator and everything is great, forgetting that we actually made assumptions and approximations beforehand. Here we will always visibly suffer from the approximation, because our representation of
the belief itself is the approximation. Okay, so the particle filter basically has three main steps. The first one, seen through the importance sampling principle, is to generate samples from a proposal distribution; seen through the Bayes filter, it means taking the motion into account to predict how the state of the system will evolve. That is step number one. Then there is step number two, which in the importance sampling world is: compute the weight, taking into account that you actually want to approximate a different function, the target, than the proposal distribution you used. What we want to approximate, the target, is prediction times correction, that is, motion times observation. So if you compute target divided by proposal, that is motion times observation divided by the proposal, which is just the motion, and only the observation remains. This is the second step, and it is the classical correction step of the Bayes filter, where we take the observations into account to correct the predictions we have made. And then there is a magical third step, which is not motivated by the Bayes filter at all, but which we need in order to keep our particle set healthy: the so-called resampling step. I will talk about it a little later, because it is not mathematically motivated from the Bayes filter point of view; we need it to be numerically better behaved, and therefore it is an important step. So, in short: step number one, sample particles from the proposal distribution, that is, sample the new state of every particle. For every particle, iterating over all indices j, I draw at time t a new state from a proposal describing where I believe the system will be, given the
previous state and my motion command; my motion model sits in here, and pi(x_t | x_{t-1}^j, u_t) is one example of how this could look. The second step is to compute the importance weight, where the weight is what I want to approximate divided by what I used up here, so the choice of the proposal has an impact on how the weights are computed; after some derivation we will get the result over here, and this is my correction step. The third step, the resampling step, draws samples with replacement from my weighted estimate and turns it into a new particle set; I will explain later why this is relevant for actually getting good estimates although it is not directly motivated by the Bayes filter. It basically has to do with the fact that we only have a limited number of samples; if we had infinitely many, we could skip that step. Okay, we can write this down as a particle filter algorithm in a slightly more formal way, but it does exactly the same thing. The input is the particle set at time t-1, the motion command, and the observation; everything with a bar is the predicted set, and this is the output set, both initialized as empty sets. Then I iterate over every sample: for every sample, generate a new pose, that is, estimate where the sample will be at the next point in time according to some proposal that I, as the designer, can choose; then compute the weight for this particle as target divided by proposal, the probability distribution I want to approximate, the belief, divided by the proposal I have chosen, evaluated at the sampled state; and add both to the predicted sample set. And this here at the end is the resampling step. That's it; it is not very complicated.
So how is this used in reality? Here is a 1D example of how it could look: a robot that lives in a one-dimensional world, and this is the belief over all its poses; the x-axis is the position estimate, the platform in reality is here, and this shows the state estimate (we will see later why it looks exactly like this). It means there is a little bit of probability mass saying the system is here, or here, or here; here a bit more; and here it is super densely populated, so most of the samples are located here, which corresponds to the actual location of the platform. And there are some measurements saying you could maybe also be over here, but with much smaller probability. So in this case the majority of the probability mass, say 90%, is concentrated here, and some other hypotheses remain possible. In the real world it could look more like this: this one looks very similar to a Gaussian belief, but that is not necessarily what you get in practice; in practice you more often see something like this. So here are three different states of my estimate. What do you see in the first one? Let me not explain it; let me ask and see if you can figure it out. What does it mean if the belief is in that state? Exactly: the robot has no idea where it is; it is basically a uniform distribution, or an approximation of a uniform distribution, over the space. It could be here, here, here, basically anywhere. This is actually how you typically initialize your filter if you have no idea where the platform is: just spread the samples randomly throughout the state space, of course only in areas covered by the map, assuming the system is not outside the map. Okay, and what does this one mean? The robot partially knows where it is; maybe we can be a bit more precise than that: yes, it is more or less a multimodal distribution, saying I'm either here, or here, or here, or here, and the rest is just a little bit of noise, very unlikely but not impossible. This is what typically happens after the platform drives around for
just a short amount of time, a few meters. You can see that this environment consists of corridors, so if you don't have a very long sensor range and you are here driving down, you could also be over there driving upwards, because the local observations you get would be very similar; or you could be in this corridor. What actually happened here is that the platform was going up and then right, so in reality the system has been here; but it could also have been coming down here and around this corner, or coming from over here and going right, or from here and going right. Taking this path or that path looks very similar from your local perception, or even the same path rotated upside down, going down and turning right here. This is why you have multiple belief modes. And after navigating around the next corner down here, the belief is actually pretty well concentrated around one point. How large this particle cloud is, how concentrated or spread out, basically depends on how precise your observations are and how precise your motion is. In this example it was actually a walking robot, a humanoid platform, which is very noisy in its motion compared to a wheeled platform, so the uncertainty tends to be quite a bit higher than with a wheeled platform. Okay, very good; so you get a feeling for what these particle distributions are, how they look, and how we can interpret them. You can also see from these examples that it is pretty important here to use something that can represent non-Gaussian distributions; this is clearly non-Gaussian. For "no idea where the platform is" you could still argue that we can represent it by a Gaussian centered here with a very, very large covariance, okay; but those intermediate states, while the system converges, would be very hard to model, saying I'm either here
or here, or here; and you will go through those intermediate states whenever you do what we call global localization, starting out without knowing where you are. Okay, then let's go to something called Monte Carlo localization. Monte Carlo localization is basically using the particle filter for estimating where the platform is. Our state space is now (x, y, theta), or whatever fits your setup, depending on the state space, and each particle is a pose hypothesis; this is what I used in my examples the whole time already. The standard choice, what we typically use, is to say: my proposal distribution is the standard probabilistic motion model that we discussed two weeks ago, p(x_t | x_{t-1}, u_t): assuming I know where I was in the past and which motion command I executed, where will I end up? How to go from the past to the future given my motion command; this is how I will generate my samples. Intuitively this is actually quite nice, because you can say: in my belief I have my particle set at time t-1, my hypothetical poses of where the system was at t-1; I can now simply pick every sample of my old sample set and say, okay, you think you were here, good, that is your x_{t-1}, you seem to know where you are; now execute u_t and see where you end up, with some uncertainty about where you are. So for every particle you can directly apply this equation: the particle says "I am here", and you answer "if you think you're there, great, go ahead and make your prediction". This gives you the proposal step. It is very easy: just pick every sample and apply the motion model to it, including the motion noise. It is not like the Bayes filter equation, where you have this magic integral over the old state times the motion model; it is very simple here, because we just pick every sample and propagate it forward, not caring about the rest. That is what makes it so easy to implement.
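Applying the motion model to every particle can be sketched like this; the noise magnitudes and the simple drive-then-turn model are invented for illustration and stand in for the probabilistic odometry model from the earlier lecture:

```python
import math
import random

def sample_motion_model(pose, u, rng, trans_std=0.05, rot_std=0.02):
    """Propagate one particle pose = (x, y, theta) given a command
    u = (translation, rotation): drive forward, then turn, each component
    corrupted by its own Gaussian noise."""
    x, y, theta = pose
    trans = u[0] + rng.gauss(0.0, trans_std)   # noisy translation
    rot = u[1] + rng.gauss(0.0, rot_std)       # noisy rotation
    return (x + trans * math.cos(theta),
            y + trans * math.sin(theta),
            theta + rot)

rng = random.Random(1)
# "Go one meter forward": every particle moves, but each with its own noise,
# so particles that started at the same spot fan out.
cloud = [sample_motion_model((0.0, 0.0, 0.0), (1.0, 0.0), rng) for _ in range(100)]
```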
So we do exactly that: we take every sample and propagate it forward. If one sample says "I am here at time t-1" and the command was "go one meter forward", okay, go one meter forward: that is the new pose of this sample. Sample number two was sitting here; you also go one meter forward. But to all of them we add a little bit of noise, because it is a noisy process, not a perfect one, and everyone gets slightly different noise. So if you have multiple samples standing at the same location, one will end up here, one here, one here; this is how you spread the motion uncertainty through the filter. Once we have done all that, it is time for step number two: take the observation model into account. We take our target distribution, which is the full belief of the platform, and divide it by the proposal distribution. If you still have the belief of the platform roughly in mind, the motion model was popping up in there together with the observation model, and it turns out that the motion model can be cancelled: it appears in the denominator, in the proposal, and also in the target distribution, so it basically goes away, and the weight reduces to a term which is proportional, not equal but proportional, to the observation model. What this basically means is: how likely is it to observe what I actually observed, given that I know where I am and given that I know what the world looks like? And again, that is easy to evaluate if we know the state, and every particle claims to know the state, saying "I know where I am and everyone else is wrong". So we just iterate over our samples again.
Each sample that ended up somewhere says: okay, from your point of view here, how likely is it to observe, given your view of the world and given that you think you are here, what the robot observed in reality? You basically compare what the robot observed with what you would observe from here. So if I, as a particle, say this pillar is exactly 3 meters in front of me, but the robot says no, it's not 3 meters in front of me, it's 4 meters, then this is probably not a good pose estimate; it is much more likely that we are actually back here. So this sample gets a low weight. And the same for all the other samples: for the one that was here, the one here, the one here; I ask every sample, given that you think you are here, how likely is it to observe what we actually observed, and this gives me the weight. It is only proportional, not exact, but in the end I don't care, because I need to normalize my weights anyway so that they sum up to one; the normalization constant can be safely dropped without making any mistake. And this is the process I then iterate. If I go back to my particle filter algorithm, I can use exactly the same algorithm; only the two things underlined in red need to be changed, to be made concrete. Again, the proposal distribution is a user choice: I, as the designer of the algorithm, can choose any function I want, given the precondition we discussed before. But I tell you, it is a smart choice to take the motion model into account; you don't have to, but if you do, your life gets easier. If you choose the motion model over here, then, as a result of some derivations, you can see that with the motion model as your proposal the weight is
proportional to the observation model; this comes out as a consequence of that choice, and these are the only two things you have to change. So you take your motion model into account for the motion update, which also makes intuitive sense, and then take the observations into account in order to correct, to tell how good the idea was to predict that motion; and with this choice you have the equivalence with the Bayes filter framework. Okay, let's make an example. We have a platform sitting over here, a robot living in a 1D world; this is the sample space, and the robot has no idea where it is. It has a sensor, a door sensor, which says "door" if it is in front of a door and says nothing if it doesn't see a door. So the robot drives through the environment, and at some point in time the sensor says "door" for the first time. Okay, where am I now? If I don't know anything else, what is the likelihood of each pose given that the sensor said "door"? Away from the doors it is pretty low; there is some probability that the sensor is screwed up and mistakenly says "door" although there is no door in reality, and that probability is the same everywhere, since all the wall pieces look the same. Then, in front of a door, the probability increases: if I am in front of a door, the sensor is likely to say "door"; at the edge of the door the probability may be slightly smaller, and in the middle of the door the sensor should be very confident. And of course the robot can also miss a door, because, whatever, the door was painted in a different color and my sensor doesn't work perfectly; there is some probability of making a mistake. This is the likelihood of the observation given the pose, p(z | x).
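The pillar example from above can be written as a tiny weight function; the landmark position, the 0.2 m measurement noise, and the range-only sensor are all invented for illustration:

```python
import math

def observation_weight(pose, z_measured, landmark, sigma=0.2):
    """Weight of one particle: likelihood of the measured range to a known
    landmark, given that the particle claims to be at pose = (x, y).
    Proportionality is enough, since the weights are normalized afterwards."""
    z_expected = math.hypot(landmark[0] - pose[0], landmark[1] - pose[1])
    return math.exp(-0.5 * ((z_measured - z_expected) / sigma) ** 2)

# Particle expects the pillar 3 m ahead, but the robot measured 4 m -> low weight.
w_bad = observation_weight((0.0, 0.0), 4.0, (3.0, 0.0))
# A particle whose expected range matches the measurement -> high weight.
w_good = observation_weight((0.0, 0.0), 3.0, (3.0, 0.0))
```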
So what I can do now: this was my current belief, and I correct it with my observation by saying that samples at locations with a high likelihood get boosted up and the others get weighted down. What I do is iterate over all samples: I take this sample and multiply it with this weight, this sample with this weight, this one with this weight; oh, this one is a really good one, so it gets pumped up. In the end, of course, you need to normalize again so that the weights sum up to one, and then the probability is higher here. Then you can take the next motion command into account and actually move the belief forward: if the platform continues in this direction and goes another meter forward, all these beliefs will move a meter forward, plus a little bit of uncertainty, because the motion adds uncertainty; the samples will jiggle around. Now, what I'm doing in this illustration, going from here to here, actually involves a second step: the resampling step. Let me explain it now, otherwise the following plots won't make much sense. What the resampling step does is this: if I have samples with a very high weight, I replace them by multiple samples at the same location, so that afterwards all samples have the same weight again. You basically take areas of the state space with very high probability and replicate the samples there while reducing their weights. If there is one sample with weight two, I can replace it with two samples of weight one each; that is intuitively what the resampling step does. It does this by drawing samples from the original sample set with replacement.
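The door correction just described, multiplying each hypothesis by the observation likelihood and renormalizing, can be sketched on a discretized 1-D world; the ten cells, the door positions, and the sensor probabilities below are invented for illustration:

```python
def correct(belief, likelihood):
    """Multiply each hypothesis by the observation likelihood at its location
    and renormalize -- the 'door' correction step from the example."""
    posterior = [b * l for b, l in zip(belief, likelihood)]
    total = sum(posterior)
    return [p / total for p in posterior]

# Uniform prior over 10 cells; doors at cells 2, 5, and 8.
prior = [0.1] * 10
# Sensor said "door": high likelihood at door cells, small error probability elsewhere.
p_door = [0.6 if i in (2, 5, 8) else 0.05 for i in range(10)]
posterior = correct(prior, p_door)
# Probability mass gets boosted at the three door cells and weighted down elsewhere.
```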
The probability of drawing a sample is proportional to its weight: the higher the weight, the more likely it is to be drawn. I do this with replacement, and I repeat the process until my sample set is filled again; if there is space for a thousand samples, I draw a thousand times, and that is my new sample set. What I'm doing with this is basically replacing weights by frequencies: a sample with a high weight will later on have a high frequency, it will appear more often. What I'm saying with this approach is: if there are areas of the state space which are very, very unlikely, why should I waste costly particles on them, costly because I can only afford to represent a few? It is better to ignore some of the unlikely states and use those samples to represent states which are very likely. So I basically take the unlikely parts of the state space, cut them away, and use the memory that was spent on those samples to represent the likely positions instead. Again, this is something I need to do because I have only a limited number of samples, only memory for, say, a thousand samples; with an infinitely large computer I wouldn't need it, but what I want to avoid is that all my samples sit in the unlikely areas of the state space and I have no memory left to represent the good states. That is the reason why, from here to here, the weights all go back down and the unlikely areas are thinned out: the sample set shrinks towards the likely regions, more samples concentrate here, and then the platform moves forward again. So in this situation: is the platform here, here, or here? It looks more or less the same; I still have a lot of samples concentrated here, a lot here, and a lot here. And now the platform stands in front of the next
door, and again my sensor says "door", and again I take my sample set and multiply it with the weights. You can already see: there are a few samples over here, those which were originally at that first location; they basically say the first time the robot saw a door it was a mistake of the sensor, and now it is actually in front of this door. Then there are those samples which say: the first "door" was correct, and the second "door" is again correct; two times the right thing, and these get boosted a lot. And there are the samples over here which say: the first "door" was correct, I was standing in front of this door, and the second one is a mistake; those samples are still densely populated here, but their weight gets very small. These are the different explanations of the data that remain. And then I resample again: you see a lot of samples here in the very dark area, all with very high values, so a lot of probability mass is now concentrated here, and in the next resampling step all the other ones have a high likelihood of dying and being replaced by more samples in the probable region, and then the robot moves on. That is exactly what happens: most of the probability mass is now concentrated here, there are a few leftover states still populated across the state space, and as the platform moves you will see those states die out. There is an example I can show here; it is one of the very, very old videos on particle filter based localization, done with a sonar platform, you can see the sonar readings here, and this is the true trajectory of the robot; it is a little bit shifted, my drawing was not perfect, which is why it seems to go through the door, through the wall. It starts somewhere over here, turns around, and goes in here. This video was recorded in the old
computer science building nearby, because this is where these algorithms were developed, more than twenty years ago, for the first application, the first use of this technique for a localization system. It is actually one of the old videos done by Dieter Fox, at that time being a PhD student here, working on this Monte Carlo localization. So what you now see, and I will run the video two or three times so you can see what is going on, is always the currently best particle; in the beginning all particles are equal, so it is basically a random location where it is drawn, and it will jump around a few times at the start and then converge to one state. I'll first let it run once so you see it, and then slow down and explain a few things. ... That's very sad; okay, I have some problems with the video, so I will instead show you, just for the sake of explanation, a different example where I have the individual frames. So what do you see here? This is a particle set; this is the Smithsonian Museum of American History, and you see samples while the robot is operating, distributed all over the space. Then you get an observation for the first time. What you see here is the true pose, and the estimated pose, the best particle, and this is how the laser scanner sees the world. You can see, for example, that the estimated position is probably a little bit off; it should be a bit more to the right, because then these scans would not go through this obstacle, while here the scan actually seems to fit this obstacle really well. So it is not the perfect pose; and here there is probably a person standing, so not everything needs to match perfectly. Then the observation is taken into account, and the observation makes some samples more
important than others; this is the weighting step, and you may see that it gets a little bit darker right over here; it becomes clearer in the next steps. Then the resampling and the motion update are performed, and you can now see that the probability mass focuses at certain locations: there are several positions in the state space where the samples could be, it is still very uncertain, but you can see probability mass accumulating in some parts of the state space. Then we get the next measurement, and again the most likely pose is drawn, and you can see that for the best particle it already fits really well what the world actually looks like. You take this measurement into account, compute the weight update, and you should see things get a little darker here; then you perform the resampling step, replacing samples, keeping those with high weight and dropping those with low weight. Going forward and back, you can see that in the areas of high probability of the state space the samples remain, while the unlikely states are more likely to die out. Okay, the next thing is a motion update: you move everything forward, and the system still has high uncertainty, so multiple belief modes move around in the same way. Then you get the next measurement over here, and in the weight update you can see that it gets darker right here in the middle, so this is a very likely state, and the samples over here have lower weights. You perform the resampling again, and the probability mass concentrates in the likely states; then a motion update, the next observation, and by now the system has converged to a nice state; weight update, resampling, the estimate gets more precise, and with the next motion update and observation the system simply continues in this way. Okay,
So let's go back a few steps. We have seen the video — sorry for not being able to show the original one — and the last thing I would like to discuss, so that we finalize the particle filter today, is this magic resampling step, which I have so far only explained intuitively. Intuitively it is easily described: you take samples from your sample set and you draw them with replacement, and the probability of drawing a sample is proportional to the weight of that sample, so a sample with a high weight will be drawn with higher probability. Take a simple example with two samples: one has a weight of 0.333 and the other a weight of 0.666, so one weight is twice as high as the other. If I draw two new samples, on the first draw it is with probability one third that I pick the first sample and with probability two thirds that I pick the second. I take one sample out, store a copy of it, and — because I draw with replacement — look at the full set again, draw a second time, again being more likely to pick the one with the higher weight, and add it to my new set. Then I stop, because I only have space for two samples. So I am more likely to keep the good sample and lose the bad one. With two samples this is of course an extreme case, but with thousands of samples the same thing happens: you are basically replacing weights by frequencies, so a sample with a high weight is expected to have a high frequency, i.e. to appear often in the new sample set. On the one hand this is a survival-of-the-fittest principle: the samples which are good at explaining the observations are the fitter ones, and you keep them alive. On the other hand, you do this to avoid that a lot of samples cover unlikely areas of the state space — something you only need because you have a limited number of samples; with infinitely many samples it would not be necessary.

Okay, the question is how we actually implement this in an efficient way. You can visualize it with a roulette wheel. Compared to a real roulette wheel, the buckets where the ball can end up do not all have the same size; each bucket's size is proportional to the weight. A particle with a high weight gets a large bucket; a particle with a small weight gets a very small bucket. Then you spin the wheel, see where the ball ends up, and pick that sample. Or, envisioned with an arrow: you draw a random arrow, and the sample the arrow points to is the one you pick. Say we want ten new samples: for sample number one the arrow points at particle number three, so copy particle three into the new sample set; draw a new arrow — it is sample J, copy it over; draw again, pick that one, copy it over, and continue like this. This naive way of doing the sampling is called standard roulette-wheel sampling.

What is the complexity of this approach if you have J samples — how can you implement it so that the samples are drawn efficiently? Any ideas? Let's narrow things down: can it be done faster than linear time, where J is the number of samples? No, exactly — because there are J samples and we need to copy each of them over, so we need to touch every sample at least once; with a constant-time operation per sample we would be linear, so linear is a lower bound. Now, how expensive is one draw — drawing one random number, finding that sample, and copying it somewhere else? The copying itself is constant time, but that only holds once you know which sample to take. First you need to figure out which bucket the arrow actually points to, and you cannot afford to inspect all buckets, because that would again be linear per draw. What you can do instead is a binary search: store the weights cumulatively, for example in a tree, and search for the bucket, which is a logarithmic operation. So roulette-wheel sampling can be done in O(J log J): J because every sample has to be touched once, and log J to find the bucket each draw ends up in. This is the naive standard resampling.
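As a sketch of this naive O(J log J) roulette-wheel resampling — J independent draws, each located in the cumulative weights by binary search — here is a minimal Python illustration; the helper name and test weights are mine, not from the lecture:

```python
import bisect
import itertools
import random

def roulette_wheel_resample(samples, weights):
    """Naive resampling: J independent draws with replacement,
    each bucket found by binary search in the cumulative weights."""
    # Cumulative weights: bucket i covers the interval (cum[i-1], cum[i]].
    cum = list(itertools.accumulate(weights))
    total = cum[-1]
    new_samples = []
    for _ in range(len(samples)):          # J draws ...
        r = random.uniform(0.0, total)     # ... one random number per draw
        i = bisect.bisect_left(cum, r)     # ... bucket found in O(log J)
        new_samples.append(samples[i])
    return new_samples

random.seed(1)
# "c" has twice the weight of "b" and six times that of "a",
# so it should dominate the resampled set on average.
new_set = roulette_wheel_resample(["a", "b", "c"], [0.1, 0.3, 0.6])
```

Each draw costs one random number plus a logarithmic search, which is exactly where the J log J total comes from.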
But there is another sampling technique, which can also be seen as using a roulette wheel, with one difference: it draws all the numbers at the same point in time. You can picture it as putting J balls into the wheel at once and seeing where they all end up — or better, instead of balls, take J arrows (say eight, for eight samples), arrange them in a star with equal angles between them, and spin the whole roulette wheel just once. Where the arrows stop, you take those samples. The advantage is that you only have to go through your array once and select where the arrows ended up: you effectively draw a single random number — the position of the first arrow — and from there you can walk through the wheel and pick out all the other elements, because the remaining arrows sit at fixed, equal offsets. This is called stochastic universal resampling, also known as low-variance resampling.

The first advantage is that it is faster — J operations — because you draw one random number and then pass through the array only once to see where all the other arrows end up. But that is not the main advantage. The key property of this sampling strategy is that it preserves the sample set if the samples have similar weights. Think about what the weights tell us: a sample with a higher weight is better than another sample. Now assume all samples actually look pretty good — they all have more or less the same weight. If you then perform independent resampling, you still replace particles with other particles although there is no need to, because they all perform equally well; just by random chance you kill one of them and replicate another. You are thinning out the particle set — an effect called particle depletion or particle starvation. The good thing about stochastic universal resampling is that if all buckets have the same size and you have J buckets and J arrows, it exactly replicates the sample set: you randomly choose the sample to start with and then step in equal intervals through the sample set, so you pick every sample exactly once. As a result you maintain the sample set when the weights are very similar or equal, while a sample with a very large bucket will still clearly be taken more often. That is the key advantage of this approach: when your samples have similar weights, you do not throw away particles based on a weight distribution that is not informative. It is also no harder to implement — you do not even need the binary search tree — so it is simpler, it gives better results, and it is faster. Whatever you do, always use stochastic universal resampling, never the naive sampling; there is no reason to do that.

We are getting short on time, so let me just briefly mention how you actually implement this. Typically you have an array, and in every bucket of the array you store a weight. In practice you first draw a random number between 0 and 1/J — this is your starting point — and from there you want to pick all the remaining samples in steps of 1/J. To do this efficiently, you do not store the raw weights in the array but the cumulative weights: the first bucket holds the weight of particle 1, the second bucket the weight of particle 1 plus the weight of particle 2, the third bucket the weights of particles 1, 2 and 3, and so on, until the last bucket reaches 1 — simply the cumulative distribution of the weights.
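A minimal sketch of such a low-variance (stochastic universal) resampler, assuming normalized weights: the structure — one for loop over the arrows with an inner while loop advancing through the cumulative weights — matches what is described here, though the exact code and names are my illustration:

```python
import random

def low_variance_resample(samples, weights):
    """Stochastic universal (low-variance) resampling in O(J):
    one random number, one pass through the cumulative weights."""
    J = len(samples)
    r = random.uniform(0.0, 1.0 / J)   # single random start in [0, 1/J)
    c = weights[0]                     # running cumulative weight
    i = 0
    new_samples = []
    for j in range(J):                 # the J equally spaced "arrows"
        u = r + j / J                  # position of arrow j on the wheel
        while u > c and i < J - 1:     # advance to the bucket containing u
            i += 1                     # (the i < J - 1 guard protects
            c += weights[i]            #  against floating-point round-off)
        new_samples.append(samples[i])
    return new_samples

# With equal weights the sample set is reproduced exactly:
random.seed(0)
equal = low_variance_resample([1, 2, 3, 4], [0.25] * 4)
# equal == [1, 2, 3, 4]
```

Note the equal-weight case: the arrows land in one bucket each, so no particle is lost by chance — the particle-depletion problem of independent resampling does not occur.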
Then you draw your random number between 0 and 1/J and repeatedly add 1/J to it, each time checking whether you are larger or smaller than the stored cumulative weight. If you are smaller, you stay in the same bucket and copy that sample again, over and over, until the running value exceeds the cumulative weight stored there; then you jump to the next bucket and continue. So there is one for loop over the arrows and an internal while loop that advances through the buckets. I don't expect you to understand the four lines on the slide immediately — you need to sit down for at least five minutes to see how the indices add up — but it is a very simple algorithm: a for loop with a small while loop inside and a little bookkeeping with the indices, and with it you implement low-variance resampling in a single pass through your array, selecting exactly the samples that you need.

Okay, then I'm coming to the end. What I presented today were the ideas of representing functions and probability distributions with samples, and of implementing a recursive Bayes filter that tells you how to update this sample set so that we can perform recursive state estimation: we take the motion into account by propagating the particles forward, and we take the observations into account in order to correct, by weighing the samples — those which did a good job of explaining the observation get a higher weight, those which did a bad job get a lower weight. The key things to remember about the particle filter: it uses samples to represent the state space, so it is not tied to Gaussian distributions or other parametric forms. The prediction step is always sampling from a proposal distribution, and in most cases the proposal distribution is the motion model; if the proposal is the motion model, then the weight turns into the observation model, which is something we understand and know how to deal with. There are ways to do this with more complex proposal distributions, but this is the standard choice here. Then you reweight the samples, resample, and repeat the process. The resampling step is needed because you can only represent a finite number of samples in your computer. The real art is to design appropriate motion and sensor models — that is where most of the work goes.

If you think about Monte Carlo localization, it is one of the standard localization systems you find out there: efficient, robust, and working in most setups. For most of the indoor platforms that you see — a robot driving around in a museum, or some indoor logistics system — either there is an external tracking system that does the job, or the pose is estimated on the platform, and in I would say 90% of those cases you will find Monte Carlo localization running internally. Some still use a Kalman filter with features or beacons in the environment, but most of the flexible systems today actually use some form of Monte Carlo localization.

With this I'm coming to the end of the lecture. If you want to read more, the book Probabilistic Robotics has one chapter on Monte Carlo localization which is, from my point of view, an extremely good explanation of what is going on and of the motion and observation models you need to take into account. It also goes into the details of the particle filter itself, showing why it approximates the target distribution arbitrarily precisely as you increase the number of samples — we have not provided any proofs here, but you find them there. It is a very nice read because it always starts with an easy explanation of the raw concept and then goes deep in the subsequent chapters. So especially for the particle filter, but also for Monte Carlo localization, this is what I recommend if you want to reread what we have discussed today. With this I'm coming to the end — thank you very much for your attention, and see you next week.
Info
Channel: Cyrill Stachniss
Views: 6,872
Rating: 4.9722223 out of 5
Keywords: robotics, photogrammetry
Id: uYIjB93oAUo
Length: 89min 46sec (5386 seconds)
Published: Fri Apr 03 2020