SLAM-Course - 04 - Extended Kalman Filter (2013/14; Cyrill Stachniss)

Video Statistics and Information

Captions
Welcome to the second part of the course. We are now looking into one specific implementation of the Bayes filter, namely the Kalman filter and the extended Kalman filter, which is a variant of the Kalman filter paradigm. The Kalman filter is probably the most frequently used Bayes filter; it is used in a lot of applications, it was developed around the 1950s, and it has nice properties: for the case that you are working with Gaussian distributions and linear models, you can actually show that it is the optimal estimator, so there is no better way of estimating, which is a very nice property. Of course, in reality nothing is perfectly Gaussian and nothing is perfectly linear, so the Kalman filter may not be the optimal solution to the SLAM problem, but that is something we will experience during this course.

Why are we doing this? Because we want to address the SLAM problem: simultaneous localization and mapping, estimating the pose of the robot and the map of the environment. This is a state estimation problem, so let's look at how state estimation works. We introduced the Bayes filter as a general framework a few minutes ago: we have the prediction step and the correction step, and we will now look into the Kalman filter and see how it actually realizes these two steps. The Kalman filter is one implementation of the Bayes filter, and it requires that your models are linear and your distributions are Gaussian. It really makes these assumptions, and if they are justified, then, as said before, it is the optimal estimator in this case.

Before we dive into the details of the Kalman filter, a very short repetition of what a Gaussian distribution is. The standard equation for the Gaussian distribution has a normalization factor sitting in front and an exponential function; there is the mean mu, which is the mode of the Gaussian, and a covariance matrix Sigma, which tells us about the uncertainty — the higher the values in the covariance matrix, the higher the uncertainty. What actually sits in the exponent is the inverse of the covariance matrix, so its determinant should not be zero, otherwise it would not be invertible — which would mean infinite uncertainty, we know nothing, and we cannot divide by zero, so to say.

In the 1D case, this is the mean, and you can go one, two, three sigma away from it; the area under the distribution from minus three sigma to plus three sigma covers roughly 99.7% of the probability mass, so most of the events fall into this area. You can express the same thing in 2D by an ellipse and in 3D by an ellipsoid. This is the standard Gaussian distribution as you should have seen it in the past already.

If we have a Gaussian distribution over multiple variables — let's say x consists of two parts, x_a and x_b, which could themselves be vectors, for example two 2-dimensional vectors making up a 4-dimensional Gaussian — then the marginal distributions are also Gaussian and the conditionals are also Gaussian. So if I know that p(x) is a Gaussian distribution, then I also know that p(x_a) is a Gaussian and p(x_b) is a Gaussian, and I can compute them by marginalizing out the other variable. The same holds for the conditionals: if p(x) is a Gaussian, then p(x_a | x_b) is again a Gaussian distribution, and the same the other way around, p(x_b | x_a) is also normally distributed.
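For reference, the density he sketches on the slide is the standard multivariate normal, written here in LaTeX with d denoting the dimension of x:

    p(x) \;=\; \frac{1}{\sqrt{(2\pi)^{d}\,\det\Sigma}}\;
    \exp\!\Big(-\tfrac{1}{2}\,(x-\mu)^{\top}\Sigma^{-1}(x-\mu)\Big)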
So whenever I compute a marginal or a conditional — and the same holds for the convolution — the resulting distribution is again a Gaussian. That is important for the Kalman filter: when we manipulate our distributions, as you have seen with the sensor model and the motion model, these operations must keep them Gaussian. Even after we update our Gaussian belief with the motion model or with an observation, the result still needs to be a Gaussian, and therefore these properties matter for us.

In the Gaussian case it turns out that marginalizing out a variable is very, very easy; there is nothing difficult about it. This is p(x), it contains the two variables x_a and x_b, and I have a mean vector and a covariance matrix. If the first n dimensions correspond to x_a and the next m dimensions to x_b, then the first n entries of the mean vector are the mean for a and the remaining m entries the mean for b, and the covariance matrix has the corresponding block structure. The marginal over x_a — the integral over the joint distribution, integrating out x_b — is again normal, as we said, and its mean is just the first block of the mean vector and its covariance just the corresponding block of the covariance matrix. So if I have a high-dimensional Gaussian and I want the marginal over a small number of elements, I just cut out a part of the mean vector and a part of the covariance matrix, and I'm done. It is very simple and can be computed very efficiently, which is great.

Conditioning is unfortunately not that easy. We have exactly the same setup as before, but now we want the conditional distribution p(x_a | x_b), which by definition is the joint p(x_a, x_b) divided by p(x_b). It is again Gaussian, but the mean and the covariance matrix are harder to compute. I don't want to go through the derivation — it is not something you do within five minutes, it takes a little longer. The important thing to notice is the bb block of the covariance matrix, because it appears inverted: to do this operation I need to invert that block. So if I have a high-dimensional Gaussian and want to estimate just a small quantity given that I know the rest, it is a costly operation, because I need to invert a large part of the matrix. This can come up, for example, when estimating the position of the robot given that I know where the landmarks are. You will also find it later in the Kalman filter, where you need to invert parts of such a matrix, and this makes it quite expensive.

You can also see what happens if I basically know nothing about the variable x_b, so its uncertainty is extremely high. If we invert that block, the term basically goes towards zero, and what then remains is just the mean of a that we had before; the other part has no influence. That means if I look at p(x_a | x_b) and I know essentially nothing about x_b, it is more or less the same as p(x_a): if that block is extremely large, one divided by something extremely large is close to zero, so the whole correction term goes away and I am left with the mean of x_a, with x_b having no influence.
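In formulas — this is the standard result he refers to, writing \mathcal{N}(\mu,\Sigma) for a Gaussian with mean \mu and covariance \Sigma and using the block partitioning of the joint:

    \mu = \begin{pmatrix}\mu_a\\ \mu_b\end{pmatrix},\qquad
    \Sigma = \begin{pmatrix}\Sigma_{aa} & \Sigma_{ab}\\ \Sigma_{ba} & \Sigma_{bb}\end{pmatrix}

    p(x_a) = \mathcal{N}\big(\mu_a,\; \Sigma_{aa}\big)

    p(x_a \mid x_b) = \mathcal{N}\big(\mu_a + \Sigma_{ab}\Sigma_{bb}^{-1}(x_b-\mu_b),\;\;
    \Sigma_{aa} - \Sigma_{ab}\Sigma_{bb}^{-1}\Sigma_{ba}\big)

This makes both observations from the lecture explicit: the marginal only needs the corresponding blocks, while the conditional needs \Sigma_{bb} inverted; and as \Sigma_{bb} grows, \Sigma_{bb}^{-1} goes to zero and the conditional approaches the marginal.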
If, on the other hand, x_b does have a strong influence, then that influence enters through exactly these terms, and there is a substantial change in the mean estimate. This is just to give you a little bit of an idea of what is going on; it was a very brief revisit of the properties of the Gaussian distribution, and these are the most important properties used in the Kalman filter.

As I said before, the Kalman filter assumes Gaussian distributions, and it assumes linear models. Linear models means that the motion model and the observation model are linear functions, and that is the way they are represented: the new state x_t is a matrix A_t times the previous state, plus a matrix B_t times the odometry command, plus a term which says, okay, this is noisy — a random noise term expressing that the model is not perfect. So we have two matrices, A_t and B_t, which may change at every point in time t, and they map the previous state and the executed control command to the new state. Because this can be written with matrices, it is a linear function in more than one dimension. The same holds for the observation: the matrix C_t expresses how I obtain my expected observation given that I know the state of the system. Given that I know where the robot is and what the world looks like, I can estimate what I should observe; that is what C_t does — it is a linear mapping from the state space into the observation space.

Yes, please? — Right, I have to specify these matrices; this is knowledge I need to put into the system in order to implement a Kalman filter. I need to describe how the system moves. For example, take a robot that drives on wheels, and say it can only go forward or backward because we are living in a linear, 1D world — no orientation, everything perfectly linear. Then the matrix A_t expresses how the state of the system changes when nothing is done, when no command is executed — how does it change by itself? Typically a robot on wheels only moves when the wheels are turning, otherwise nothing changes, so in this case A_t would typically be an identity matrix. That is different if you have, say, a helicopter sitting in wind: even if you execute no command, it will be pushed a little bit away from wherever you want it to be. Or you have systems that keep moving even if you apply nothing — say an object which, even with no command, continues to keep its velocity, because of its dynamics or an underlying controller that enforces it. So there are many reasons you can imagine why a system changes its state even though no command is applied, and that is what A_t captures.
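To make the "keeps its velocity" case concrete — this small example is my own, not from the slides — stack position p and velocity v into the state; then with time step \Delta t the free-motion matrix is

    x_t = \begin{pmatrix} p_t \\ v_t \end{pmatrix},\qquad
    A_t = \begin{pmatrix} 1 & \Delta t \\ 0 & 1 \end{pmatrix},\qquad
    A_t\,x_{t-1} = \begin{pmatrix} p_{t-1} + \Delta t\, v_{t-1} \\ v_{t-1} \end{pmatrix},

so the position keeps drifting forward even when no command is applied.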
The second matrix, B_t, says how a command is mapped into a change of state. If I say, okay, I execute "go one meter per second forward for one second", that is a command, and B_t represents how this changes my state. This is where knowledge about the physics of the system sits: where the wheels are, what happens if I apply a certain velocity for a certain amount of time or turn the motors — all the relevant parameters sit in this matrix B_t. Okay, any further questions so far? Yes, please? — The question is how this relates to the Bayes filter and the two models we need there. These two terms are exactly the motion model and the observation model; they are the free parameters you have when you implement your system: you need to specify your motion model and your observation model. What we presented here is the mean of the motion model and the mean of the observation model — there is still an uncertainty associated with each, because motion is never perfect, but this is the mean transformation for the motion and this the mean for the observation. And yes, that's true: the equations we will see very soon make it pretty hard to see the connection to the probability distributions we had before. The reason is that in the Gaussian case I only need to manipulate a mean and a covariance, and I don't need to write out all the Gaussians explicitly — it is much easier the way I will present it very soon. But you are completely right: these are the free parameters that go into the equations which let us update our state in a recursive manner. Any further questions?

Okay. So again, this matrix A_t is, as said before, an n-by-n matrix which tells us how the state of the system changes if no command, no control, is executed. Then we have the matrix B_t, which is an n-by-l matrix, where n is the dimensionality of the state and l the dimensionality of the odometry command, and it describes how the control u_t, which is l-dimensional, changes the state from x_{t-1} to x_t. In reality B is often nonlinear, but here it has to be linear, otherwise the Kalman filter doesn't work, so we will have to fix that in some way later. And then we have the matrix C_t, a k-by-n matrix, where k is the dimensionality of the observation and n the dimensionality of the state, and it describes how we map from a world state to an observation. You can read it as: what should I expect to observe given the world is in the current state? I just take the current state x_t, multiply C_t times x_t, and I get the expected observation. It is like: I'm standing here, I know the wall is five meters away, so I should measure five meters, plus some uncertainty associated with my sensor.
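Putting the pieces together, the linear models described so far are, with \varepsilon_t and \delta_t the zero-mean Gaussian noise terms (their covariances are named a moment later in the lecture):

    x_t = A_t\,x_{t-1} + B_t\,u_t + \varepsilon_t,
    \qquad A_t \in \mathbb{R}^{n\times n},\;\; B_t \in \mathbb{R}^{n\times l}

    z_t = C_t\,x_t + \delta_t,
    \qquad C_t \in \mathbb{R}^{k\times n}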
Yes, please? — If I don't know anything about the world, the belief is probably a Gaussian with, say, a zero mean and a more or less infinite covariance matrix, so everything is equally likely; as soon as I start collecting information, this belief becomes more peaked, and then I can make better predictions. That is a key ingredient of the Kalman filter. And yes, there is a t index here, so you can recompute these matrices as you learn more. The current state of the system also has a covariance matrix associated with it, which tells you how certain you are, and this is taken into account, as we will see very soon. Here we only described the mapping of the mean, but shortly we will see how the covariance matrix is propagated as well, to account for the uncertainty we have. And if you say there are ten landmarks in the environment but you have no idea where they are, one thing you can do is set their means to zero and set the corresponding covariance entries to something more or less infinite; that basically means a uniform distribution over where they are in space, and the more I observe the environment, the more peaked the belief will get.

Okay, and I have two more terms: these are random variables representing the noise in the motion and in the observations, and their covariances are the matrices R_t and Q_t — R_t for the control and Q_t for the observation. A word of warning: depending on which book you use, Q and R may be swapped; the common convention is actually the other way around, but I kept the notation of the Probabilistic Robotics book, which uses this one. So whenever you look at a different resource, don't get confused — you may just need to swap Q and R and their meaning. It is simply the control noise and the observation noise; these are just different standard notations, so to say.

Okay, so we have our matrices to express the motion and observation model; now let's put them into our Gaussian distribution, because we said we want a Gaussian motion model and a Gaussian observation model. How does the motion model look under the Gaussian noise assumption with our linear motion? We need to write down a Gaussian: given that I was in x_{t-1}, and given that I executed u_t, and given my matrices A and B, I want to specify the Gaussian distribution that describes p(x_t | x_{t-1}, u_t). So we need something like: p(x_t | x_{t-1}, u_t) equals the normalization factor of the Gaussian, times exp of minus one half, and then something goes in here. What do I need to put in? You have seen everything you need in order to specify it — or you should have. Which variable are we investigating? It is x_t; it plays the role of the x in the Gaussian. And what is the mean? We can do better than guessing: you know where the system has been before and which command was executed — where should the system end up, in which state? So the mean is A_t x_{t-1} + B_t u_t. Let's write the full quadratic form: x_t minus A_t x_{t-1} minus B_t u_t, transposed, then the same term again on the other side; and what goes in between? Not the observation noise — this is a motion model, so we put the motion noise R_t in here, inverted. That means, given I know where the system was before and what was executed, I can compute the likelihood of every possible pose using exactly this equation, and the part inside is the linear model — this is what the Kalman filter assumes. So we end up with exactly this term.
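Written out, the Gaussian motion model from the blackboard is

    p(x_t \mid x_{t-1}, u_t) \;=\; \det(2\pi R_t)^{-\frac{1}{2}}
    \exp\!\Big(-\tfrac{1}{2}\,(x_t - A_t x_{t-1} - B_t u_t)^{\top} R_t^{-1}\,(x_t - A_t x_{t-1} - B_t u_t)\Big)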
Does everything look fine? Yes — that is exactly what we have. Okay, now that you are trained in how to do that, the observation model: it is proportional to exp of minus one half times (z_t minus C_t x_t), transposed, times Q_t inverted, times (z_t minus C_t x_t) — and there we go, exactly what we should get. Again, this uses the assumption of a linear observation model.

Now we are coming closer to what you were mentioning: this lets us describe these probability distributions as Gaussians, which we can then plug into our Bayes filter. We can take this, put it into the big equation with the integral, and get really, really huge terms. We now have our previous belief, which we assume to be Gaussian — it is Gaussian in the beginning, and then it stays Gaussian — we have the motion model we described, and we have the observation model we described. Perfect, we are done — except that actually showing the result is always Gaussian, and what the mean and covariance parameters look like, is really non-trivial. You can look at the Probabilistic Robotics book, section 3.2.4, for the details. Here I just note that we know how to specify all these distributions, they are all Gaussian, so the result will be Gaussian, and the algorithm to compute this Gaussian is the Kalman filter algorithm: lines 2 and 3 are the prediction step, lines 4 to 6 the correction step. I completely agree that it is hard to see the correspondence between this slide and the previous one; the point is that if you write in the Gaussians with all the models we specified on the blackboard, and turn the crank to compute the new Gaussian, you will get exactly this mean and this covariance estimate. It is just a more compact way of writing it, and I will discuss it in a bit more detail.

So what do we see here? The bar over the symbols indicates a predicted quantity, so these are the mean and the covariance matrix after we execute our motion. As we said, the predicted mean is just A_t times the previous mean plus B_t times u_t — exactly our linear model from before. And for the covariance: assume for a moment that A_t is the identity, so nothing external changes the state; then the new uncertainty is the old uncertainty plus the uncertainty we add through the motion. So if I have some uncertainty and execute a new motion command, the uncertainty increases. This equation also tells you that motion always adds uncertainty — it makes the system more uncertain, because you add the noise term; if you had no motion noise and moved perfectly it would stay the same, but that never happens in reality. And why is this A_t here? It can be used, for example, for scaling your system, if you have a system where the scale grows over time or something like that; it captures how the state changes without controls, and it can also be a rotation — if the world is rotating, you can put a rotation matrix here to rotate the state.

Okay, so that is the prediction step; that wasn't too complicated.
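For reference, the algorithm on the slide, written out as in the Probabilistic Robotics book which the lecture follows — the first two lines are the prediction step, the last three the correction step:

    \bar{\mu}_t = A_t\,\mu_{t-1} + B_t\,u_t
    \bar{\Sigma}_t = A_t\,\Sigma_{t-1}\,A_t^{\top} + R_t
    K_t = \bar{\Sigma}_t C_t^{\top}\,\big(C_t \bar{\Sigma}_t C_t^{\top} + Q_t\big)^{-1}
    \mu_t = \bar{\mu}_t + K_t\,\big(z_t - C_t \bar{\mu}_t\big)
    \Sigma_t = \big(I - K_t C_t\big)\,\bar{\Sigma}_t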
Now let's look at the correction step. If you go back to the Bayes filter equation, you can see that here we multiply two Gaussians, and the product of two Gaussian distributions is again a Gaussian whose mean is a weighted mean of the individual means, weighted by the uncertainties. So if you have one Gaussian which is very certain and one which is very uncertain, the product will be a Gaussian very close to the certain one; and if both have exactly the same shape but their means sit at different positions, you get a Gaussian exactly in the middle. This is exactly what happens down here. This K_t is the so-called Kalman gain, and the Kalman gain trades off how certain I am about the observation versus how certain I am about the motion. The formula looks a little complicated, but I would like to give you at least a short intuition that this really is a weighting, something like a weighted mean.

One thing that can happen: let's say we have a perfect sensor. What is Q_t then? Right — Q_t is basically the zero matrix: if I measure something, I know it perfectly. If we plug that in, what happens to K_t? It is the predicted covariance times C transposed — let's drop the time index to keep it short — times (C Sigma-bar C transposed plus Q) inverted, and Q is zero, so it drops out. Rewriting the inverse gives C-transposed-inverse times Sigma-bar-inverse times C-inverse; C transposed times its inverse gives the identity, Sigma-bar times its inverse gives the identity, so the only thing that remains is C inverse. If that is the case, I can plug it into equation 5 to compute the mean: the mean is the predicted mean plus the Kalman gain — which we just said is C inverse — times (z_t minus C times the predicted mean). So the equation simplifies to: predicted mean plus C-inverse z_t minus C-inverse C times the predicted mean; the predicted means cancel, and what remains is just C-inverse times z_t. C-inverse is the inverse observation function: it no longer maps from the state to the observations but from the observations to the state. So the new mean is simply the observation mapped into state space. That is perfectly fine, because we said that with a perfect sensor, once we measure, we know what the world looks like — all the prediction we did before is completely erased and only the observation remains.

We can do exactly the same for a sensor that provides no information at all. If the sensor provides no information, Q_t should be more or less infinite; then this sum also becomes something like infinity, we take infinity to the power of minus one, and the Kalman gain basically becomes zero. If the Kalman gain is zero, the new mean is just the predicted mean — the sensor information contributed nothing. So what this filter basically does is compute a weighted mean between the prediction and the observation.

Okay, a short example which shows exactly that. This is my prediction — what the system should look like. Now I get a measurement, the green curve. I merge both of them, and what I get out is the blue one. You can see that the blue one is closer to the green one than to the red one, from the mean point of view, because I was more certain about my measurement — it has less variance than the red curve. So it is basically just a weighted mean.
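To make the weighted-mean behaviour concrete, here is a minimal numerical sketch in Python; the 1D setup and all the numbers are made up for illustration, not taken from the lecture:

    # One cycle of a 1D Kalman filter (scalar state; A = B = C = 1).
    mu, sigma2 = 5.0, 1.0      # previous belief: mean and variance
    u, r2      = 10.0, 2.0     # control ("drive 10 m") and motion noise variance
    z, q2      = 14.0, 0.5     # range measurement and observation noise variance

    # Prediction: the motion shifts the mean and always adds uncertainty.
    mu_bar     = mu + u                  # A*mu + B*u
    sigma2_bar = sigma2 + r2             # A*Sigma*A' + R

    # Correction: the Kalman gain weighs prediction against measurement.
    k          = sigma2_bar / (sigma2_bar + q2)
    mu_new     = mu_bar + k * (z - mu_bar)
    sigma2_new = (1.0 - k) * sigma2_bar

    print(mu_bar, sigma2_bar)    # predicted:  15.0   3.0
    print(mu_new, sigma2_new)    # corrected: ~14.14 ~0.43

The corrected mean lands much closer to the measurement than to the prediction because the measurement variance is smaller — exactly the weighted-mean effect shown in the plots.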
I can continue this in the next step. This was the previous situation — it is now my current state estimate. I make a prediction — say I move forward from about 7 meters to 22 meters, so roughly 15 meters — and this is my new prediction; I become more uncertain when executing the action. Then I get a new measurement, and the result is again a weighted mean of both; here the two are more or less equally certain, so the new mean lies somewhere in the middle, and then we can continue this process.

Okay, so what I have shown now is that the Kalman filter, under the assumption of linear models and Gaussian distributions, gives us a way of computing the new mean and the new covariance matrix based on the previous ones, the command, and the observation, and that it is basically a weighted mean: it weighs how certain I am about the observation against how certain I am about the motion, and combines the two. That's all it is.

Now let's meet reality. The Kalman filter assumed linear models and Gaussian distributions. Question: what happens if this is not the case? It turns out that for most realistic situations it is not, because in most realistic robotic scenarios you have an orientation somewhere — the system looks in some direction, or takes a measurement in a certain direction — and this involves angles, which directly leads to sine and cosine functions, which are nonlinear, and that leads to problems.

So what happens if we drop our linearity assumption? Say we have some nonlinear function g which maps the previous state and the motion command to the new state, plus some Gaussian noise, and the same for the observation: a nonlinear observation function h which maps from the state into the observation space. What happens if we do that and simply ignore that the models are no longer linear? Let's see.

This plot may be a little hard to read in the beginning, but it is actually very useful. What you see down here is the current estimate, the current Gaussian distribution. If you transform it through a linear function — this line here — then this is what the resulting distribution looks like: you take every value here, map it through the linear function, and get the corresponding value there. If the function is linear, the result stays Gaussian, and that is important for the Kalman filter, because we want to do one step after the other and everything has to remain Gaussian; if something is no longer Gaussian, we break the assumptions.

Now look at what happens with a nonlinear function. This function is nonlinear — it is not even particularly ugly — and we map our Gaussian distribution through it; this is what we end up with, and however good your eyesight is, this is not a Gaussian distribution. That is the problem: it is definitely not Gaussian. So if we just apply our nonlinear function, we will simply not end up with a Gaussian distribution, and then we cannot apply our Kalman filter anymore. The nonlinearity destroys the Gaussian, and it doesn't even make much sense to compute a mean and a variance if we are that far away from a Gaussian distribution.
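In symbols, the nonlinear models replacing the matrices A_t, B_t and C_t are, with the same zero-mean Gaussian noise terms as before:

    x_t = g(u_t, x_{t-1}) + \varepsilon_t,
    \qquad
    z_t = h(x_t) + \delta_t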
Okay, what can we do to resolve that? Nonlinear functions lead to non-Gaussian distributions, which is exactly what we cannot have. How can we fix it? Think about a dirty trick — how would you fix it? Yes: we could just linearize the function — ignore that it is nonlinear, take the best estimate we have, and linearize around that best estimate. That is exactly what we do: a local linearization. This is everything the extended Kalman filter does: it fixes the problem of nonlinear functions by linearizing them, and then basically does exactly the same thing as the Kalman filter.

Okay, how do we linearize the function? We have our function g which maps the previous state x_{t-1} and a command u_t to the new state; this is nonlinear, so we need to linearize it. We evaluate the function at the best estimate we currently have — mu_{t-1}, the previous mean, is the linearization point — then compute the partial derivative of g with respect to x_{t-1}, and multiply it by how far x_{t-1} is from that linearization point. So it is: evaluate at the linearization point, plus the Jacobian times the distance from the linearization point — a first-order Taylor expansion. We do exactly the same for the correction step, just with the function h. These matrices G_t and H_t that hold the derivatives are Jacobians.

Who knows what a Jacobian is? Okay, just a very brief revisit. The Jacobian matrix is in general a non-square matrix, m by n, for a vector-valued function: for our function g with m components, the Jacobian is the matrix of all partial derivatives — the first component of the function differentiated with respect to the first variable, then the first component with respect to the second variable, and so on, row by row for g_1, g_2, g_3 and so forth. So it is a matrix containing all partial derivatives of the individual dimensions of the vector-valued function with respect to all the variables involved. It is the generalization of the 1D derivative: in 1D you just have dg/dx, which you all know, and this is the generalization to the higher-dimensional case. If you visualize it: say the green curve over here is your function; the linearization gives you a plane, a linear approximation at that point. The same for any other function, like the parabola over here: you take the linearization point and get a plane in the higher-dimensional space — it is like going from a tangent line to a tangent plane. So these Jacobians are linear functions: matrices filled with numbers, fixed for one linearization point, with no variables involved anymore, and therefore linear. Of course, this only holds for that single linearization point; for a different linearization point I need to recompute it.
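The first-order Taylor expansions on the slide are

    g(u_t, x_{t-1}) \;\approx\; g(u_t, \mu_{t-1}) + G_t\,(x_{t-1} - \mu_{t-1}),
    \qquad G_t = \frac{\partial g(u_t, \mu_{t-1})}{\partial x_{t-1}}

    h(x_t) \;\approx\; h(\bar{\mu}_t) + H_t\,(x_t - \bar{\mu}_t),
    \qquad H_t = \frac{\partial h(\bar{\mu}_t)}{\partial x_t}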
Okay. So we had the case before where our nonlinear function screws things up. What we do now is take our best estimate — this guy over here — fit a linear function to it, our Taylor approximation, map the Gaussian through that, and we get this red curve over here: again a Gaussian distribution. It is the best ad-hoc fix that pops into our mind. And yes, we linearize at the mean of our current estimate — otherwise the linearized function would be different for every value; if the mean sat somewhere else, say over here, we would get a completely different linearization. So we take the current mean, and that is our linearization point.

What you can also see here: if we mapped the distribution through the true nonlinear function and approximated the result by a Gaussian, we would get the blue curve; with the linearization we get the red one, and there is a small displacement between them. If the Gaussian down here gets wider — high uncertainty — the difference between the red and the blue curve grows; if the uncertainty gets smaller, the difference shrinks. That is the key observation: since we linearize the function at one point, the approximation is pretty good as long as all the relevant values are not too far from the linearization point; the farther away they are, the worse it becomes.

If you look at the linearized motion and observation models, we have exactly the same form as before with A_t x_{t-1} + B_t u_t, just with the linearized model plugged into the Gaussian: the nonlinear function evaluated at the linearization point — the previous mean — plus the Jacobian times how far we are from that linearization point. The same for the observation model: the observation function evaluated at the predicted mean, plus the Jacobian of the observation function times the distance from the linearization point. Then I again have linear functions, and I can apply the standard Kalman filter machinery exactly as before.

So if you compare the Kalman filter and the extended Kalman filter, the only things that change are: this term here is no longer A x plus B u but the nonlinear function g; the same down here, what was C_t times x_t is now the nonlinear function h; and secondly, we have to replace the matrix A by G and the matrix C by H down here — the linear approximations of what we had before. That is all that changed. You can say the extended Kalman filter is exactly the Kalman filter, except that it takes the nonlinear functions, linearizes them, and then does all the steps that are needed — the problem is fixed just by plain, stupid linearization. We simply linearize our nonlinear functions, use the linear approximation, and execute the Kalman filter.
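Side by side with the linear algorithm above, the extended Kalman filter equations (again as in the Probabilistic Robotics book) become

    \bar{\mu}_t = g(u_t, \mu_{t-1})
    \bar{\Sigma}_t = G_t\,\Sigma_{t-1}\,G_t^{\top} + R_t
    K_t = \bar{\Sigma}_t H_t^{\top}\,\big(H_t \bar{\Sigma}_t H_t^{\top} + Q_t\big)^{-1}
    \mu_t = \bar{\mu}_t + K_t\,\big(z_t - h(\bar{\mu}_t)\big)
    \Sigma_t = \big(I - K_t H_t\big)\,\bar{\Sigma}_t

The structure is identical; only A_t x + B_t u and C_t x were replaced by g and h, and A_t, C_t by the Jacobians G_t, H_t.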
The only thing is that these functions G and H of course need to be recomputed at every point in time, because the linearization point changes: as we move on, we may have a different linearization point, and then the first derivative changes, so we need to rebuild those matrices at every step. And if we actually have the linear case, the linearization causes no problem at all: G simply turns into A and H turns into C, and we get exactly the same setup. So if g and h are linear functions, the EKF gives the same result the Kalman filter would.

Okay, time to wrap up. What have we learned today? You should have understood the Kalman filter in the sense of what it means in terms of probability distributions — how it follows directly from the Bayes filter together with the rules for multiplying two Gaussians, conditioning, and marginalization. If we put all of that together, we end up with the algorithm shown here. I haven't derived the algorithm by hand — it is a little involved, but you can do it; there is course material online with the derivation if you really want to dive into the details, although I'm not sure how much that helps you understand the properties. The most important property is that it really computes a weighted mean between the prediction step, which comes from the motion model, and the correction step, which comes from the observation model. The problem of the Kalman filter is that it requires linear functions, so we introduced the extended Kalman filter as a fix in order to deal with nonlinear functions, and the trick is simply to linearize each function around the current best estimate, which is the mean from the previous point in time, and then compute the solution with the linearized model. That actually works well in practice for, let's say, moderate nonlinearities: with a really odd nonlinear function it will screw up quite quickly, but if the nonlinearity is not too bad it works quite well — together with the requirement that the uncertainty should not be too huge, because, as the illustration showed, the bigger the variance of the Gaussian, the worse the approximation gets. So with moderate uncertainties and moderately nonlinear functions, this is a system that works quite effectively.

In terms of complexity, two terms matter. k is the dimensionality of the observation, and the first term is roughly k to the power of 2.4. Why this odd number? Because it results from a matrix inversion: the matrix we need to invert here has dimension k, and the asymptotically fastest known matrix inversion is about k to the power of 2.376; done in the straightforward way it would be cubic. The second term is quadratic in the number of state dimensions, because the matrices we maintain are square: the covariance matrix of the state is n by n, so I need to touch at least n-squared values just to update it. That is where the complexity comes from: if you have either a huge number of variables to estimate — this term — or a very large observation vector — that term — it can get quite costly.
Just keep that in mind; but if you have, let's say, a small number of state dimensions and a small number of dimensions in your observation space, it is actually pretty efficient.

Okay, again the link to the literature. If you want to know more, the Probabilistic Robotics book, chapter 3, goes into the details of the Kalman filter; I tried to keep the notation here exactly as in the book, so if you say "this one step I missed, I didn't understand it from the explanation given", you can revisit the book — or ask me, whatever you prefer. There is also a paper on the manipulation of the multivariate Gaussian density, which tells you in detail how to do the conditioning and the marginals and, putting all of that together, derives the Kalman filter in the appendix more or less as a by-product; if you really want to go into the details of how to derive the Kalman filter, that is a pretty good paper to look into, and it is also on the website. And there is a more easy-to-read general tutorial on the Kalman filter by Welch and Bishop, which is also a good read if you want to know more.

That's it from my side. Are there any questions? Okay, then thank you very much, and we meet tomorrow for the exercise — so there will be no lecture tomorrow, only the tutorial — and there are also new sheets here for the next homework assignment; it is a rather short recap of Bayes rule and related things, so it shouldn't be too dramatic. Then see you next week. Thank you very much.
Info
Channel: Cyrill Stachniss
Views: 103,549
Rating: 4.9289942 out of 5
Keywords: robotics
Id: DE6Jn2cB4J4
Length: 49min 5sec (2945 seconds)
Published: Mon Oct 28 2013