Visual Odometry

Captions
So I'm delighted to introduce Rob Mahony as our speaker this morning. Rob is from the Australian National University in Canberra, and he's the originator of one of the core algorithms in ArduPilot, the DCM algorithm for attitude estimation. He's one of the fundamental researchers in drones from way back. So Rob, over to you.

Cool. Let's just see whether that screen is shared nicely for everyone. Okay, great. Thanks, Tridge, for the intro; let's see if I can get my computer working. Because I'm not so much part of this community, I thought I'd just start by saying who I am, as Tridge said. I'm a professor at ANU, I lead a robotics group here, and I've been working in aerial robotics technology for about 22 years now, since 1998, trying to make a helicopter fly. As Tridge said, some of the work we did back in 2005 or so was pretty fundamental in getting the DCM flying. (Sorry, did somebody say something? Oh, that's the volume from the video, sorry about that.) That's a DCM running in the little video there that people are having a look at.

We did some of the early modelling work on quadrotor vehicles with one of my students, Paul Pounds, who is still around in academia, and we've done a little bit of recent work, which might be interesting to people, on modelling motor control: being able to model the thrust directly at the motor, now that we're getting direct feedback from the motors and the ESCs. It's quite nice because if you get an updraft or a downdraft, the motor servo-controls the thrust on the vehicle, rather than just servo-controlling the RPM of the rotors, by measuring the electrical power into the motor. We did a little bit of work on optic-flow-based methods, which hasn't really crossed over into the hobby community, and we've done some work on GPS delay with Paul Riseborough, so some of the way GPS delay is handled in EKF2 and EKF3, if I understand correctly, is based on some of our work. And I've got interested recently in visual odometry, and that's what we wanted to talk a little bit about today.

Let me get used to everything running on this computer; if I press this I should be able to start. This is just a demo of some work by Pieter van Goor, my student — he'll talk a little later and show a demo. This is your classic visual odometry problem: you're using a camera, in this case just on a phone, and you're looking at an object. It's seeing points, which you can see in the top right, and from measuring those points and looking at the way they move, it's recreating the actual trajectory of the robot moving through space. If you think about this from the point of view of scale: here we're quite small and quite close to what we're looking at, but if you imagine multiplying all of the scales by ten, that would be very similar to a vehicle flying over a relatively flat surface. And that's, I guess, the algorithm that we'll at least show you today.
What I thought is worth talking about today — and I don't know, I'm not so much a part of this community, but it's certainly super exciting to be part of these conversations — is, rather than talking in detail about the algorithms, to give a bit of a high-level view of what visual odometry is, what some of the challenges are, why it's perhaps not obvious that we can take the research results that are out there, put them onto ArduPilot and expect things to work, and what we might need to do to make a really effective visual odometry system for the open-source community. Hopefully I can cover a few of those bits and pieces, then Peter and Ryan can show a few results, and I saw Jim Redford is here — he's the guy who's done the development work on the RealSense T265, and I know that's one of the systems being used pretty heavily in the community, so we can ask Jim a bunch of questions.

So, basically, odometry is the ability to infer the relative movement of a vehicle from its sensors, and visual odometry is when you use a vision sensor. The point here is that it's a relative measure: the key idea is that you're measuring something relative to some origin, and that's a little different from having a GPS system that gives you a world origin. Exactly what origin you are measuring with respect to, and how you choose that origin, is one of the key questions that comes up when you try to use this sort of technology, and I'll talk a little bit about that.

Just so we're all on the same page, I thought I'd give some really fundamental definitions of what we're doing. We're interested in some sort of flying robot, and I'm going to describe its position in space — its pose — by two parts. One part is a displacement, a position, and that will be relative, as I said, to some origin. In general, when you're doing visual odometry you aren't given that origin in advance; you just pick it, so it's an arbitrary choice, and the displacement is then along the three axes of that frame. The other part is an attitude, and I don't really mind how the attitude is expressed, but the mathematics is easiest if it's expressed as a direction cosine matrix — a rotation matrix. Those two things combined are what we call the pose.

It's an interesting question why a lot of the filters for aerial vehicles are built so that you separately filter for position and separately filter for attitude — they might be coupled, but you're really thinking of them as two separate filters. The reason why, in SLAM or visual odometry, you really need to treat these as one combined object is that when the robot moves, the way you observe the world moves as a combined function of the two variables. So you really need to think about pose, rather than separately doing a displacement filter and an attitude filter, when you start doing visual odometry.
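In symbols (a sketch with assumed notation, not taken from the slides), the pose is the pair

    P = (R, x) \in SO(3) \times \mathbb{R}^3, \qquad R \ \text{a rotation matrix (DCM)}, \quad x \ \text{the displacement from the chosen origin},

and a point p in the environment appears in the body-fixed frame as q = R^T (p - x), which depends jointly on R and x. That joint dependence is exactly why the two parts get filtered together as a single pose rather than in separate position and attitude filters.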
All right. So the idea is that you're going to put a camera onto some sort of vehicle, and there are two sorts of measurements you're going to derive from it. One of them is a feature: some point in the environment, like the base of a tree, a branch, the corner of a road or the edge of a building. The actual feature points have some sort of high contrast — usually what we call a corner — or they're a patch of pixels, something recognizable in the image; I'll show you a picture on the next slide. You can also get other things: the horizon is another very common thing people pull out of images, as are bearings — if you're looking at the sky and can see the sun, that's pretty easy to pick out. And let's not forget that if we've got an IMU on board, we can start doing things like gravitational-direction estimation; we can have attitude filters running just from the IMU, and they can be coupled into the filter as well.

To give a bit of an idea of what we're doing: this is some work I'm doing with my student Pieter van Goor, and this is actually from one of Tridge's Parrot Disco aircraft flying over Spring Valley Farm, which is one of the ANU field robotics sites. You can see what the features look like: they're pretty basic, tiny little image patches. If you look at a spot where there's a bit of an edge — that feature there that I've just circled, for instance, is a bit of shadow over a piece of grass, so you can imagine a sort of dark corner with a light u-shape around it — that was clearly enough for the image algorithm to match between the previous frame and the present frame. Here we only see the latest frame; the actual feature that's identified, I believe, is the red dot, and it came from where the yellow line starts. So what we're seeing is feature matches across two frames, but we're only visualizing them on one frame.

You can also see one of the characteristics of what you're going to get: an awful lot of features on the horizon. Bright edges like that are very easy to pick out in an image, but they're not much use — that horizon is a good five or six kilometres away, and it doesn't really matter what the plane does; it's not too bad as a bearing, as a direction, but it's not very useful for measuring any sort of displacement. The stuff that's really useful is the stuff down here, close to the vehicle, because it tends to move the most when the vehicle moves. Here we're not doing too badly, because we've got quite a few matches reasonably close to the vehicle, but you can also see some of the difficulties: there are false matches over here, where the algorithm has failed to pick the actual translations — and we've done quite a bit of post-processing here, if I remember rightly; Peter can tell me more in a moment, but we've cleaned up as many of the false matches as we can. You can also see a few false matches around the edges of the image, where you get lens distortion, so your motion models start to fail — all of those sorts of things cause you problems. The other thing you can see is that we've got quite a lot of features here, somewhere between 100 and 200, and that's a characteristic of what we're trying to do: get a density of features that gives you robustness to mismatches. That's something we'll talk a little bit more about.
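To put that horizon point in symbols (a standard bearing-kinematics sketch with assumed notation, not an equation from the slides): for a tracked direction y_i = q_i / ||q_i|| with range ||q_i||, and body-frame linear velocity v and angular velocity omega,

    \dot{y}_i \;=\; -\,\omega \times y_i \;-\; \frac{1}{\lVert q_i \rVert}\,\bigl(I - y_i y_i^{\top}\bigr)\, v .

The translational term is scaled by one over the range, so a horizon feature kilometres away barely moves in the image when the vehicle translates — it mostly constrains attitude — while the nearby ground features carry almost all of the displacement information.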
Okay. So each one of those features, each one of those direction measurements, is associated with a point in space, and the way we typically handle that is shown down here in the corner: you've got some arbitrary frame of reference that you've picked, and you represent the pose of the robot — its attitude and position — with respect to that. Then you separately represent points in the environment with respect to the same frame of reference. You collect together all of the points in the environment and the pose of the robot, unify them into one framework, and use that to measure how the position of the robot is changing.

So the state of your system — what goes into the state of this filter — is a pose and a set of points in the environment, all measured relative to some arbitrary frame of reference. Then there's what you actually measure. The point about a camera is that you don't measure the actual position of a point: the best you could ever do with an onboard sensor is measure its body-fixed-frame position, and with a camera all you can measure is the direction. You can see in this equation that you take the body-fixed-frame point and normalize it to a direction — you can think of that as a point on the sphere, or on the image — and that direction is the measurement you get. That means you lose depth information in the measurement; it has to be recovered by triangulation.

All right, so visual odometry and simultaneous localization and mapping (SLAM) are essentially very closely related, and the "simultaneous" part is the idea that you're doing both mapping and localization at the same time. Think about mapping: you have an environment you don't know, and you're finding these features out in the environment, and what we'd like to do is build a map so that we know these points p_i relative to some inertial frame, wherever I happen to choose my reference frame. Now, if I know the p_i, and I move my robot to a new position and measure the same points from the robot, then I can use the matches between the environment points to infer the translation between the previous robot position and my present robot position — and that really is the odometry, or in this case the localization, the way it's written there. Similarly, if I knew the odometry — if I had GPS, or one of Paul's EKFs working perfectly — then I could use the offset of the motion to triangulate points in the environment and build a model of my environment. The difficulty is that these two problems are very closely linked: if you know the map you can derive the robot pose, and if you know the robot pose you can derive the map, but combining the two is a simultaneous problem — a chicken-and-egg problem — and that's where a lot of the complexity of the mathematics in visual odometry comes from. I wanted to talk about this a little more because I think it's a really important thing for ArduPilot to understand.
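Writing that out in one place (again a sketch with assumed notation): the filter state is the pose together with the landmark points, and each camera measurement is only a direction,

    \text{state} \;=\; \bigl(R,\; x,\; p_1, \ldots, p_n\bigr), \qquad
    y_i \;=\; \frac{R^{\top}(p_i - x)}{\bigl\lVert R^{\top}(p_i - x) \bigr\rVert},

so the depth ||R^T(p_i - x)|| never appears in the measurement itself and has to be recovered by triangulating across the vehicle's motion.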
So if we start by imagining a reference frame, and we've represented the robot with respect to that reference frame, there's this thing called gauge invariance, which is sort of the killer that comes out of this same problem. There's nothing unique about the reference frame. If you're just doing visual odometry, there's no god up there handing you the correct reference frame. If you've got a GPS turned on, you can reference everything back into the north-east-down coordinates of the GPS, but if you don't, you could make any arbitrary change of reference frame you wanted: you can put that reference frame anywhere you like, and by transforming the variables in your filter you can do that online and change references however you want.

In particular, one of the things you could do is simply change the reference frame all the time so that it sits exactly at the origin of the robot. If my reference equals my robot pose, then my robot pose is always the identity attitude and zero displacement, and that is perfectly viable within the formulation of SLAM — there's nothing at all stopping you re-expressing the coordinates in that frame all the time. What you get then is the egocentric, or body-fixed-frame, representation of the environment. The point I wanted to make is that this is part of what a SLAM solution is: a SLAM solution is entirely compatible with this representation, and yet, from the point of view of visual odometry, this piece of information is completely useless — it doesn't give you any useful information about where the robot is. That, in a sense, is part of what really underlies the challenge of getting these systems running in an open-source community.

So in addition to running classical SLAM-type algorithms, we need to ask how we can choose the reference in real time on these sorts of aerial robotic systems. If we're coupled up with the EKF running, and it has some reasonable input into it, there's no reason at all why we can't couple the solution of the EKF back into a VO algorithm, so that we never get the resets we were talking about yesterday, when I think Randy was asking how we couple it. Of course, to do that we need the VO code running in real time, because that coupling would be an innovation that goes into the VO code. There are also some questions around whether we can use keyframes and global optimization — I won't talk about that. The final point is that the situation where you probably can solve this reasonably well is where you get what we call loop closure, or critically, revisibility of the same landmarks. If, for instance, you're using visual odometry during a takeoff scenario — you take off and then loiter over the same environment, continuing to see the same points — then we've got a chance of solving this, of locking down a fixed reference in a sensible way. Failure to fix the reference is where you get drift: when we talk about drifting visual odometry solutions, really what's happening is that the reference frame is moving around in space.
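A quick numerical illustration of that gauge invariance (a toy numpy sketch; none of this is from Rob's code): re-expressing the robot pose and all of the landmarks in a different, equally arbitrary reference frame leaves every camera bearing unchanged, so the measurements can never tell you which reference frame is the "right" one.

    import numpy as np

    def rot(axis, angle):
        # Rotation matrix about a unit axis (Rodrigues' formula).
        axis = np.asarray(axis, float) / np.linalg.norm(axis)
        K = np.array([[0, -axis[2], axis[1]],
                      [axis[2], 0, -axis[0]],
                      [-axis[1], axis[0], 0]])
        return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)

    def bearings(R, x, points):
        # Body-frame bearings of world points seen from pose (R, x):
        # rows of q are R^T (p_i - x), normalized to unit directions.
        q = (points - x) @ R
        return q / np.linalg.norm(q, axis=1, keepdims=True)

    rng = np.random.default_rng(0)
    points = rng.uniform(-10, 10, size=(50, 3))          # landmarks in the arbitrary frame
    R, x = rot([0.2, 0.3, 0.9], 0.7), np.array([1.0, -2.0, 3.0])

    # Re-express everything in a different arbitrary reference frame (S, s).
    S, s = rot([1.0, 0.0, 0.5], -1.3), np.array([5.0, 4.0, -1.0])
    R2, x2, points2 = S @ R, S @ x + s, points @ S.T + s

    print(np.allclose(bearings(R, x, points), bearings(R2, x2, points2)))  # True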
Because the reference frame moves, your odometry solution moves with both the reference frame and the movement of the robot, and that's what means you end up with information that isn't so useful as a measure of how far you've moved, even though it's still compatible with the overall problem.

I also wanted to say a couple of things about why I think this problem is hard. If you go into the robotics literature, what you'll find is that SLAM is considered a solved problem; it's just that a SLAM algorithm that actually works reliably on aerial vehicles sort of doesn't exist — although maybe the T265 is actually getting there, and we've got Jim along, so he can talk more about that later. The question is why academic SLAM algorithms don't work on these vehicles, and it has a lot to do with what academics are measured by. There are metrics that are standardly used to evaluate the performance of SLAM and visual odometry algorithms, and they're not very good at measuring the robustness of an algorithm running on crappy hardware. Academics developing these algorithms are trying to get something with very good performance, but they're not necessarily trying to optimize the algorithms to run on these sorts of open-source and hobby platforms — and that's what I'm interested in trying to do in this piece of work, working with this community.

Some of the things that come up — I won't spend long on these slides because I think it's all obvious. Clearly we're going to want relatively cheap and simple sensors, so we're going to get motion blur, low dynamic range, all sorts of things with rolling shutters, we don't know exactly when the photo was taken so we've got shutter-timing issues, we've got bad calibration and probably low frame rates. We've got small field-of-view vision sensors — fisheye lenses are heavy, and I think this is something where we really need to think about the hardware, because you probably do need a large field of view to do this reasonably. (What do you mean by "heavy"?) Fisheye lenses need to bend the light a lot, so generally a good-quality, or even reasonable-quality, fisheye lens is made of glass, and they tend to be quite heavy, whereas most of the cheap cameras have plastic lenses. Does that make sense? (I didn't know if you meant conceptually heavy, as in expensive to deal with, or physically heavy.) No, I meant physically heavy — grams rather than kilograms, but the lens on one of these things is typically heavier than the camera. I think that would be true to say; that's off the top of my head, but I believe it, and it's always good to believe your own statements.

Then there's noise in pixel locations — features aren't very well-defined objects, so exactly how you talk about a feature is a really important thing. The vehicles move fast, in particular in angular velocity, so you've got this real issue of how you track features through a fast correction of attitude — and that couples into one of the weaknesses of most VO algorithms: the standard failure mode of most VO algorithms is rotation rather than translation; when you get high rotation rates, that's the normal failure mode.
When objects move in the environment, then of course you can no longer treat them as constant. Aerial vehicles have lots of occlusions — horizons, ground, trees — and a huge range of depths: you might be flying only a few metres above the ground but looking at a horizon that's kilometres away, and that's a problem. And you need to run all of this on limited processing power.

The big issue is the one I mentioned earlier: data association. This is an interesting point about existing algorithms. Most classical algorithms you take off the shelf use optimization methods to get their performance — and when I say performance, they're really looking at minimum root-mean-square error of whatever measure they're using, whether it's map quality or the trajectory. As a consequence, they're really optimizing locally around the true solution, and mismatches are very, very bad for those algorithms: they generally operate on some sort of linearization principle and some sort of Gaussian-noise assumption, and bad data association is not Gaussian — in particular, it tends to break these optimal-type methods. The classic approach in the literature is to take your data-association matches and run a whole bunch of algorithms to remove the data-association errors: you treat them not as part of the optimization but as a separate data-processing step, where you remove the bad data and then run an optimization algorithm afterwards. That works quite well, but the problem is that a lot of algorithms might spend 60 to 80 percent of their time just removing data mismatches, depending on the situation — and for us, with potentially quite bad sensors, data mismatch is even more likely to occur. Again, that figure is just a guesstimate based on some of the work we've been doing.

I'll finish up really quickly with a simulation we did to demonstrate this. It's a very simple little example — a simulation — where we're driving a mobile robot around in a square: it drives along, up, along and back down, and it's measuring points. For each of those markers, the plus is, I think, the true value and the cross is the best estimate the filter managed to acquire for it. We did a series of experiments where all the measurements are noisy — we add Gaussian noise — but we also add data-association noise. What you can see here is a very simple filter, working on a similar principle to the old DCM filter I derived about 15 years ago — a very simple first-order gradient-based observer — and this one here is an optimization-based Kalman filter. What you can see is that by adding in the data-association errors, the Kalman filter starts to fail, because each one of those errors adds non-Gaussian noise — something outside the noise model the Kalman filter assumes.
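A minimal sketch of the kind of data-association corruption used in an experiment like this (illustrative only, not Rob's simulation code): with some probability, a measurement simply gets attributed to the wrong landmark, which produces gross, non-Gaussian errors rather than the zero-mean noise an EKF assumes.

    import numpy as np

    def corrupt_associations(bearings_by_id, p_mislabel, rng):
        # bearings_by_id: {landmark_id: measured unit bearing}.  With probability
        # p_mislabel, swap a measurement with another landmark's, mimicking a
        # false feature match (a gross outlier, not zero-mean Gaussian noise).
        out = dict(bearings_by_id)
        ids = list(out)
        for i in ids:
            if rng.random() < p_mislabel:
                j = ids[rng.integers(len(ids))]
                out[i], out[j] = out[j], out[i]
        return out

    rng = np.random.default_rng(0)
    clean = {}
    for k in range(30):
        v = rng.standard_normal(3)
        clean[k] = v / np.linalg.norm(v)            # unit bearing to landmark k
    noisy = corrupt_associations(clean, 1 / 30, rng)  # roughly 3% mislabelled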
This is just to demonstrate the same result. The bottom graph shows an experiment where we vary the probability of mislabelling a point: down here everything is perfect — there's still noise in the system, but all the data association is correct — and up here one in five points is mislabelled. We've run an EKF, in blue, and this simple first-order filter based on symmetry principles, in green. When things are perfect — zero percent data-association error — the Kalman filter, here on the right, clearly outperforms what we're doing, and that matches what we expect, because it's the correct mathematics for that particular assumption. But when we get up to three percent — one in thirty data-association errors — you start to see the Kalman filter's performance degrade, because we're feeding it errors that don't match its assumptions. That sort of robustness is something I really think needs to be tackled in the algorithms we deploy in these scenarios. The other thing is that the particular filter we're using is very low-compute: because it doesn't carry around a Kalman gain matrix, we can run up to 200 features without imposing too much on the processing time, as both memory and processing time scale roughly linearly with the number of features.

All right, I've taken more time than I was planning, so I apologize for that. What I thought we could do is pass across to Ryan to quickly show you the environment we're trying to set up — with Tridge's assistance we're setting up AirSim with SITL — and I thought there was an opportunity for him to talk about that and show what we're trying to do so that we can test some of these things. Then Peter will talk quickly about the algorithms. I've taken too long, so I'll get out of the way in a moment; we've also got Jim along, so hopefully we can then have an open discussion. Ryan, are you there?

Yeah, I'm here — hello everyone. How do I stop sharing my screen... — I've just turned off your sharing for you there, Rob. — Thank you. — All right, is that showing my screen now? — Yes.

Okay, so I'll be pretty quick with this and get to the demo faster. Essentially what we've been working on is trying to recreate a realistic environment, so that when we develop our visual odometry algorithms we can integrate them a little more easily into the ArduPilot codebase. It's a two-pronged approach with these simulation environments: if we can replicate the physical setup of ArduPilot talking to the drone, we get easy integration; and there's also a lot of time and risk usually associated with field robotics — hardware failures, and when you go out to collect data it often takes a lot longer than you expect — so speeding up the actual algorithm development is the other goal, which I think we fully realized in the last two days while trying to set this up properly.
Essentially it's a pretty simple setup: ArduPilot is talking to SITL and sending rotor controls to AirSim, and AirSim, through its MAVLink C++ class, is sending sensor messages back to ArduPilot. AirSim essentially has three parts: AirLib is responsible for the physics engine and the sensor models — so you get IMU, GPS, barometer and the camera — plus the different vehicle and control models; Unreal provides the environments we'll be using, along with the rendering and training packages and things like that; and the MAVLink comms are set up to send messages to MAVLink devices.

So I'll jump straight into the demo. Just quickly — is this showing a JSON? — Yes. — This is the config for AirSim; it's how we parameterize the simulation. Down here is the ArduPilot-to-AirSim interface: here we're using an ArduCopter — you can also use an ArduRover, and there are multi-vehicle simulations too. This is the port for receiving messages, and this is where we're sending out rotor controls. On the AirSim side we're setting up a forward-facing camera, and you get to control things like the resolution, the field of view, exposure speeds, and so on, and you can also inject noise. There was actually a lot of playing around in this setup — turning off compression, setting the view mode to no-display — because there was a problem with the frames per second: the AirSim API is a little bit slow, so turning off all rendering except what you need was necessary.

Let me get a terminal here — is it showing a terminal? — Yes. — Great. First I'm going to run this precompiled binary, again for a performance boost — I'll talk about environments afterwards — then I'm going to run sim_vehicle, and lastly this is the AirSim script that pulls the image data from the forward-facing camera. This is what we're seeing in the Landscapes environment — is that coming up? — Yep. Lots of snow. — Okay, great. You can also see what the console is showing from ArduPilot and what the map is showing, so everything is working smoothly. Let me go back to the forward-facing camera: this is running at the moment at 14 frames per second, and I'm just going to take off here; I'll do it quite slowly. Is that moving for anyone? — Not yet, it doesn't seem to have taken off... it said armed... yes, now it's taking off. — Great. Is it lagging, or what's it like? — It's fine, it's just displaying the takeoff; it's probably 30 metres above the ground now, you can see the valley. I'm going to put it into loiter and start moving forward. — It looks quite good, framerate-wise. — Okay, cool — yeah, the frame rate was definitely a problem we were facing. This is on the kind of box I've got here — it's just an HP Envy — so there's definitely room for improvement in that regard. But as you can see, we're flying along.

The way we're collecting these images is that we pick up the timestamp associated with each one, and then we can ultimately feed them straight into an image algorithm. What we've done for the purposes of this, and what Peter will talk about, is to save the images and their timestamps; we can also run the IMU API, so I'm collecting that at the moment too — it runs at about three kilohertz, but I've brought it down to about 300 Hz for Peter's stuff.
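A minimal sketch of what a capture script like the one Ryan describes might look like, using the AirSim Python API (the camera name "0", the fixed-length loop and the record format are assumptions here, not his actual script):

    import airsim
    import numpy as np

    client = airsim.MultirotorClient()
    client.confirmConnection()

    records = []
    for _ in range(100):
        # Uncompressed scene image from the forward-facing camera (assumed name "0").
        resp = client.simGetImages([airsim.ImageRequest(
            "0", airsim.ImageType.Scene, pixels_as_float=False, compress=False)])[0]
        img = np.frombuffer(resp.image_data_uint8, dtype=np.uint8)
        img = img.reshape(resp.height, resp.width, -1)   # 3 or 4 channels, depending on AirSim version
        imu = client.getImuData()                        # gyro, accel and a timestamp
        records.append({
            "image_time_ns": resp.time_stamp,
            "imu_time_ns": imu.time_stamp,
            "gyro": imu.angular_velocity,
            "accel": imu.linear_acceleration,
            "image": img,
        })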
And that's pretty much it: you can fly around, you can also plug in a remote control and use it that way if you'd like, and you have a range of different environments. There's the standard Blocks environment, which everyone gets access to — there's a setup on the ArduPilot website for getting Blocks up and running. It's not exactly trivial getting these Landscapes and other environments running, but with the AirSim maintainers starting to move a bit, I think we'll get easier access. Rob has just linked us to a paper that was recently released on using this for drone racing, and if those binaries are released it'll be easier to get hold of these environments and use them. As you can see, if I switch to the map, the copter is moving along merrily.

The limitations at the moment, on my computer, are really just the frame rate — it's not ideal. We had it at 30 frames per second in the Blocks environment, but with the new shaders and this richer environment it drops down a bit. There are also some things around environments and licensing: if you want to release the environments to the ArduPilot community, the licence does say you can share environments you buy with your team — so I guess we can just say the whole team is the ArduPilot community, hopefully, and we'll see how that works. Cool, that's about it from me. — Thank you. I think we might just zip on and get Peter to show his solution, so we can get through to Jim and open questions. Peter, can you...? — Yes. — Cool.

I'll share my screen — all right, can you see my slides? — Mm-hmm. — Right. I'll skip right past the first demo in the interest of time; I just want to talk a bit about how it all actually fits together. As Rob talked about, we have this VO or V-SLAM algorithm, and what it does, from this perspective, is take in a linear velocity, an angular velocity and a collection of image features that are being tracked, and spit out a vehicle position and attitude — a pose — and some environment points. But to actually use this in a real system, the questions you need to answer are: how are you going to get image features, and how are you going to get linear and angular velocity? Depending on your vehicle, these are really non-trivial questions.

Image features are relatively straightforward: you can use an image feature detector — there's a whole literature on how to do that — and all it does is detect points of interest in the image. This is an image from one of Ryan's simulations, and I've marked out a bunch of image features being used in my algorithm. All you need is an image; and because we're doing this continuously, we also don't want to detect features that we already have as part of the state.
That's a detail of the implementation. Then you need to track features. I'll move this out of the way — in this picture, the red circles are where the features are now and the yellow lines are where they've come from. To do this you need the previous frame, the current frame, and the features as they were in the previous frame of the video. One of the difficulties in this process is actually determining when you've lost a feature, and that can happen in a number of ways. First, if a feature goes outside the frame, you're no longer able to see it — it's gone. Another possibility is that the feature moves behind something in the image, which is generally a bit harder to detect. And another problem is that features change — because of lighting changes, or because the local image around the point changes — and that's also more difficult to detect. So that answers, in a way, the question of how you get features and how you track them — there are a bunch of standard solutions for those as well.

But how do we get velocities? We built a velocity estimation system which uses the motions of the features — those little yellow lines showing how the features have moved between images — to estimate the velocity of the vehicle between two frames of the video. If you have measurements of linear and angular velocity you can use those as well, and if you only have one or the other, you can still use it to improve your estimate of both. For example, on a quadrotor you typically don't have a very reliable linear velocity measurement, but you've got quite a good angular velocity estimate from the IMU, and that can help you get a better linear velocity estimate.

This is a block diagram of the whole system as it currently stands in my code. The key thing to notice is that ultimately we don't use any linear or angular velocity measurements at the moment: we're just taking in an image and eventually spitting out a vehicle pose and environment points. To drive that home, I'll show a demo — hopefully everyone can see the video; can someone confirm? — Yep, can see it fine. — Great. In the top left you've got the actual video stream from Ryan's simulated environment, with the red points and yellow lines being what's tracked; in the main screen you've got the estimates of the environment points and the robot pose, as well as its trajectory. At the moment I haven't added a ground-truth trajectory to this visualization, but I hope that for the most part it's easy to see that the tracking matches reasonably well the kind of motion you'd expect from the window in the top left. — You don't have scale, correct? — Yeah, that's the other problem with trying to draw the true trajectory at the same time as the estimate from the VO algorithm. I've just spun it around a little so you can all see what the global trajectory looks like. — If you got ArduPilot to fly a big figure-eight or something, that would make it easier to match things up. — Yeah, for sure, that's something we should do.

But actually, let me talk about the issues in the algorithm at the moment. First of all, feature detection and tracking are the most expensive operations we're doing in our system, because you need to do image processing, which is a lot of repeated operations on little bits of the image.
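For concreteness, a detect-and-track loop of the kind Peter describes often looks roughly like this in OpenCV (a generic sketch, not his implementation; the parameter values are arbitrary):

    import cv2
    import numpy as np

    def detect_new_features(gray, existing_pts, max_corners=200):
        # Mask out neighbourhoods of features we already track, so new detections
        # don't duplicate points that are already part of the filter state.
        mask = np.full(gray.shape, 255, dtype=np.uint8)
        for p in existing_pts:
            cv2.circle(mask, tuple(int(v) for v in p.ravel()), 15, 0, -1)
        pts = cv2.goodFeaturesToTrack(gray, maxCorners=max_corners,
                                      qualityLevel=0.01, minDistance=10, mask=mask)
        return pts if pts is not None else np.empty((0, 1, 2), np.float32)

    def track_features(prev_gray, gray, prev_pts):
        # Pyramidal Lucas-Kanade tracking between the previous and current frame.
        # 'status' drops features that were lost: they left the frame, got
        # occluded, or changed appearance too much to match.
        next_pts, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, prev_pts, None)
        good = status.ravel() == 1
        return prev_pts[good], next_pts[good]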
You can reduce the cost of that by using a lower-resolution image, but then you have a lower-resolution image, so it's less accurate — it's a trade-off. Second, the velocity estimator we've got is not very robust to outliers — this is the problem Rob talked about before; we're using a reasonably traditional algorithm for the velocity estimation, and it doesn't respond well to features that are tracked badly. Third, the equivariant algorithm — the actual algorithm we've developed — doesn't do as much as we'd like to improve the pose estimates. And the fourth point really comes back to the gauge-invariance problem Rob talked about: you can choose any reference frame you like and it's all valid, so it's hard to create an algorithm that does a really good job of pose estimation when everything is relative to an arbitrary reference frame. That's why the local estimation of the algorithm — the local velocity estimation — is quite a bit better than the global estimation of the full trajectory. So, on what we just said about flying a figure-eight: I'd expect the local curve the robot flies, according to our algorithm, to be reasonably accurate, but I wouldn't necessarily expect it to be very accurate at actually drawing out the figure-eight. Okay, thanks — I'd take questions, but I don't know if we're ready for that yet.

Thank you. I'd also invited Jim along. Jim presently works for Intel — I believe he's actually about to become a free agent — but he had a lot to do with the development of the RealSense. I haven't had a chance to exchange many emails with him over the last week, so Jim, do you want to have a quick chat? What would you like to do?

Hi — I thought I could tell you a little about the T265, if you haven't seen it before. Let me share my screen — can you see this? — Great. The T265 tries to address all the issues that were just discussed. It's got fisheye cameras to deal with the field-of-view issues, and it's got an IMU so we can get scale: we have stereo cameras to get scale for close things, and the IMU to get scale for large things. The problem with mono cameras in general is that you can't really figure out the scale, but with stereo we can, and with the IMU we can. We started off originally on the iPhone 4S back in the day — this algorithm goes back that far — and it was a mono, low-resolution image with an IMU; it was the first camera that had that. Over time this has grown to work with stereo, to work at faster rates, and now to be a fully embedded solution. We've worked really hard to make it low latency — it's only three or four milliseconds of latency — with 200 Hz output, which is the rate of our gyro; we run the accelerometer at 63 Hz and the images at 30 Hz currently.
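For reference, reading that 200 Hz pose stream from Python looks roughly like this (a sketch based on the standard librealsense pose example; not something shown in the talk):

    import pyrealsense2 as rs

    pipe = rs.pipeline()
    cfg = rs.config()
    cfg.enable_stream(rs.stream.pose)        # 6-DoF pose stream from the T265
    pipe.start(cfg)
    try:
        for _ in range(200):
            frames = pipe.wait_for_frames()
            pose = frames.get_pose_frame()
            if pose:
                data = pose.get_pose_data()
                # Translation and velocity are in metres and metres per second;
                # tracker_confidence reports how much the device trusts itself.
                print(data.translation, data.velocity, data.tracker_confidence)
    finally:
        pipe.stop()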
Let me give you a demo of the actual algorithm running — can you see this one now? This is the same sort of setup we saw before; this is going to be a drone flying. You can see — I don't know if you can see this in real time — these are the features we're tracking, exactly the same way Rob was discussing. We actually have stereo, so I can switch back and forth between the left and the right; it might be a little slow, but you should be able to see it. We use many fewer features than Rob was discussing, because it's an embedded, low-power system, and we have to deal with all the same issues with matching features and so on. Rotation is challenging, especially if you rotate quickly — so if you buy this device, I recommend not rotating; we're much better if you don't, you'll have less drift and so on. You can see we're reconstructing the path pretty reasonably here, and that's pretty much the summary: it's exactly the same problem Rob was discussing. We track little features and integrate them over time.

We currently use a big Kalman filter — a monster EKF where everything goes in: the IMU goes in, the features go in, and so on. We do have the scaling problem: as the state gets bigger and bigger the cost scales with N cubed, which is horrible, so we're limited computationally, and over time we've used fewer and fewer features. When you run this on an x86 you can run hundreds of features; on the embedded system we run significantly fewer than a hundred states total, including the features and the IMU states. So we're estimating IMU biases, we're estimating the position and orientation of the device itself, and we're estimating the locations of features we've detected over time — there's a bit of a mini bundle adjustment going on, but it's all inside the EKF. It's very similar to the systems you're probably already running to estimate your attitude, position, linear and angular velocity, and so on — we have all of those in there, all in the big Kalman filter; we're also estimating acceleration as well as the rotational and linear velocities.

And that's pretty much it. We've done our best to output this over USB as quickly and with as low latency as we can. We don't do optimization, as Rob was discussing, because that's slow and latent in general; we process things as fast as we can. The algorithm is generally asynchronous: the IMU and the cameras can run at different rates, but they're at least all on the same clock. One of the problems when you build this stuff yourself is getting accurate IMU data and accurate time synchronization, and if you use this system, at least that part is solved — you don't have to use our algorithms; instead of taking our pose you can just take the raw data and run your own algorithms on it, but at least you get nice time synchronization. A lot of people try to synchronize everything so that the IMU rate is the same as the image rate; we don't do that, we run completely asynchronously.
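Rough bookkeeping for that N-cubed point Jim mentions (illustrative numbers only; the 15-state base and the 3 states per landmark are assumptions, not the T265's actual state layout):

    def ekf_size(n_features, base_states=15):
        # base_states: pose, velocity, IMU biases, etc.; each landmark adds 3 states.
        n = base_states + 3 * n_features
        return n, n * n, n ** 3   # state dim, covariance entries, ~cost of a dense update

    for n_features in (20, 50, 100, 200):
        dim, cov, cost = ekf_size(n_features)
        print(f"{n_features:3d} features -> state dim {dim:4d}, "
              f"{cov:7d} covariance entries, ~{cost:.1e} ops per update")

    # Contrast with the filter Rob described, which carries no Kalman gain or
    # covariance matrix, so memory and compute grow only linearly with features.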
Cool, super. I don't have anything more to say, and we've got Jim here, we've got Peter and the guys — Tridge, do you want to see about questions?

Yeah, a couple of questions have come up in the chat. I'm also very interested in the possibility of us running an experimental system — a new system like Peter's — in parallel with the existing algorithms on the T265. Could we get the image feed? — We can; it's 30 frames per second. — And that comes out over USB, is that correct? — USB 3, if you want to get everything. — And it's time-stamped? — Yes. — So we should be able to run two different algorithms on basically the same image data and get a comparison, because the T265 is the baseline in our community, the reference we're all comparing against, and the experimental stuff — that's great. And are there ways in which ArduPilot's extra sensors — because ArduPilot has bias estimation and other things it can do, since it has additional sensors — could provide additional information to the T265 to assist it?

Yes, we support odometry input currently: for example, for wheeled robots we allow you to send in wheel velocities to reduce the drift, especially around bumps, when you rely on the accelerometer as much as we do. It matters significantly if you hit bumps — we just haven't solved that; we haven't spent much time on it, and we often blow out if you go over bumps. From the drone perspective: we get scale from the stereo when you're close and from the IMU, but when you're up high the stereo isn't useful — the range is so large compared to the baseline — so you only get scale from the IMU, and you really have to work on vibration isolation to keep the accelerometer measurements less noisy, because we rely on them so fundamentally. But to address your earlier point, the best thing you could do is record our raw data and then run all your algorithms on it and compare our pose to your pose. We're not perfect in any way — we'll have plenty of problems — but we're also running mapping on top of this: if you look to the left, look to the right, then come back and see the same features again, we will recognize that and attempt to make your pose similar to what it was the last time we saw them. With the pure odometry Rob just described, if you look left, look right, lose all your features and look back left, the odds of you being back at exactly the same pose are essentially nil — it's more or less Brownian motion every time you come back. We've put quite a bit of effort into creating a map of all the features we've seen over time. Of course we can run out of memory — it's an embedded system, we don't have a ton, and the algorithms aren't particularly optimized at this point to deal with a large map — but if you stay within a small area we can make your pose pretty robust to going in circles multiple times, though we'll eventually lose track.

Do you output the map? — Yes, we do actually output the map as well. We currently haven't opened up the format; that's an internal discussion we're having, so we'll see if that ever makes it out. We currently output a binary blob — I guarantee Tridge could reverse-engineer it in 15 minutes; it's not complicated, we have not obfuscated it. It has what you'd expect: image patches, poses, nodes and edges — there's no magic there.

So, regarding the angular velocity limit — you're saying don't rotate too fast — is that a constant, or does it depend on the environment? What we do with our optical flow, for example, is that the EKF tells the flight controller, the navigation controller, what limits it should apply to navigation to allow the optical flow to maintain lock. Could we do something similar here?
No — we don't provide that sort of feedback, and really the limit is there because our features are not rotationally invariant. We just keep creating features over time, and the more new features you create, the more drift you're going to have. If you could track the same features while you sit there and spin — which you obviously should be able to do, and which we would be able to do if we used rotationally invariant features — you would never drift; you'd just sit there and spin. But because we don't do that — and you might have seen this in my video — when the drone rotated we ended up losing the features after 30 or 45 degrees. That's okay, we find new ones and keep going; it's not a problem fundamentally. Every time you take a step forward, or change scale, the features you see when you start on the ground aren't the same features you see when you're 30 metres up — we constantly see new features as we go through the environment. But the more you have to integrate the odometry equations forward and rely on that, the less accurate we're going to be: if you fly low over the ground and try to add up the motions, it's going to be horrible; if you go up high and fly where we can see the same features the whole time, it's going to be great.

I was just wondering about the scaling issue as you get higher: if you're looking down at the ground and you've got a laser rangefinder giving you range to ground, could you incorporate that information into the solution? — Yes, we would incorporate that into the solution if you could give us the range. The thing is, you don't always know which point the rangefinder is aimed at. — We could assume it's aligned with the camera centroid. — Then that would be perfectly fine: if you could say that over this image patch I have this range, and I had features in that region, we would a hundred percent be able to use them. The thing is, it's a big old EKF — you can throw in any kind of measurement you can think of; you could add in anything else, and we add things in all the time. We've tried using depth from a depth sensor — Intel makes these little depth cameras, and we're coming out with a lidar depth sensor momentarily, a MEMS rotating-mirror depth camera, which is better — you can use just a mono camera plus that and apply these algorithms right on top of it. We're not particularly wedded to our features or our tracker or anything like that — we just have a default set — and adding in new things like visual odometry or wheel odometry is relatively easy, but we have to be sufficiently motivated to do it.

I think what I'm hearing is, in terms of mounting this camera on the drone, should we just stick it on with some velcro, or hard-mount it? — Because we give you all the data — you get the IMU data — you can actually look at the noise and decide: look, maybe one and a half g of noise on the accelerometer is too high; we should consider isolating it, putting some sort of gel mount on there, something like that. Intel actually makes drones as well — we have ludicrously large drones that we sell commercially — and the teams there work really hard on getting the noise down
on the accelerometers and so on, just because we rely on them so heavily. I mean, the cameras are effectively mono at that height, and you just can't get scale from the image. If you want to recover your path reasonably, with metres as its unit, then you need to measure metres, and the only sensor that measures metres is the accelerometer — metres per second squared. That's the intuition: cameras just don't measure metres, so you can't accurately recover your trajectory without measuring them.

It is important, though, to think about the use case for ArduPilot. When you're flying in open space and reasonably far from the ground, it's almost always going to be a situation where you can have GPS and get real scale through GPS. — Yes, and that's the ideal location for GPS: no multipath, nothing like that. — Exactly. The whole point — the use case I see for ArduPilot — is to extend the ability of all of these systems to operate when you're landing in a forest, or going into an urban canyon, or any of these situations where GPS becomes compromised and you really need the precision to fly close to a wall or a tree and not have a flyaway. — Yeah. One of the things that's been on my list personally is to integrate GPS measurements directly into our filter; they're pretty low-dimensional compared to the other things we're doing.

The things I'm keen on doing with Paul and co in this community are really about integrating the VO into the existing EKF frameworks, but in a seamless way, so we're not having resets in the VO that put offsets into the EKF. — That reset concept is the fundamental issue everyone has. Generally, if we blow up — and we currently do, it's sad — there's no good way to restart other than starting from scratch. — So do you output a reset signal? — At the moment, when we're lost, we output NaNs, which you can detect with some algorithm — I wish it weren't NaNs, but it is. Because we have this map, though, we can keep the map in memory, so when you restart we can relocalize you. But at any rate, with an EKF you have no choice when it diverges. — When does it diverge? — We actually don't really have problems with diverging when you're moving; we almost never diverge then. When we do diverge, it's when you're sitting still: the covariances go to zero, our eigenvalues go negative, and we blow up — that's really where we get into trouble. — Right. — And the way around that is a square-root filter; we're working on it, but it's hard to get it to fit in real time. — Right, that guarantees your eigenvalues stay positive, the way the EKF expects them to be. — Our algorithm — in order to get this to run in real time, we care about every stage of it, every matrix operation; we don't implement the EKF in a naive manner, we try to optimize everything and reduce the amount of work we do at all costs, and all of that goes out the window when you switch to the square-root filter. So we'll see if we ever actually get it to fit in real time, so that we can switch to it here.
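A minimal sketch of why a square-root formulation helps (generic numpy, assuming a linear prediction step; nothing to do with the T265's actual implementation): by storing a factor S with P = S S^T and rebuilding it with a QR decomposition, the implied covariance can never lose positive semi-definiteness to rounding the way a directly propagated covariance matrix can.

    import numpy as np

    def sqrt_predict(S, F, Q_sqrt):
        # Prediction step in square-root form: M M^T = F P F^T + Q.
        M = np.hstack([F @ S, Q_sqrt])
        # QR of M^T gives an upper-triangular R with M M^T = R^T R,
        # so R^T is the new factor of the predicted covariance.
        _, R = np.linalg.qr(M.T)
        return R.T

    n = 6
    rng = np.random.default_rng(1)
    S = np.linalg.cholesky(np.eye(n) * 0.1)              # initial factor, P = S S^T
    F = np.eye(n) + 0.01 * rng.standard_normal((n, n))   # toy state-transition matrix
    Q_sqrt = 1e-3 * np.eye(n)                            # factor of process noise
    S = sqrt_predict(S, F, Q_sqrt)
    # The implied covariance S S^T is non-negative definite by construction
    # (its smallest eigenvalue is >= 0 up to machine precision).
    print(np.linalg.eigvalsh(S @ S.T).min())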
One thing that would help for all of our use cases: can we get the parameters, the image parameters and such, so we can try to set up a sim with exactly the RealSense flying around in it?

What do you mean by the parameters?

The fisheye lenses, the distortion. I mean, we can measure the distance between the cameras, but your actual lens...

I think we have a spec for that; I'm pretty sure it's in the extrinsics. We offer the extrinsics as one of the outputs from the SDK, so the location of the IMU and the locations of the image cameras are all there. We do provide distortion parameters, which are not good, which I discovered from drone flights actually; I recommend recalibrating those. But in general we have all of that data, so you should be able to create the exact thing and run it on our recorded data, which is what I would recommend. Everything we do internally is based on recorded data: we try to find as many sequences as we can record where we have ground truth in some way, so that we can judge our progress. It's the equivalent of our unit tests, running across all the sequences that we have recorded over time.

There's a question here. Sure.

Hi, I'm Randy, one of the multicopter developers. I put the T265 on a copter last week and flew it around; we had a GSoC student from last year and also a couple of other developers working on it. It's beautiful actually, the performance is fantastic, just absolutely rock solid with the T265 on there, much better than with the GPS; it stays on position. I'm just trying to help get it robust, so I wanted to make sure I understand how to properly detect when the T265 is losing its ability to provide a position or an attitude, and I'd like to get your advice on the steps.

The most important things are: don't rotate about the camera axis. It's not going to die because of it, but if you're looking out this way, don't rotate this way. So if you're downward facing, for example, don't spin the drone; if you can avoid it, just choose not to.

Is there a rate that I shouldn't spin at?

The vehicle can handle it just fine, it's just that we're not as robust with it, so I'm only giving you minor optimizations you can do. Also, make sure you have damping on the accel, make sure you don't have additional noise on the accelerometer. You can look at the raw data in the little viewer we provide, or record it and look at it in a graph: if you're getting a g of noise in the accelerometer, that's too much. See what you can do damping-wise to get it down, to a fifth of a g maybe; I don't have a particular estimate of where you should be, but you should care about it. The other thing is motion blur, which is our current issue: we can't get our exposure quick enough that we don't get motion blur, so quick rotations in the other direction, where you rotate this way, give us motion blur. Those are our key issues. The exposure can be fixed with a little bit of effort; it just hasn't been our focus yet to get the exposure times down low enough.
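To make the vibration advice above concrete, here is a small sketch, assuming raw accelerometer samples have been logged to a CSV in m/s^2; the file name, column layout, and the 0.2 g target (the loose figure mentioned above, not a hard spec) are assumptions for the example.

    # Rough vibration check for a T265 mount: estimate how much the
    # accelerometer magnitude deviates from 1 g. The ~0.2 g target is the
    # loose figure mentioned in the discussion, and the CSV layout
    # (ax, ay, az columns in m/s^2) is an assumption for the example.
    import numpy as np

    G = 9.80665  # standard gravity, m/s^2

    def accel_noise_g(samples_ms2):
        """samples_ms2: (N, 3) array of raw accelerometer readings in m/s^2."""
        mag = np.linalg.norm(samples_ms2, axis=1)
        # RMS deviation of the magnitude about 1 g, expressed in g.
        return np.sqrt(np.mean((mag - G) ** 2)) / G

    data = np.loadtxt("accel_log.csv", delimiter=",")  # hypothetical log file
    noise = accel_noise_g(data[:, :3])
    print(f"accel noise ~ {noise:.2f} g")
    if noise > 0.2:
        print("Too much vibration - improve damping before trusting the VO output")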
We were meant to work indoors; the system was designed for AR and VR contexts, where you're going to put this thing on a headset. So we work in indoor environments with 50 Hz and 60 Hz lights that can cause beating in the images, and we have some anti-50-Hz filters, essentially, to get rid of that: we set the exposure to a multiple of the 50 Hz period to get away from it, so it works indoors in Europe with fluorescent lighting. One day we will turn that off, and then we could get our exposures down low enough to have fewer problems with motion blur, but currently that's our issue.

Is that something that can be done manually? Can we just disable it?

Not on the drone. I mean, there are settings, but they don't do anything; it's currently broken.

One more question, just to finish off. The other thing I heard you say that might help is providing odometry input into the camera. So if we have velocities, maybe from GPS in the case where we have GPS, we could provide that and it might improve things?

Sure, if the GPS is better than our own velocity estimates. The places where it really helps are things like wheel odometry, which is actually closer to measuring velocity; I think that's more useful than the GPS per se. The GPS is good on absolute position but not so much on velocity; I would expect us to be more accurate than the GPS on velocity.

I said GPS, but I guess I actually meant our EKF, which would be the local IMU and the GPS combined together.

I see, maybe. You start getting dicey when you're mixing Kalman filters, that's my general take overall. That's why we throw everything into one big Kalman filter: it makes it complicated and slow, with a monster covariance, but it means we don't have to deal with all the issues of passing things between filters where you don't know how much to trust each measurement, and so on.

I think this is something the community is going to have to deal with, because at the moment we're looking at VO as an add-on to an existing solution. Either we need to move everything to VO, like you have, but that's really not a good solution for the community, because lots of people just want to fly...

We can run with that: we can run in the three-DoF mode and get orientation only. One of the nice things about our setup is that the size of the Kalman filter changes dynamically all the time: as you pick up images, features come and go at different locations, and we can scale all the way down to just rotation. If the system is designed well it can work for these other use cases as well. We can even turn on estimation of the image intrinsics and extrinsics and the other tuning parameters; we don't usually leave them on, we turn them on a bit at the beginning to deal with people banging their cameras around. The point is that once you have this generic framework where your EKF can change size in an efficient way, you can throw in these other things and work in these other situations; you just have different measurements depending on what comes in. So I would encourage you, rather than having these uncoupled systems, to try to put them together into one filter that supports different sets of measurements.
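The dynamically sized filter described above can be illustrated with a toy numpy sketch. This is only the bookkeeping idea (appending and deleting blocks of the state vector and covariance as landmarks come and go), not the RealSense implementation, and all dimensions and values are made up.

    # Toy illustration of a dynamically sized filter state: feature blocks are
    # appended when a new landmark is detected and removed when it is lost,
    # and the covariance grows and shrinks with them.
    import numpy as np

    class DynamicState:
        def __init__(self, core_dim, core_cov):
            self.x = np.zeros(core_dim)          # core states (pose, velocity, biases, ...)
            self.P = np.array(core_cov, float)   # covariance of the core states
            self.feature_slices = {}             # feature id -> slice into x / P

        def add_feature(self, fid, x0, P0):
            """Append a new 3D landmark block with initial estimate and covariance."""
            n = self.x.size
            self.x = np.concatenate([self.x, x0])
            P_new = np.zeros((n + 3, n + 3))
            P_new[:n, :n] = self.P
            P_new[n:, n:] = P0                   # assume no initial cross-correlation
            self.P = P_new
            self.feature_slices[fid] = slice(n, n + 3)

        def remove_feature(self, fid):
            """Delete a lost landmark's rows/columns and reindex the remaining ones."""
            s = self.feature_slices.pop(fid)
            keep = np.r_[0:s.start, s.stop:self.x.size]
            self.x = self.x[keep]
            self.P = self.P[np.ix_(keep, keep)]
            for k, v in self.feature_slices.items():
                if v.start > s.start:
                    self.feature_slices[k] = slice(v.start - 3, v.stop - 3)

    # Example: a 9-state core plus two landmarks, then one landmark is lost.
    st = DynamicState(9, np.eye(9) * 0.01)
    st.add_feature("f1", np.array([1.0, 2.0, 5.0]), np.eye(3))
    st.add_feature("f2", np.array([-0.5, 1.0, 4.0]), np.eye(3))
    st.remove_feature("f1")
    print(st.x.shape, st.P.shape)   # (12,) (12, 12)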
of the engineering of the drone systems.

One of my goals for this was to make it usable in the loop for real-time control. We worked really hard to make it low latency with a high output rate. We have a couple of teams at Intel doing research on drones in their lab, and they're saying our 200 Hz is not enough for them; they want us to go to 500, although their controllers do work with our 200. I heard you say 3000, which we're not going to get to anytime soon; I believe we could get to 500, but I'm not sure about 3000. Either way, I think using it in the control loop is something you should consider.

Go ahead. I was just wondering, can you rigidly couple the depth camera, the D435, with the T265 to get a more precise spatial alignment and tracking?

Yes. We actually provide an STL file for a mount so that the extrinsics are just known and you can enter them in; we give the extrinsics if you use our mount, which you can 3D print to attach the two together. Then you can use the depth to create a 3D model relatively straightforwardly. How accurate it will be overall I can't say, but it will be pretty good, because you can combine the pose from the T265 and the depth from the D435, for example.

That's what I was thinking; it could overcome some of those issues.

Yes. In particular, we have modes internally where we can use the depth from the D435 in our filter, but it turns out the depth arrives pretty late, so we generally don't even export that. We're trying to make this a real-time system; as I said before, our goal was for your actual path planner to be able to use us in the loop. That was our vision, and to do that we can't wait around for the depth to come over USB or however we're talking. At least the last time we tried it, the latency was too high for us to be happy with it.

Right. Olivier in the side chat has been saying he's fine with it running on, so we'll probably finish up around 25 past the hour, so a couple more minutes if there are a few more questions. It's been a fantastic conversation, and I greatly appreciate you coming along as well, Jim; I imagine there will be follow-up questions from the people working with the T265 really closely. Are there any more questions now for Rob's team or Jim before we move on?

Yes, of course it's possible, as I said before, but it's unlikely to happen. The chip we have has support for it, and I'm sure we can get the data out of it pretty easily, but getting it all wired up with a different BOM and so on is not going to happen unless you tell me you're going to sell some large number of them. Already this product doesn't sell enough for anyone to be happy; I encourage you all to make an actual product and sell lots of these to keep this project going.
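The T265 plus D435 combination discussed a little earlier reduces, geometrically, to chaining two rigid transforms: the fixed mount extrinsic from the depth camera to the T265, and the T265 pose in the world. Here is a minimal numpy sketch; the transform values are placeholders, not the real mount calibration.

    # Sketch of combining the two devices geometrically: a point measured in
    # the D435 depth frame is mapped into the world frame by chaining the
    # fixed camera-to-camera extrinsic (known from the printed mount) with
    # the T265 pose. The numbers below are placeholders.
    import numpy as np

    def make_T(R, t):
        """Build a 4x4 homogeneous transform from a 3x3 rotation and a translation."""
        T = np.eye(4)
        T[:3, :3] = R
        T[:3, 3] = t
        return T

    # Fixed extrinsic: D435 depth frame expressed in the T265 frame (placeholder).
    T_t265_d435 = make_T(np.eye(3), np.array([0.0, 0.03, 0.0]))

    # Current T265 pose: T265 frame expressed in the world frame (from the VO output).
    T_world_t265 = make_T(np.eye(3), np.array([1.2, 0.0, 0.5]))

    def depth_point_to_world(p_d435):
        """Map a 3D point from the D435 depth frame into the world frame."""
        p_h = np.append(p_d435, 1.0)                 # homogeneous coordinates
        return (T_world_t265 @ T_t265_d435 @ p_h)[:3]

    print(depth_point_to_world(np.array([0.0, 0.0, 2.0])))  # a point 2 m in front of the depth camera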
Now the tricky question. Yeah, so we have this issue where the power to the Movidius comes up at a different time compared to the USB, so there's a power-sequencing issue between the USB and the Movidius. Again, this was meant to be used in something else; it wasn't meant to be released into the world the way it was in the end. We thought we could work around it, but now we see that everyone is running into it, so pretty much you have to cycle the power on your USB port to deal with the issue when the bootloader detects the wrong USB version. Essentially, if your USB power comes on after the Movidius comes up, the bootloader on the Movidius chip thinks we're on USB 1.0 instead of 2.0 and we can never talk to it. It's a sad state, but on a lot of devices, like the Raspberry Pi, you can actually control the power on individual USB ports, and if you cycle that, turn it off and on, you recover. uhubctl is an open source program you can use to do this. It was on my list to integrate that into librealsense so it would just do it whenever it detected the problem, but it's not going to happen.

Yeah, it works, it totally works. The suggestion to use uhubctl works on the Raspberry Pi, so we've got an image now that does it automatically, so the user doesn't have to worry about unplugging and plugging it back in.

I wish that weren't the case, but you know, sometimes there are bugs.

All right, well, this has been absolutely fantastic. Thank you so much, Rob and Jim and Peter and Ryan; it's been a brilliant presentation and a lot of very interesting discussion. So I think a round of applause for our speakers from this morning. Thank you very much.
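For the USB power-sequencing workaround discussed above, here is a short sketch of cycling a USB port with uhubctl from Python on a Raspberry Pi. The hub location and port number are placeholders that depend on your board, and uhubctl must already be installed.

    # Sketch of the uhubctl workaround: power-cycle the USB port the T265 sits
    # on so its bootloader re-enumerates correctly. The hub location "1-1" and
    # port "2" are placeholders; run `uhubctl` with no arguments to find the
    # right values for your Raspberry Pi.
    import subprocess
    import time

    HUB = "1-1"   # placeholder hub location
    PORT = "2"    # placeholder port number

    def power_cycle_t265():
        subprocess.run(["uhubctl", "-l", HUB, "-p", PORT, "-a", "off"], check=True)
        time.sleep(2)  # give the camera time to fully power down
        subprocess.run(["uhubctl", "-l", HUB, "-p", PORT, "-a", "on"], check=True)

    if __name__ == "__main__":
        power_cycle_t265()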
Info
Channel: ArduPilot
Views: 2,800
Id: nO_y6BRBBOg
Length: 84min 58sec (5098 seconds)
Published: Thu Apr 02 2020