Machine Learning for Autonomous Vehicle Perception at Cruise

Captions
Good morning, thanks everyone. My name is Sean and I'm from Cruise. Cruise is a self-driving car company. We are a 1,600-plus-person team, and we've raised about seven billion dollars in capital from GM, Honda, SoftBank and a variety of others. Our HQ is in San Francisco, and our mission is to deploy driverless vehicles at scale. Being part of self-driving cars is a really exciting mission: whether it be just making your average commute a little bit more enjoyable, giving mobility back to people who need it, or ultimately saving lives, it's a very rewarding mission to be a part of.

I'm an engineering manager for the perception team at Cruise. I did a PhD in reinforcement learning, I've been doing ML and robotics for about eight or nine years now, and I've been at Cruise for about three and a half years. So I'm excited to take you through what solving machine learning problems at Cruise is like, because I believe that the future of ML really lies within autonomous vehicles.

OK, so let's start with the problem. The problem I want to walk through with everyone today is called prediction. We talk a lot about object detection in the literature, but prediction is equally exciting, if not more challenging. The problem we're trying to solve in prediction is to understand what another agent in the world wants to do in the future: where are they going to go? Because we only have 20 minutes, I'm going to simplify the problem a little bit. We're just going to talk about intent at a high level: where does another agent want to go, what do they want to do? We're not so worried about how they're going to do it, just about what they want to do. So in this example that I have here, say the Cruise car is in red and there's an oncoming car, and we're trying to understand what they want to do. Are they going to go straight? Are they going to turn left? Are they going to turn right? We can structure this as a classification problem: we can come up with a bunch of different actions that
can all be our classes. There are a few different papers that use this kind of problem representation. The input to this task is basically the state of the object that we're trying to predict: its position, its velocity, its type (is it a car, a truck, a bicycle or a pedestrian?), its shape, maybe its orientation, and where it is in the world. What we're trying to do is understand what it is going to do in the future.

So, like any good ML engineers, let's start by looking at the data. We want to build datasets that are large and diverse. Cruise has over a hundred cars; they drive around downtown San Francisco and collect lots of data, and that's very exciting. Here is our car trying to navigate the perils of school drop-off time in San Francisco, which, as you can see, is full of chaos. The exciting thing about driving in an urban environment is that, although it's very challenging, we get exposure to very exciting and very unique events that you don't see much of on a highway or in a very quiet residential neighborhood. So we build very large datasets, but also very diverse datasets.

It's not just about building a good dataset, though. We have to really understand our problem and our data deeply in order to train a good model. Using our example of a car at a four-way stop from before: if we go and look at all the different four-way stops in San Francisco, it turns out there's a huge variety of them. On the far left is a nice four-way stop that is T-shaped and very uniform, and on the right-hand side there's a variety of increasingly wonky intersections. They are still four-way stops, but they're not nearly as square or T-shaped as the others. It's important for us to be able to understand these insights so that we can craft our model well and deal with all these different edge cases, so we invest a lot in the tooling to make this kind of
thing easy to discover.

Some other insights that we might gather are around how much people behave as they are supposed to on the road. This is a different four-way stop, with one car lane in every direction and also a bike lane in every direction. Green represents a car doing a maneuver that we expect it to do, black is someone doing something that we don't expect them to do based on following the laws, and the thickness of the arrow is how frequent it is. As you can see, there are lots and lots of black lines here, of people doing basically every combination of unexpected maneuver. So we aren't able to just assume people follow the law. Especially in downtown San Francisco, people drive how they feel like it.

OK, so the last part of our dataset is about labeling it. Given that we want to phrase this as a classification problem, we have to generate labels. The exciting thing about the prediction problem is that we don't necessarily need a human to generate labeled data for us. If you think about trying to predict whether this car is going to turn left or go straight: if we wait five seconds and see what the car does, we can say, "Oh, it made a left turn." We can watch what they do, and we can use our future perception information to label our previous predictions. What that means is that not only can we build a large and diverse dataset, we can label it all, so we get this huge volume of data to play with, and that makes the prediction problem a really, really exciting machine learning problem.

OK, now that we're excited about the problem, let's think about training the model. At Cruise we've invested a lot in the ML workflow, so that as an engineer all you really need to worry about is the problem you're trying to solve. You don't have to think a lot about the infrastructure, or how to track your experiments, or anything like that. When I was doing my PhD I
spent a lot of time training models on my desktop computer. They took a very long time: days, sometimes weeks. I couldn't use my web browser at the same time, or I'd risk crashing my model. I would save my models with hyperparameters in the file names so that I could remember which model I had trained with what. These are things that we don't have to worry about at Cruise, because we're building out all the tooling so that all of that is easy, and as an engineer all you have to worry about is the actual experiment that you're trying to do.

So let's talk about the experiment that we want to run. Let's learn a simple classifier. Let's pick five classes, the most common five: left turn, right turn, go straight, left lane change and right lane change. And let's learn what a distribution might look like across these different classes.

OK, so once we've trained a model, we then need to go and evaluate it, so let's look at what evaluation looks like. Here's some tooling that we have developed at Cruise. Again, we invest a lot, as an ML-first company, in what this workflow looks like. Up the top you'll see different attributes of the data being split out: the actual class, the distance from the Cruise car, the time of day, and you can imagine any other attribute that you might care about. On the bottom half we have different examples of that data, either successful or unsuccessful.

In the prediction problem we see some very interesting insights from this. We get a very big class imbalance in prediction. Most of the time when people are driving, they go straight: you spend most of the time following your lane. Occasionally you change lanes or turn left or turn right, but as a percentage of your time on the road, most of it is spent going straight. So understanding that data imbalance is really important, and then being able to look at all the different failure modes together
helps you build intuition very quickly about how your model is performing. To take an example, maybe we look at all the different times when we think a car is going to turn left but it actually goes straight. Maybe we look through all those examples and we see that every time this happens, it's at this kind of wonky four-way stop where there's an angle in the road. What we see is that someone actually going straight through that wiggle has to do something like a left turn first in order to follow that path. So it may look a lot like a left turn, but it's actually a straight maneuver. Those kinds of insights are really important to help you develop a model, really solve a problem, and solve all the different edge cases. Something like an F1 score or a precision-recall curve is a nice place to start, but if you want something that truly generalizes across all the different problems we see in self-driving, we have to go much, much deeper than that, and this kind of tooling makes that very easy.

We also have some great 3D simulation environments for testing, so that we can test stuff offline. This is our Cruise car driving up a San Francisco street, with a nice trolley car and so on in the background. When you have a model, it's not just about understanding your model's performance, but also about the impact of that model on the overall system. It's important that the car drives better as a result of your model getting better, so we need to be able to validate as much of this as possible offline before we put it on the road, so that we can be safe in the things we actually put on the road. The simulator is great: we're basically only limited by what we can imagine. Here's an example where we were looking at pigeons in the road and how our car might behave, and you can vary this environment to your
liking: add more pigeons, make them take off faster or slower, change the size of the pigeon, and do whatever you want. It's only limited by your creativity in terms of what you want to test.

OK, so let's have a quick look at what this might look like in practice. This video is going to play a few times, so we'll go through it bit by bit. Up the top is a bird's-eye view of the situation. The Cruise car is on the right, with the red at the back, and we're interested in this oncoming car that's going to make a left turn in front of us. Down the bottom we're showing the probability over time of each of the different prediction classes. The top line, which is green and starts very high, is for them going straight; the yellow line that's just slightly below is for them making a left turn.

You'll notice that about here it's unclear what this person is going to do. They've stopped on the other side of the intersection and they're waiting to see what's going to happen. During that time our model settles on some uncertainty: it's about 55% that they're going to go straight and 35% that they're going to make a left turn, and that's reasonable. Then, as soon as they start to move, we see the probability quickly switch. As soon as they start to get some angle and some turn rate, we say, "OK, very clearly they are making a left turn," and our model can quickly converge to that. You'll also notice a nice tidbit right at the very end: it switches back to straight as they exit the intersection and start driving down that lane again. So here it goes into a high probability of left turn, and then at the end it switches back to straight as they finish that maneuver and are now going straight down the road again. As you can see, we can take this reasonably simple representation and learn a great model from
that.

And this is our car driving in San Francisco on one of these very wonky four-way stops. Because we have such a big fleet, we drive through over a thousand four-way stops every single day, so it's important that our models generalize across all the different things that we might see across San Francisco. Because this is a safety-critical product, we have to deal with that long tail of different things as well.

Quick shout-out to our Webviz tooling, which is what you saw on the last two slides: it's all open source. So if you do any robotics work or anything in ROS, you can use Webviz. It's a great visualizer, you can grab it, and you can contribute to it as well. It's very cool. We like to open-source things when we can.

OK, now let's put all this together and look at what this will look like once we deploy it onto a fleet. What's really exciting about the prediction problem is that, because we don't need a human in the loop to understand whether we've had success or failure, we can build this nice positive feedback loop that drives itself. What that looks like in practice is that we look for failures where our model doesn't predict the right thing. We can mine them, store them, label them, add them back into our training set and train a new model. What that means is that our car drives all day watching lots of other people and seeing how they behave, and the model gets better just by being out in the world and experiencing what everybody else is doing.

That's really important within autonomous vehicles, because we have this long tail of challenging problems that we have to be able to solve. It allows us to sample these rarer events in high volume and pick the data that we actually care about. Most of the time people are driving straight, and we don't care about that; it's not very exciting data for us. We want to find all the really interesting and really rare
events that are on the road, and that's the stuff that we want to get labeled. This feedback loop we call the continuous learning machine at Cruise, and it's really important for us in addressing that long tail.

OK, but we've looked through a bunch of simple examples. Unfortunately the world, and San Francisco in particular, isn't that simple. Here's a nice scene of us approaching a four-way stop where there are a lot more things going on than at a regular four-way stop. There are bikes coming through, another truck trying to squeeze past an unloading truck, some cars coming out from behind other trucks, and even this pedestrian who's going to make sure we're still paying attention. So the real world is actually really, really complicated. There's even a guy driving a forklift, and there's also a person inside this truck, who's going to come out, that we have to worry about. What's really exciting about this is that we aren't just solving really simple problems. This has to work in the real world, it has to scale, and it has to generalize out into all the different things you see in the world, and that's really exciting.

The problem that we went through today, this simplified version where we learned a classifier, is inherently a simplification of the problem. Prediction is actually a multi-agent problem if you think more about it, and we can't predict each actor in isolation. What one car does will impact what another car does, which will impact what a bike does and what another pedestrian might do. So even the very simple problem that we looked at today is only the tip of the iceberg. There's so much more going on in this problem, and every day at Cruise we are doing applied research to try and solve all these different problems. It has to work across all the different things that we see, as a safety-critical product, and that's what's really exciting.

OK, so let's summarize what we've gone through. We looked at the prediction problem, we
simplified it down, and turned that into a classification problem where we wanted to learn across five different classes. We started with our data: we built really large, diverse datasets, and we gleaned some great insights from that data about what our problem was going to look like. Training is pretty straightforward because we invest so much in that ML workflow, and we have great tooling so that you can evaluate your stuff really easily. Then, at the end of the day, we have this beautiful positive feedback loop where your model can just get better and better without you having to do lots of work all the time.

The last key point I want you to think about as you leave is that the future of ML really is happening in autonomous vehicles. This kind of application of machine learning is very different to what you see at most big companies. It's a safety-critical product. People don't die if you show them a bad ad, but there is a lot on the line here, so the standards for what we're doing in ML here are very, very high and really challenging, but also really exciting. The AV industry as a whole is investing a lot in machine learning. We know that the future of autonomous vehicles is going to have a lot of machine learning in it, and so we are looking to push machine learning through to the next level. So the future of ML really is happening in autonomous vehicles. Thank you very much. [Applause]
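
Editor's illustration: the talk describes labeling past predictions from what the tracked car is later observed to do, then mining the cases where the model was wrong. The Python below is a minimal sketch of that idea, not Cruise's implementation. The names `TrackPoint`, `label_from_future` and `mine_failures`, the 5-second horizon default and the 45-degree turn threshold are all assumptions made for illustration, and only three of the five classes are covered, since telling lane changes apart would need lane geometry that is not modeled here.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TrackPoint:
    t: float        # timestamp in seconds
    heading: float  # heading in radians, counter-clockwise positive


def wrap_angle(angle: float) -> float:
    """Wrap an angle to the range [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def label_from_future(
    track: List[TrackPoint],
    t_now: float,
    horizon_s: float = 5.0,                      # assumed look-ahead window
    turn_threshold: float = math.radians(45.0),  # assumed heading-change cutoff
) -> Optional[str]:
    """Label the maneuver at t_now from the heading change seen afterwards.

    Assumes `track` is sorted by time. Returns None when there is not
    enough future data to decide.
    """
    window = [p for p in track if t_now <= p.t <= t_now + horizon_s]
    if len(window) < 2:
        return None
    delta = wrap_angle(window[-1].heading - window[0].heading)
    if delta > turn_threshold:
        return "left_turn"
    if delta < -turn_threshold:
        return "right_turn"
    return "straight"


def mine_failures(
    predictions: List[Tuple[float, str]],
    track: List[TrackPoint],
) -> List[Tuple[float, str, str]]:
    """Return (time, predicted, observed) for predictions that turned out wrong."""
    failures = []
    for t_pred, predicted in predictions:
        observed = label_from_future(track, t_pred)
        if observed is not None and observed != predicted:
            failures.append((t_pred, predicted, observed))
    return failures


# A synthetic track that rotates 90 degrees counter-clockwise over 5 seconds.
left_turn_track = [
    TrackPoint(t=float(i), heading=math.radians(18.0 * i)) for i in range(6)
]
print(label_from_future(left_turn_track, t_now=0.0))
print(mine_failures([(0.0, "straight")], left_turn_track))
```

In this sketch the mined failures would be the examples fed back into the training set, which is one simple reading of the feedback loop the speaker calls the continuous learning machine.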
Info
Channel: Cruise
Views: 19,263
Id: -UPfyvDJz9I
Length: 17min 9sec (1029 seconds)
Published: Wed Jun 17 2020