Francesco Borrelli: "Sample-Based Learning Model Predictive Control"

Captions
I believe this is the last talk here, so I can take as much time as you please. I also think at least twenty of you have already seen this, sorry. You'll see a lot of videos, and the theory I present here is joint work with my students, including Monimoy and Charlott, with our industrial sponsors, and especially thanks to ONR, which has been sponsoring the latest research that I'll show you.

Here's my flow: some notation — maybe I don't need this part, since the previous speakers did an excellent job — so that we all know what problem I am solving, and two simple and useful ideas. I have been presenting this for a year or two now, and it's always the same idea, but as my advisor said, you need at least five years of telling the same story before people really understand it; I don't know if he was referring to himself. For the people who have already heard this, there is a new part on generalization and learning, which is quite interesting, then some applications and conclusions.

Okay, here's what Stephen Boyd would call a disciplined way of doing control design. I want to bring a ball into a cup by swinging and catching the ball, and I can pose this as a finite-time optimal control problem: I look for a set of feedback policies from 0 to T that minimize a cost. I have a ball-in-a-cup model — position and velocity in x and y for the ball, position and velocity in x and y for the cup, if it's a planar model — and some constraints. These are feedback policies; I have state and input constraints (I'm not Superman, I cannot move the cup super fast); and eventually, at time T, the ball has to be in the cup. If you think about it, I actually want to replace the inverted pendulum they teach in classical control with ball-in-a-cup. There's only a spoonful of theory here: I just want to solve a finite-time task subject to a set of constraints and a dynamic model.

You have two options. You read Professor Bertsekas's book and follow what I call the mostly-offline approach: a lot of work goes on offline and you compute a policy, and then in real time you evaluate this policy at your measured state. Dynamic programming is one choice — I should have said the choice — but if your state dimension n is greater than five or so, you run into the curse of dimensionality, and then you need model reduction, model aggregation, and the whole world of approximate dynamic programming. The other option is the MPC community, which says: I am going to compute the policy at my state — the one that I care about — with some sophisticated online solver. Offline is very simple, you just set up your optimization problem; online you run an interior-point method, or mixed-integer optimization if you have integer variables, and you get a solution at that state that solves the finite-time task. Because the horizon is long and because of model uncertainty, you actually want to solve repeatedly, and that's why the only way to make this equivalent to the offline policy is MPC.
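(A minimal statement of the finite-time problem described above, in generic notation — the stage cost h, dynamics f, constraint sets, and terminal set are my labels, not necessarily the slides':)

```latex
\begin{aligned}
\min_{\pi_0,\dots,\pi_{T-1}}\ & \sum_{k=0}^{T-1} h(x_k, u_k) \\
\text{s.t.}\ & x_{k+1} = f(x_k, u_k), \quad u_k = \pi_k(x_k), \\
& x_k \in \mathcal{X}, \quad u_k \in \mathcal{U}, \quad x_T \in \mathcal{X}_f
\end{aligned}
```

Dynamic programming computes the policies π_k offline over the whole state space; MPC solves the same problem online at the measured state.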
If you have ever done DP, this looks like your value iteration as it runs on your computer: a double integrator with constraints; in color you have the policy, and the wider area is the domain where your policy is valid. Assume that you converge to the optimal policy pi-zero. Dynamic programming basically solves the fact that the horizon is long — you just go backwards one step at a time — but now of course you have a dimensionality problem, because you have to grid the state space. MPC solves the space part, but there is a problem with time, because now you start from x and solve a longer-horizon finite-time optimization problem. You don't need to grid: you get a solution at the point you are interested in — the policy at that state. These are numbers for a specific state; this is an MPC solution, an optimization solution to the problem I gave you. Because the horizon is long — I'm measuring the ball-in-a-cup task length against the sampling time — you solve it over a moving horizon. The only reason I'm advertising the book is that since last month you can download the PDF for free; my agreement with Cambridge is that after three years I could post the PDF for free, so Cambridge agreed, and I still have to post it online.

So the capital T becomes N, and MPC applies pi-zero, the first policy, at the current state. There are roughly two hundred papers every year telling you that, because you are short-sighted, at the end of your horizon you need a terminal set and a terminal cost. In ball-in-a-cup, you need to think about bringing the ball to a state from which you can then catch it; that region of the state space is complex, and these objects are complex to compute. The computer science folks say they don't need a constraint set, but that set is actually the domain where the cost function is defined; optimization people understand it's quite different to have the constraint or not, but that's a technical detail we can talk about offline.

Learning model predictive control: the idea is that I want to use observations from iterative tasks to succeed. Very simple. This is my son trying to solve the ball-in-a-cup problem — the cup is quite small, so you need to be good at predicting — and he's very happy when he gets the ball in. If you think about our finite-time optimal control problem, there are really five places for learning. You can use data to update your model, mostly parameter estimation. You can use data to update your uncertainty description, so that your policy actually brings you to your target set — you succeed robustly despite model uncertainty and sensor-output uncertainty. You can use data to work in different environments: these are constraints on the states, a model of your environment. You can use data to get better in performance, updating your cost-to-go. And you can use data to speed up the evaluation of the policy. For me, learning has all these five flavors. Let's start with the simplest one — well, the model is actually the one where you could spend either two seconds or years. [Question from the audience.] Yes, good point — that's the next slide.
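(For reference, the receding-horizon problem being described — horizon N, a terminal cost V, and a terminal set standing in for the tail of the task; again a sketch in my notation, not the exact slide:)

```latex
\begin{aligned}
\min_{u_0,\dots,u_{N-1}}\ & \sum_{k=0}^{N-1} h(x_k, u_k) + V(x_N) \\
\text{s.t.}\ & x_{k+1} = f(x_k, u_k), \quad x_0 = x(t), \\
& x_k \in \mathcal{X}, \quad u_k \in \mathcal{U}, \quad x_N \in \mathcal{X}_N
\end{aligned}
```

Only the first input u_0 is applied, and the problem is re-solved at the next sample; learning MPC will replace V and the terminal set with data-built objects.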
I use state-space models mostly, so the common assumption is that we have a model, and there are a lot of good reasons for this. I work a lot with Euler-Lagrange equations, but everything I say works for a general Markov decision process. The learning aspect here is mostly what a control person would call system identification of the parameters, or characterizing the disturbance. In the presentations I'll show you, we have a policy pi of y: there is a state estimator, the estimation needs to be done robustly, and the only thing I'll say here is that this is a hard problem, especially if y comes from a camera, as you will see next. Characterizing the uncertainty is complicated — this w is complicated — especially when you tune your SLAM, your localization algorithm, indoors and then bring it outside: what is your v, which model are you neglecting? It's complicated.

Let me show you three examples we are working on. The race car — we should give credit here: you have seen a lot of racing from Chris Gerdes at Stanford, and that's where it comes from. The first example: let's try to race autonomously; that's where we all started. It's actually an interesting problem because you have Euler-Lagrange equations: these are the dynamics, and these are the kinematic parts that go from the body frame to the global frame. The things you want to learn are these forces, which are a function of slip angle and slip ratio — I'm teaching this class this semester and there are four lectures on tire forces. What you actually do is start with a nominal model and try to adjust it; the learning part is characterizing these forces.

What we use in the videos you'll see is the following. You have the kinematic equations — you know them well, you just linearize. For the nonlinear part, where the forces are complicated functions, just use local linear regression. What are the tricks that make it work? First, a kernel function that uses local data around your current state — a very good idea: don't use data from everywhere; if you are at a certain speed, use data from around that speed. Second, model-based sparsity: if, say, longitudinal acceleration is a function of lateral speed and yaw rate and nothing else, keep zeros in the linearized matrix for the other states. You don't want to estimate entries where physics tells you there should be nothing.
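(A minimal sketch of the locally weighted regression trick just described — kernel weights around the current operating point, fitting only the physically relevant regressors. Function names and the kernel choice are illustrative assumptions, not the speaker's code:)

```python
import numpy as np

def local_linear_model(X, Y, x_query, bandwidth=1.0):
    """Weighted least squares around x_query: only data collected
    near the current state (e.g. at a similar speed) gets weight.
    X: (n, d) regressors -- include ONLY the states physics says
    matter (model-based sparsity); Y: (n,) target, e.g. a tire
    force residual on top of the nominal kinematic model."""
    dist = np.linalg.norm(X - x_query, axis=1) / bandwidth
    w = np.maximum(1.0 - dist**2, 0.0)   # local kernel, zero far away
    sw = np.sqrt(w)
    # Solve min_theta || diag(sw) (X theta - Y) ||^2
    theta, *_ = np.linalg.lstsq(sw[:, None] * X, sw * Y, rcond=None)
    return theta
```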
The racing control problem is a minimum-time problem. What makes it work in my videos is that I use differential GPS, so you have full access to the state, and it is very precise; that's why you get these super cool high-speed videos. The output map is essentially the identity: there's no noise on the output, you have full access to the state, and there is no process noise. The other interesting thing about this problem is that there is a trivially feasible solution: you just drive slowly along the centerline. That is very specific to this problem.

Ball-in-a-cup is a switched Euler-Lagrange system. If the distance between the ball and the cup is less than the length of the string, the ball is in free fall; if it is equal to L, you have a complementarity constraint and the ball rotates around the cup. It's a switched Euler-Lagrange system — you need integer variables — but that's not the biggest problem. We have two problems with this. The first is that we are estimating the position of the ball with a camera, so here is your output equation, and you basically use iterative tasks to adapt the model of process uncertainty and noise uncertainty. Monimoy made ten beautiful slides about this — sorry Monimoy, I'm not going to present them. If you followed the talk before, it's a similar idea: you model the process uncertainty as a parametric probability distribution on a bounded support, and as you collect more data your support estimate shrinks, so you get a better and better approximation of the support of w by using confidence bounds. The second problem with ball-in-a-cup is that a feasible solution is the issue in the first place: how do I find one? The feasibility task is non-trivial. Forget about MPC — either you solve the whole problem over the horizon T, and then you have to trust your model, or you don't have a feasible solution.

The third example: I'm working with the developer of Bubble Ball — it's a super cool game, and my son is really good at it. You have to bring the ball to the target, and you are given a list of objects that you have to place to solve the problem. This is level zero; there are levels up to 250 if you pay one dollar. There are 30 pieces, wood and metal, and each piece has a different restitution coefficient, so the contacts work differently. For instance, you start placing pieces: you do this, the ball rolls off the other side; then maybe you put a piece there; it goes like this, doesn't succeed, and so on. The control variable is just the initial placement of the pieces, and then you have Euler-Lagrange equations with a lot of switches. So the question is: how are my sons solving this? It's a PhD thesis, right? To solve Bubble Ball you can write down all the equations for all 30 pieces and their contacts and solve a mixed-integer optimization problem, or you can do something smarter. We are working with the developer — the game data will be public — and you are going to get either the images or the traces with the contact events, so you can apply your best learning algorithms. If you want to solve level 20 or 50, the materials change the contacts and the restitution, there are objects that defeat gravity — all these things have to be learned from simulation as you place the objects.
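(Going back to the bounded-support idea above: a toy version of refining a disturbance description from iterative data. The talk uses parametric distributions with confidence bounds; this sketch just estimates a box support from one-step residuals, with an assumed inflation margin standing in for the confidence computation:)

```python
import numpy as np

def disturbance_support(x_next, x_pred, margin=0.0):
    """Box support estimate for the additive disturbance
    w = x_next - f_nominal(x, u), from recorded closed-loop data.
    x_next, x_pred: (n, d) measured vs. nominally predicted states;
    `margin` is a placeholder for the confidence inflation used in
    the real method."""
    w = x_next - x_pred                  # realized one-step residuals
    return w.min(axis=0) - margin, w.max(axis=0) + margin
```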
Now let's go to feasibility. Once I have fixed a model, the problem of using data to obtain a feasible policy I have already roughly discussed. The idea here: I'm trying to drift-park a car autonomously, and as I drift-park I collect data. The interesting thing is that you basically have failures, and you have to use the data from failures to actually succeed. Eventually, as you get a good description of the uncertainty, you are able to solve the task — but you can solve only this task, at this specific time, on this surface. Here's our ball-in-a-cup: you try and it doesn't work; you try twenty times and it doesn't work; at the end you have a good description of the uncertainty and it works. And what happens next is interesting: you're not done here — you want to run twenty more experiments and make sure that your probability of succeeding increases. And I can tell you, for a number of reasons we can discuss in detail, we do not have a controller that succeeds twenty out of twenty.

Performance. You have seen this before. Is there audio here? The best way to tell whether you are reaching the tire limit is actually to listen to the sound of the tires — that's what I teach my students. [The audio doesn't work.] You can hear the tires screeching at this point, and that's where you know you are at the limit of the friction circle — without going over to the other side of it, where you lose the rear of the car and injure the postdoc.

Next, using data for implementation — what do I mean by that? This is the car running MPC, and these are recorded data. Once you have converged, your computation time can go down: you basically have a policy at a given state, and you don't need to predict anymore — you just use a local linear approximation of your optimal policy. You stop predicting and go open loop with the recorded policy, and you see that you get the same performance at 10x the speed. When I show this video, I compare it to soccer players who rehearse and then replay their move: you are replaying a recording without making predictions. In this MPC setting it's quite straightforward: you have u-star versus x, you do a local linear approximation of u-star around your local x, and you get exactly the same performance at 10x the speed.
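(A minimal sketch of the "stop predicting" trick: fit a local affine approximation of the converged MPC law from recorded state-input pairs and just evaluate it. The k-nearest-neighbor selection is my assumption for "local":)

```python
import numpy as np

def replay_policy(x, X_stored, U_stored, k=20):
    """Local affine fit u ~= [x, 1] @ theta on the k recorded
    samples nearest to x -- no model, no prediction, no online
    optimization, hence the ~10x speedup reported in the talk."""
    k = min(k, len(X_stored))
    idx = np.argsort(np.linalg.norm(X_stored - x, axis=1))[:k]
    Xl = np.hstack([X_stored[idx], np.ones((k, 1))])   # affine regressor
    theta, *_ = np.linalg.lstsq(Xl, U_stored[idx], rcond=None)
    return np.append(x, 1.0) @ theta
```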
So, we have been working on this for five years — a lot of papers — and I want to show you just two simple ideas that I think are useful for your students to code up. The first one is about how you get better performance, and it's a super simple idea: if your students are trying value iteration, they should also try this. Here's my state at time zero of iteration j. I run a feasible solution — the centerline — and this is my state evolution at iteration zero; the superscript indexes the iteration. Then I adjust my MPC, and because I've changed the MPC controller, I get the same task with a different state evolution: state at iteration one, time one, and so on. There are many versions of this; I am presenting the nominal version, but in reality, if you fix a policy and there is noise, you are going to get a cloud of points — rather than points you will have sets that propagate forward. This would be the case with two MPC policies and a hundred rollouts per iteration; the two clouds of points look like this. I'll present just the nominal case.

What I'm suggesting is, in learning MPC, to update the cost function and terminal set at iteration j by using the previous iterations, in such a way that you get better. The way to do it is quite straightforward. Here is how I update the terminal set. You have the first trajectory at iteration zero, and you know that on this trajectory you have been successful: you have inputs and an associated trajectory that bring you to the terminal state. At the next iteration you use these points — which are now a discrete set — as the terminal set of your MPC. This set of points is control invariant: there is a policy that brings you along it. In the uncertain version, these will be sets, and there will be a probability associated with reaching the target set. You run MPC at the next iteration, and you can land on any of these points — the MPC decides where to land: you predict N steps ahead, you implement the first input, and you keep repeating. Because you land in the safe set, the new iteration gives you another feasible trajectory — this is a bit like the idea of the safe terminal condition that Melanie presented earlier. Anyway, you collect all these points, and this is now the safe set you are looking at. At iteration j, the safe set is nothing but the collection of all the trajectories that reached the target set. For linear systems, and for local linearizations, you can take the convex hull of these points — always a good idea — and you then transform a mixed-integer constraint into a linear constraint.

That is how you build the terminal set; how do you build the cost function? Another straightforward idea. At a given iteration — the state here is one-dimensional, and on the y-axis you have the cost — you have a controller, you run it, and then, going backwards, you compute the cost-to-go along the realized trajectory: this was the cost of that rollout. You now take what is called a barycentric approximation of the epigraph of the convex hull of all these points in the state-cost space, and this is your cost-to-go for the next MPC. Because you changed your cost-to-go, the policy changes, and therefore your new trajectory will change; you get another set of points — the closed-loop points of the new trajectory — and then you go backwards again and compute the cost-to-go at these points. The barycentric approximation is a convex combination, with lambdas, that minimizes over the Q values we have recorded, given the current state x. As simple as that.

So here is what I am suggesting: your learning MPC is a classical MPC where you now also optimize over the lambdas, and the lambdas enter this Q function. At iteration j, the Q function uses the data from all iterations up to j minus 1, and the safe set is all the data up to there. There are lots of theorems: if you do this and everything is perfect, then you can guarantee safety and so on. What I like about this is that it is clean — it is a very small modification.
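(The two data-built objects in code form: the cost-to-go recorded backwards along a rollout, and the barycentric terminal cost — the cheapest convex combination of stored safe-set states that reproduces a query state. A sketch using scipy's LP solver; the array layouts are my assumptions:)

```python
import numpy as np
from scipy.optimize import linprog

def costs_to_go(stage_costs):
    """J_t = sum_{k>=t} h(x_k, u_k): backward sum of the realized
    stage costs along one successful rollout."""
    return np.cumsum(stage_costs[::-1])[::-1]

def barycentric_Q(x, SS, J):
    """min_l  J @ l   s.t.  SS @ l = x,  sum(l) = 1,  l >= 0.
    SS: (d, n) stored safe-set states as columns; J: (n,) their
    recorded costs-to-go. Returns (value, lambdas); a query x
    outside the convex hull of the safe set returns +inf."""
    n = SS.shape[1]
    A_eq = np.vstack([SS, np.ones((1, n))])
    b_eq = np.append(x, 1.0)
    res = linprog(J, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * n)
    return (res.fun, res.x) if res.success else (np.inf, None)
```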
There are a lot of people trying to connect optimal control and reinforcement learning — there is a new book from Bertsekas that I am reviewing for Control Systems Magazine, so I have to read it, and it is actually very good: a very nice connection between reinforcement learning and optimal control. But here is the point. Remember the dynamic programming picture, and imagine this is now approximate dynamic programming: you do value iteration and you get your policy at every point on the grid. What I am suggesting is the following: if you run the method I told you about, you start from a state, get a set of trajectories, get better, and reach the origin. So you have a policy at all of these points, and the optimal policy only on the yellow one where you have converged. Then you run from another state: first iteration, second iteration; you have points where the policy is valid, and only the converged trajectory is optimal. But now, when you take the convex hull and the barycentric approximation, you have a policy defined on the whole red area — optimal here, suboptimal everywhere else. I have been trying to find a name for this: it is not another approximate dynamic programming, because it is forward; maybe we have rediscovered real-time dynamic programming.

[Question.] Yes — it's the same idea with the barycentric policy: it is a sum of lambda-i times u-i, so it is the same idea, and you can actually prove two interesting things about it. The policy at x minimizes the same cost, subject to the same constraints. The Q values are the ones collected on the rollouts, and they serve as the cost-to-go in the MPC problem. Because of this, you can solve everything in one shot, one MPC: you solve the finite-time lookahead as well as the lambdas in a single optimization problem — picking the best point in the region as well as the policy over the finite horizon. [Question.] Yes, you could represent this function explicitly; that's not a good idea in general. You could have an explicit approximation — just Q of x here — but that would be convex piecewise linear, and that is the usual dilemma: do you precompute the explicit solution, or keep it inside the optimization?

What's interesting is that in the constrained linear case this gives you a lower bound, and therefore you have performance guarantees, meaning you always get better and better at each iteration. In the nonlinear case you cannot claim anything. There are a number of tricks we use for the local linearization: you don't actually take the convex hull of the whole set, but a convex hull in a local linear region, where the region is a function of the gradient and the nonlinearity — we can go into the details offline.
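(The "one shot" problem at iteration j, as I understand it from the description — the finite-time lookahead and the lambdas solved together, with SS^{j-1} stacking the stored safe-set states and J^{j-1} their recorded costs-to-go:)

```latex
\begin{aligned}
\min_{u_0,\dots,u_{N-1},\ \lambda \ge 0}\ & \sum_{k=0}^{N-1} h(x_k, u_k) + \lambda^\top J^{j-1} \\
\text{s.t.}\ & x_{k+1} = f(x_k, u_k), \quad x_k \in \mathcal{X}, \quad u_k \in \mathcal{U}, \\
& x_N = \mathcal{SS}^{j-1}\lambda, \quad \mathbf{1}^\top\lambda = 1
\end{aligned}
```

In the constrained linear case this is a convex program, which is where the iteration-over-iteration improvement guarantee comes from.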
[Question about the yellow point.] Okay, let me go back. The point here is: dynamic programming on one side, even if the model is perfect, is global — you go backwards and grid the state space, and it is a single-step iteration. What I am showing is forward: it is local, and you get something local and suboptimal; however, there is no gridding. Those are the two sides. It is forward, and now: how far do you predict? Even with a perfect model, you can show that a forward one-step prediction is not going to work — you will never improve your cost-to-go. You actually need to predict enough steps for the controllability matrix to gain full rank, so that you have more degrees of freedom to explore. That's the nominal case; the uncertain case is hard. So if your students are doing value iteration, going backwards and gridding the state space, and saying this is crazy — let them try this: go forward from points, approximate the cost-to-go the way I showed you, keep doing this until you get a local controller, then pick another point and do the same. That is forward value iteration — maybe we should call it real-time iterative learning MPC; it is not real time, but it is a simple idea. Very simple: it's ten lines of code, and you always keep feasibility.
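(The "ten lines of code" outer loop, as pseudocode: `init_safe_set`, `run_rollout`, and `mpc_policy` are hypothetical placeholders for your simulator and the LMPC controller built from the sketches above:)

```python
import numpy as np

# Iterative "forward value iteration" driver (pseudocode):
# start from a feasible solution, then grow the safe set and the
# recorded costs-to-go after every closed-loop rollout.
n_iterations = 10
SS, J = init_safe_set()              # e.g. a slow centerline lap
for j in range(n_iterations):
    states, inputs, stage_costs = run_rollout(mpc_policy(SS, J))
    SS = np.hstack([SS, states])     # enlarge safe set, (d, n) columns
    J = np.concatenate([J, costs_to_go(stage_costs)])
```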
Now, generalization. Ben said never to use the word generalization, but let me use it, because I mean something specific here. Every time I show this racing video, people ask a good question: what if you put the car the other way around — does it work? No. What if you race a new track — does it work? No, you have to learn the policy again, which is costly; there is not much you can do, it was trained on that environment. So we went back and asked how to formalize the problem. You have the same task, T1 to TN; the only thing that changes between T1 and TN is the environment, and the environment is parameterized by theta — it could be some image features, some tracking points, some curvature of your road. You have executions of the task for given environment parameters, which you have stored: a set of states and the policy at those states. Here is my definition of generalization: given stored executions for parameters theta-1 to theta-n, and given a new parameter theta-n-plus-1, find an execution of the new task — the same task as before, just with a change in the environment. Think about racing: I know how to race ten racetracks, and I want to race the eleventh.

If you go to a go-kart track — the closest one to you — and take a fifty-dollar class, they will teach you these rules. This is the way you race: you brake at maximum capacity to the braking point; you look at the apex point; you turn the car at the turn-in point, pointed at the apex; for the ideal race line you have to cut in and reach this point; then you progressively release the brake and apply the accelerator, and you open up. It is a set of rules parameterized as a function of the environment variable, which is your curvature. So I've been thinking a lot about how to coordinate this kind of higher-level strategy with MPC — this is Charlott's work; she is writing a nice paper for CDC. The point here is that "generalization" by itself doesn't mean anything, but if you have an environment parameterization, then you can pose an interesting mathematical question, provided you bound the parameters of the environment: if you're racing and nobody wants to kill you, they will not put a wall right after the apex.

The idea is quite simple. You collect data from all your task executions with different parameter sets — you have a lot of racing tracks, and we have learned the way to race each one. Then you train a model of your policy in a reduced-order space; it's a lot like Gaussian processes — I'm not crazy about them, but use any process you want, as long as you can get bounds. Let me go slowly on this. The strategy goes from your predicted environment parameter to a set: your terminal set and a set of constraints. The key observation is that high-level strategies — in racing, and in many of the games people play online — are expressed in terms of terminal sets and constraints on inputs, and I claim that from recorded trajectories you can actually extract these high-level strategies. In a race car it would be: your car is here, your curvature is now bigger than 30 degrees, you should try to reach the apex. What's interesting is that for a minimum-time task, these rules are mostly rules about which constraints become active — accelerate to the maximum, hit the apex. There is a connection here with nice recent work from Bertsimas on "the voice of optimization", where he finds rules by looking at the set of active constraints of a mixed-integer optimization; as we were writing our paper we found this nice article from Bertsimas.

To summarize: you have a strategy, which is a terminal set and constraints in a low-dimensional space, and a strategy manager that converts this into invariant sets. Here is your car, here is your invariant set; your strategy tells you to go to this region, and the strategy manager transforms this region of the x-y space into an invariant set over the seven states of the car. This set is then used as a terminal constraint in your MPC, and you keep running the strategy, because at some point the strategy will say: now you have to point to this other region — so you have a shifting terminal set over the horizon. That is the simple architecture we are using: the high-level strategy and the strategy manager consume the environment parameters, and the MPC runs underneath. High-level strategies are easily explained in terms of terminal sets you want to reach and constraints on inputs, and that is exactly what MPC handles. It becomes as simple as this: theta is your environment parameter — for this type of curvature you need to reach the apex at this point — so theta is the set of parameters over the horizon; you have a vision system, you look at what your curvature is, this curvature tells you the strategy to follow, and the strategy is transformed into invariant sets that are run by the MPC.
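(A hedged sketch of the strategy-learning step: regress from stored environment parameters to the terminal-region parameters that were reached, then query on a new environment. The speaker mentions Gaussian-process-like models with bounds; sklearn's GP is my stand-in, and the feature/target layout is assumed:)

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def fit_strategy(thetas, regions):
    """thetas: (n, p) environment parameters of stored executions
    (e.g. curvature samples over the horizon); regions: (n, r)
    parameters of the terminal region that was reached (e.g. apex
    location and size)."""
    return GaussianProcessRegressor(normalize_y=True).fit(thetas, regions)

def query_strategy(gp, theta_new):
    """Predicted target region plus an uncertainty estimate; the
    strategy manager would turn this into an invariant terminal
    set in the full state space for the MPC."""
    mean, std = gp.predict(theta_new.reshape(1, -1), return_std=True)
    return mean[0], std[0]
```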
This is what the approximate dynamic programming literature would call aggregation: the strategy works in a lower-dimensional space — here just x and y — while the MPC works with x, y, heading, and lateral speed. It would be very, very difficult to find strategies in the high-dimensional space.

I want to spend my last five minutes telling you about my quest for applications. You have seen a lot of racing, so I won't show many more races. What's interesting is this: if you take a graduate vehicle dynamics class, this is how you solve the race problem — you approximate the tire nonlinearities, you fit some experimental data to get this F, then you compute the race line with dynamic programming or a big nonlinear online optimization. Dynamic programming is hard because it is a high-dimensional space; nonlinear optimization is hard because it is a long horizon, and it will not give you the global race line — it gives you a locally optimal solution. I was telling Marco that I have tested it — there is a CasADi version; there are people that compute the complete race line with one single optimization problem, super fast, and I was impressed by the solver — and then you basically do path following. That's the classical way to do it, and I'm saying it's totally fair. I just run this on the car: an indicator-function cost to encode minimum time, the terminal cost and terminal set built the way I told you, and the local linear regression. Think about these two different worlds. I challenge my students: the same code has to work on the little car and on the big car, in the loop on different hardware — basically on any car.

[Question: for the learning, is it a QP?] Yes — this is a convex combination over the points you have recorded, so the lambdas enter linearly and the cost is piecewise linear: it is an LP, or a QP, like the ones we were talking about this morning. But there's something to think about: what's wrong with this? Nothing is wrong — as a matter of fact, the treatment of learning here is pretty clean, and that is one thing I like. But if I am a project manager, I want a team that knows vehicle dynamics and builds a model of my car; I want people able to compute the race line; I want a low-level controller that follows this race line. So this is just something to think about: I'm excited about this, but still.

I'll skip the rest — the big car — you've seen a lot of car racing. Do you really need all these elements? You need the safe set — Melanie asked me this — if you don't use the safe set, you're going to go off the track; this is clear. Do you need to predict to learn? You do: with a one-step prediction you will never improve — you don't explore. Do you need to predict once you have learned? No, because you can use the policy directly: here's the policy — you solve for the lambdas minimizing your cost-to-go, and then apply those same lambdas to the stored inputs. That's how the policy is implemented, simple as this: it is the barycentric approximation of the policy, combining the stored u's with the lambdas you have computed.
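(Closing the loop on "predict once you have learned": the same lambdas that price the terminal cost also blend the stored inputs. This reuses `barycentric_Q` from the earlier sketch, and assumes the columns of U_stored are aligned with the safe-set states:)

```python
import numpy as np

def barycentric_policy(x, SS, U_stored, J):
    """u(x) = sum_i lambda_i * u_i, with the lambdas from the
    terminal-cost LP solved at the current state x."""
    _, lam = barycentric_Q(x, SS, J)   # from the earlier sketch
    if lam is None:
        raise ValueError("x is outside the safe set's convex hull")
    return U_stored @ lam
```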
In my last two minutes: I have been going around trying to see who needs all this learning, and unfortunately I have not been very successful. Besides defense applications, I am not sure how to go from having a prototype to having a product that keeps updating itself; it's a hard sell.

My biggest success is in solar trackers. NEXTracker acquired my company, BrightBox, where I was doing cloud-based computation and optimization. These are big solar fields; each one of these trackers is independently actuated, so you have hundreds or thousands of PIDs that move the trackers to track the sun, all remotely controlled over the cloud. What we did was actually close the loop with a camera. What is the problem here? You've seen this picture: when there are clouds, you don't know where to point the tracker, because the light is more diffuse than direct, and the question is how much diffuse versus direct light there is. You have two options: you can write partial differential equations for the clouds in the sky, or you can put up a camera and do some machine learning to correlate what you see with your diffuse fraction. That's what we did, and the trackers adjust in real time — there are on the order of a million PIDs per gigawatt, and NEXTracker has around 50% of solar power plants around the world. There is a beautiful commercial about the machine learning and AI in it — you cannot hear it here, but anyway: this is the motor, this is a little panel, that's a cell; each one of these trackers is self-powered. The point is, it's a cool, simple application: you are tracking the sun, adjusting as a function of the diffuse fraction that comes from your machine learning algorithm, and a six percent improvement on a twenty-gigawatt fleet is over a gigawatt. These are all the places where we have solar power plants.

The next thing I want to change: every time you buy a car, it comes with this sticker — it even has a name, the Monroney sticker — that tells you your MPG, your fuel consumption. Do you know how this is measured? On a dyno test, for a fixed drive profile; it has nothing to do with the way you drive, and nothing to do with the profile of your city. So I want — I have twenty years to do this — that when you buy a car, you upload your Google trips, and they tell you exactly what that car will consume on your route; not only that, they give you a powertrain controller that works for your route. Today, the powertrain controller in each of these cars works for an average route. We have already done tests. This is Fremont to Berkeley — there is a Facebook office here and NEXTracker here — going back and forth, and this is the speed profile of the highway, usually at the same time of day; you can see some patterns. The point is: this is a hybrid car, which goes from charge-depleting to charge-sustaining as it depletes the battery, and you can actually change the powertrain control to be ten percent more efficient by knowing that you are going on that route — by using a cost-to-go function computed on the speed profile of
the route. You can do the same on arterial roads: we have five cars running right now in Arcadia, close by — five minutes from here... okay, twenty minutes, you're right — talking to the traffic lights and learning the traffic light patterns, and you can get up to twenty percent improvement. Same thing with routing: instead of using your Google route, you can provide your powertrain model, and it will tell you the best route — the most energy-efficient one that gets you there at the same time with less energy. This can all be learned from actual traffic patterns, with improvements on the order of six to ten percent.

To conclude: I showed you the five flavors of learning and some interesting problems I think your students should try out, and I have two open problems. One: we have been trying for five years to race two cars vision-based, and it's very hard. We have videos, but they don't look so cool; there are limitations in the mapping, and from a visual point of view we are not there yet — so I'm blaming the vision, not the control guys. It is very hard to localize yourself and localize the car in front; that is just the learning part, but it is still an open problem. The other: what happens when everybody uses my algorithm for eco-routing — how does the network behave? There is a nice article in an IEEE magazine on the impact of active routing; when everybody starts learning, what happens at scale is actually an interesting problem. Thanks. [Applause]
Info
Channel: Institute for Pure & Applied Mathematics (IPAM)
Views: 3,506
Keywords: ipam, math, mathematics, ucla, Francesco Borrelli, UC Berkeley, sample-based learning, predictive control, LMPC
Id: ZFxmVDBYyDY
Length: 47min 41sec (2861 seconds)
Published: Thu Apr 09 2020