Dynamic Mode Decomposition (Theory)

Video Statistics and Information

Video

Captions Word Cloud

Reddit Comments

Captions

Pemex what I'm actually interested in when I think a lot of people in engineering sciences are interested in is some kind of dynamical process it's gonna be the generic way to maybe represent this now many of you have taken a lot of courses in engineering or physics or Emmy what a AAA whatever you took typically what happened in those courses is you took a textbook and they told you this equation if you took quantum mechanics this equation is Schrodinger's equation if you took enm its Maxwell's equations if you took fluid flows its navier's stokes right so we're full of textbooks to tell us different s and for each F there's a whole bunch of stuff we can say right and you can spend your whole career in just one F ok you still can't spend a whole career doing one of these s especially if you're one impales in yourself on the sword of fluid dynamics and turbulence you could just you can do that you can look at the graveyard of people with swords stuck through them and you could be the next okay I'd like to make fun of turbulence people cause it just seems like the hardest problem in the world it's really interesting but it's like wow you could die okay by the way so how is the world changing the world changes in the following way I have measurements of this system at different points in time let's call Y my measurement at some time and I'm going to denote this by K I got measurements by the way notice the way I wrote it I mean it's not clear I can measure X directly I have some measurement or a proxy for X it could be G is just 1 so yfk is executed I can measure the state of the system so part of the last couple lectures on compressive sensing was already kidding toward the point which is I have a small number of measurements of the system can i reconstruct the system right this is actually the more realistic case if you're for instance trying to understand ocean circulation and you want to do measurements on there you're not going to get a full state estimate of entire ocean right you're gonna have maybe some buoys and they're in certain places so you're getting a small snapshot of everything that's going on that makes sense ok that's more realistic if you do neuroscience you put probes in the brain ok these Forks like thing people do like you know maybe hundreds of neurons at a time now how many neurons do you have I mean I might be a little lower than yours now because again I'm past certain age but let's still say that I'm in the billions and I took a hundred recordings and I'm going to tell you how I work I'm much more complicated than that ok so but the point is you have a very limited set of recordings typically on a complex system so that's kind of the situation you're faced with you have a recording of something more than that just because you measured it doesn't mean it's the right variable ok what is the right variable we're going to talk about that a little bit in these lectures but I do want to emphasize just because you measured it oh I measured the temperature of this system or in brains i measured the voltage of what's going on with neurons it doesn't make it the right variable to characterize your system or the best variable you is a variable but there might be better ones okay so how's the world change I told you already we have a bunch of textbooks with this but now we have more complex systems we understand other kind of systems and it's not clear I know what F is so I'm measuring a new system we don't have necessarily governing equations but I got a lot of measurements okay so now I have this and I don't know F but I'd still like to say something about the system everybody'll good with where I'm going with this I don't know what F is that's where I'm are actually really trying to get to is like I'm measuring a system and I don't know it's governing equations this tends to happen especially in multi scale systems where there's all kinds of physical affection yet there's these emergent properties out of the system and I can't necessarily write down an equation for him I don't have governing first principles and so I don't know what F is but you know what I still want to do I want to predict the future I might characterize the system okay good all right you guys seem super excited about this I can totally tell could be this Friday morning you're looking forward to your parties tonight that's what 20-year olds should be doing not getting up at 6:00 to workout at the gym ready for parties tonight I'm supposed to encourage that right or not so we're gonna talk about a couple ways to head towards this the first methods we're going to hit and it's going to be happening here in the first set of lectures is what's called dynamic mode decomposition or D and E D MD I'm Dino okay not quite guys good okay nobody a see do you see pain come on Angus died this year you should at least give him some love don't know manga says either it's so sad come on you know ac/dc Colin come on give it don't don't act you're gonna tell me you don't gonna go home and go to bed again alright they transcend the nineties they are like okay I'm gonna like just can we like set it up so that my theme song for the video is like a little ac/dc back in black no response alright so DMD we're gonna work on that and what it's gonna do is that's the first thing we're gonna do the second thing we're going to do that's going to be next week is what's gonna do we're gonna try to actually discover non-linear dynamics and that's going to be what's called sparse identification of nominee dynamics so cover that next week but here what DMV is gonna try to do it's gonna be very much like pca in a way which is gonna try to find low ranked structures and we're gonna basically try to say well i'm gonna not i'm gonna have a hard time getting an ohm in your dynamics but my goal will be as i take my data maybe i can approximate this with a linear dynamical system what i really want to do with that linear dynamical system is to say I want approximating it with the best possible linear dynamical system when I say best possible that typically means I've done some kind of regression somewhere or an optimization right so I when I say best that means there was something I did best to and that is going to be a least square fit to the data here okay so let's talk about how we might be able to accomplish such a feat and by the way why would I want to do that so let's talk about this system here this is my goal and I'm gonna drop the vector sign so just I put my I push my Bold on my pin so you can all see it right alright so here what I have is a system and typically what we think about is X is some high dimensional space so X you know exists in some big so when you solve a partial differential equation like nave your Stokes or you measure brain data right all the points everywhere I just put it all into a vector so fluid simulations this would be a billion degrees of freedom okay for that's why this kind of can be helpful for you so oftentimes this thing will be just huge so let's draw this so just imagine it's very very big ok so this is the state of the system okay so we're not looking at a two by two or a three by three differential equation we're looking at something that's massive typically alright so what are we to do with this well the nice thing about a linear system like this if you remember back to your undergraduate differential equations there was a very simple way to solve this which is to say hey let's let X be ve to lambda T the whole differential equations course was just say let's just throw in an exponential solution it's all you learned in that class if you actually go back to it you think I learned all these things no you didn't you learned one thing we just repackaged it to try to make you think you like learn all kinds of things but you just learn that and you shove that in there and what do you get well back I can value problem that's it and then you said oh great so I can get I can solve for I can devise and eigen functions out this thing right and let's go ahead and say that I have some eigenfunctions here so my eigen functions let's call them P of J's and they each have an eigen value lambda J so presumably you guys know how to find eigenvalues and eigenfunctions okay and that's all you do as you say well okay so take this to the side I want in this case the determinant to be 0 the determinant is 0 gives me a characteristic equation to determine determine the lambdas H lambda I get an eigen function okay so that was actually the whole point of differential equations as an undergrad and the only thing we could actually solve in that class for the most part was constant coefficient differential equations and then once we found solutions we could force it and maybe get a particular solution to glue on to our homogeneous solution and that's the whole structure of the course ok so I just I summarized in five minutes what you learned in whatever 10 to 15 weeks ok that's it so the nice thing here then is once I have this these represent a basis that I can now expand the solution then in particular remember that V is this here so my solution starts to look like the following essentially it is a some zombie of j v of j e to the lambda j t j equals 1 to whatever I have an exact solution so that's a kind of the awesome thing about linear systems why do we like linear systems do I we always worked so hard to try things make things in linear is because I can write down a solution on general techniques for solving linear systems dynamical systems the minute you make it non-linear there are no general techniques every single non-linear system may have a different strategy it's a little bit like think about solving a linear system of equations if the determinants not zero there's one unique solution as soon as they give you a nonlinear system of equations there could be no solutions in infinite solutions 13:5 there's no general rule every single one is different you see this is a problem because you're actually trying to solve a system say I kind of know have to have some confidence that I didn't miss 15 of the solutions that really maybe matter linear systems you can say everything about them and you have a general solution okay so this is why we want to do this approximation tool in your dynamical system everybody good with that all right make things linear it's just like a good principle in life okay you have a problem and if you can make it when you make it linear it'll solve you help you out a lot okay so this is going to be our dynamic mode decomposition idea the question is how do I actually find such a system from data okay remember part of the objective here is what I have is measurements of my system I don't know what F is so it's not like I can get a from F I don't know what F is all I have is data I'm collecting okay good that's going to be our constraint all right so let's do it so for the moment what we're gonna do is zoom that I actually have access to full state measurements so I can measure X directly okay so here's what I'm gonna do I'm gonna show you how easy this is in the way I'm gonna say I can measure X at any time point so remember X of j is the state of the vector at the J time point okay some high dimensional system typically it's high dimensional doesn't have to be okay all right and so what I'm going to do is I'm going to make a big matrix capital X the columns of capital X are going to be time snapshots of this thing okay so for instance is going to be x1 x2 all the way to X of make sure I get this right M minus 1 so I just sit there and measure the system and it's snapshots okay got my thing measuring yeah how its evolving in time I'm gonna make another matrix X Prime and what this matrix is it's going to look almost identical except all the columns are going to be shifted over so I'm going to start the first column with X 2 X 3 all the way to X of M this is a purely data-driven way to get this system all I gotta do is take measurements I don't know what the dynamics are I'm just gonna record and build a model how is it gonna happen well I have these two matrices which just come from recordings right and we'll talk about how long you have to record for there's there's actually a lot of different strategies around this right so this is what just like your homework there's a lot of like well how do you know how many M you should do I mean it's all a little bit of massaging at the end okay so you have to figure out in this problem specific all right but here two matrices just turn on my sensors got them so what's my objective my objective is to build a linear system a linear dynamical system right I like my goal is to build this and the way I'm gonna do that is to say like well basically if I want to build this linear dynamics I am looking for a matrix that takes me from where I am to the future state so one way to think about this is I want the best matrix that when I multiply X it takes me to X Prime we will ponder this for a moment what this says is if I make if I hit a with X I get X Prime so it's saying is I'm trying to find a matrix that if I hit this matrix I come in here means X 1 goes to X 2 X 2 goes to X 3 so in other words the only difference is X 1 is advancing delta T into the future X 2 is advancing delta T into the future that's essentially what the differential equation tells you is how do you advance into the future and so what this is telling you is here is the mapping I'm going from where I am now to delta T into the future it is the operator that takes me to the future delta t I rigged it with that all right stated like that super easy solution all you got to do it's multiplied okay on the left side right side by the pseudo-inverse so the pseudo-inverse denoted by this so x times its pseudo inverse is identity and so you just get a is equal to X prime X pseudo inverse the pseudo imager is the least square regression algorithm it's called the more Penrose algorithm to computers he remembers just as the least squares to get that it's fast I'm telling you is this matrix here is a least square fit for notice okay real quick the matrix a that best gets me from x1 to x2 is different than the matrix a that gets me from x2 x3 generally in fact if I was trying to fit all these I'd have a bunch of matrices a I'm looking for the one matrix that kind of best does all of them with the smallest least square error that's the matrix okay so that is what you want to compute you take your data snapshots you build to say that is in fact gonna be the basis for saying oh my dynamical systems because now I take that matrix a I'm gonna what I do with it I do this with it once I have that I can say like oh just go find the eigenvalues eigenvectors and you's gonna be called DMD eigenvalues DMD eigenvectors okay now here's the thing there's a problem so let's leave that there that's what's called exact DMD there are other ways to get a DMD approximation but this is the easiest so let me just label this exact DMD and I'll go through the algorithm in a moment okay but it's kind of the philosophy everybody go with that justice regression okay yes okay good question the way I wrote it here they are right all you really need so these could have different variables step size between them but in the exact DMD this to this has to be all dealt eat wherever I took these measurements the FB delta T advance so this could be time 110 1138 like big but then they all have to advance delta T the same delta T okay however we have a generalization of this called optimized DMD which allows you to take any time snapshots between things that's up again MATLAB codes up online your paper is actually coming out just now on optimized DMD which allows you to have general stepping there's also a a robust DMD which allows you to do this and say yeah but what if I have this remember we've talked about we talked about robust fine your statistics what I've have a bunch of outliers that are really throwing things off so there's a way to do this so that you get rid of outliers allow from general stepping so there's also a robust DMD okay that's up on the archive interested okay so so that's the idea so now let's talk about some problems with this idea and then we'll talk about the algorithm and then we'll program what did I say about the vectors X high-dimensional so in general that matrix X is this big tall skinny and then the pseudo-inverse of this thing looks like this so when I take the outer product of that here's what my matrix is absolutely massive so for instance if I had a billion degrees of freedom I Cyrillic this would be a billion by billion matrix that I have to find out your values and eigenvectors for a stupid thing to do okay now why is it stupid there's one fundamental dogma in this class that runs throughout it's kind of a my own Dogma personally it's like all these systems were measuring Xolo rank structure that's why data science exists right now because almost any data set from Amazon databases to fluid flow to neuroscience all of them exhibit low dimensional behavior in these high dimensional systems this says nah it just completely ignores the fact that there's a low rank structure so instead we're gonna say like alright this is a bad idea to just compute that pseudo inverse and eigen values so instead we're gonna define an algorithm here's going to be the algorithm I want to compute this matrix right so first of all let's talk about our data matrix X we we may or may not get to coding so I don't want to rush the coding so we might actually push that up to - Monday I know it's gonna be super hard for you on the weekend to not have seen this you'll be so disappointed if you're restless the entire weekend it's just tough go work out you'll feel better okay but what I'm gonna do with the data matrix is big thank you yes kind of like most ant wait I don't SPD it like that I should be your generic what you gonna do this matrix I don't know that's PD it yeah it's a good idea it's always a good idea it's always the right answer usually with me anyway okay s we did and I know I can because I have guarantees now why am I doing this what I'm gonna do is I'm going to do this decomposition but right away when I do the decomposition I'm gonna think about the fact that low rank structure exists and the date I'm going to look at and so I want to start producing a model and taking advantage of the little rank structure okay so for instance if you are measuring fluid flow and you do have a billion degree of freedom system you don't want to be building building a billion by billion matrix Oh crash your MATLAB but also a lot of times just like when you looked at those Yale faces high-dimensional they all collapse down to like two hundred hundred right you can get most information out of two hundred modes same thing with fluids even turbulence to three hundred modes truncate you can represent almost any fluid flow with it why work with a billion when you can work with two hundred okay that's just that's a philosophy philosophical viewpoint I think you need to make sure to take and a lot of the data you work with go low rank with path when possible okay so that's the first step we're gonna go low rank I've already set set it up to do that okay so one way to think about the I'm gonna make this explicit to put our our our so we are looking at a ranked our truncation now where your trunk a is kind of an interesting question let's ignore the truncation issue for a moment where I would want to truncate but if I had fluid flow there would be some point which I'd look at what I do first remember we always do you go plot that thing we plot the singular values and say where should I truncate oh you make a decision based upon what this looks like okay and then you say okay I'm going to keep our modes that is gonna be the subspace I'm going to work in and build a model in those are directions that are optimal because it's an SBD optimal and all two cents okay we go with that so that's step one not not a terrible surprise given the nature of the course so far there's always an SPD involved somewhere so far right pretty much there's been an SPG lurking around and so we take advantage of it and again there's a reason for that the SVD gives you the optimal low rank embeddings so it always shows up to play a role all right step two of the algorithm let's go back and look at this thing here a is equal to X ax pseudo-inverse that's what I was trying to work with so what I'm gonna show you is that you say well okay I could if I was working with this directly I'd have to compute the pseudo inverse of this and the pseudo inverse of this right is if you take the inverse of this switches the order and then you have to take inverses of you but the universe of you is easy because it's unitary right comes you star the universe of Eve star comes v this just becomes figment to the -1 so it's very easy to compute that inverse and so this thing here becomes okay so this is one way to view what we're gonna do this so this remember so I said we're gonna work with this but for right now if we didn't this is what we'd have to compute we could do it through here but really again part of the objective here is not to work directly with a because a would be massive but to work with a similarity transform of a so I want to pop all of a down to the low rank sub space how do you do that I'm gonna work with a matrix a tilde and a tilde matrix looks like this this is a similarity transform into a new set of variables these new variables are the you variables to low rank embedding space okay this matrix a is n by n this one here would be R by R so this is a billion by a billion this might be 200 by 200 like truncated to non remotes you can decide what you prefer but I prefer this okay and in general for large scale problems so a lot of things to think about in general what's some of the problems we have with data science so let me let me tell you some of them talk about neural recordings you know they could put the stuff in the brain and they can record since some of these sensors at many kilohertz so every single second and they record for minutes or hours so just think about it I'm recording at 20 kilohertz I have 20,000 time points per second and now you give me an hour of data that's our big data problem right all right or you have a billion variables and you record this in time so you can get very quickly out of hand with data the whole reason we do stuff like this is because if you're having really big data there's low rank structure most of your data you've got to collapse it or else you're just not gonna have a big enough computer it does not take much for you to basically run out of the memory on your laptop or even on a machine cluster in your lab or even on the Amazon Cloud are you doing computing in the cloud Amazon Cloud it's kind of expensive you know just do everything on the cloud yeah and then you get your bill we did this we were like training a deep Marilyn ass like yeah this didn't be so bad ten thousand dollars are you serious I should have bought in a machine for the office of being cheaper so we stopped we're like okay let's think very carefully about what we want you why because it's just big computation this is the problem you have with data okay in general most of you notice most the problems that we're gonna be doing are sort of moderate scale they all can fit on your laptop I gave you 1.5 gigs of video but broken up into a bunch of parts so each one you can like pull stuff out then it becomes pretty small and then the SPD is super simple on those okay so this is this whole idea get yourself into a better space okay now notice what I have here then is u star au so eight times u so if I have the au well here's what a looks like U star U is identity without because unitary transform then I get X Prime the Sigma minus one this is the matrix a tilde my reduced rank matrix a that matters for my dynamics so take in my space I've done the SVD found a low dimensional space built the model there okay step three exercise and eat well to see if your panting right know do you and I in value decomposition that's gonna be our next step remember I said hey we had a matrix a you just do an eigen decomposition now we're back in familiar territory for you right presumably you've done I can function I can vectors and eigen values of different things before your now it's now on your home turf outside of everything we've done all you're gonna do is remember that works say well if you how you'd solve this you wrote down I mean values and eigenvectors and so you say like okay same thing here that's the generic form of the eigenvalue problem so you know you get our eigenvalues are eigenvectors eigenvalues dec so the w the columns of w are the eigenvectors and this matrix big lambda are the eigenvalues eggs command in matlab okay final step off the racing board huh alright alright I'll come back over here so by the way notice what I'm doing in some sense I'm defining the DMD as an algorithm and as exactly how it was originally defined people were looking at fluid flow they were trying to do prediction and they were trying and it was very complicated fluid problem and so you're saying but I have recordings of the data can I build a model that will allow me to predict the future state in some kind of efficient way and in fact I said well this talk to start start taking account snapshots of the system's build a linear model that gives you a future time prediction pretty nicely and we'll talk about that in a minute and as the dynamics changes you can just keep changing the matrix a right you can just update the model as you go along the nice thing about this it's kind of like a low commitment thing the model changes you know like no big deal I'll just build a new one I'll just I can just keep updating it as I go along because a lot of systems change on you so if you commit to a model and it goes bad you got like oh I gotta build a new model this is I'll build one for your awesome in fact I could do it on the fly as it goes along okay so that's something pretty nice especially for parametric systems where parameters are wandering on you you need something that can handle that signle step four get yourself back into your high dimensional space so essentially the route back remembers we're in our dimensional space the route back up to the high dimensional space remember you want I can function DMV modes and D in the argument is back in the original coordinate system I've compacted everything down to you but then when I want to plot for you let's say the fluids flow I'm not I'm give you I have to give you back a plot in n-dimensional okay last step of that is to go back through this process it's this here okay so that's it these are now the DMD modes the eigenvalues don't change that's just in the eigen that that's just part of what you have there so at the end of the day then you calculate some eigen values you calculate some eigenvectors the DMD modes and by the way here's what's kind of interesting I'll just point this out to you this is very important actually some important facts one when you come down to this low rank subspace and you build these vectors notice even if they were orthogonal here when you cope I'm projecting back up here DMD modes are not orthogonal there's good and bad in that when you do an SVD the requirement is that all the columns of you are orthogonal which is fine except for there's some problems where there's actually some advantage to not having orthogonal modes okay you may want to say like hey what if I had a problem again how we already talked about this from independent component of analysis right if you had something going on over here and something gone over here right SVD would say like oh I need orthogonal modes so I'll build two modes like that those are the two best modes when in fact you should probably like build one there and there I see a lets you do that so does DMD it's not restricted to orthogonal modes and there's some great advantage to that at times okay just want to throw that out there all right and now you actually have a solution and the solution is the following X is equal to this is your model DMV modes this is a matrix times B vector okay so this is equivalent to equals one two are the different eigenvalues ETA Omega KT okay so this is a matrix form so this is all the eigen modes together there and this is a diagonal matrix exponentiating there with the eigenvalues that you had and B is essentially the loading of how do I get beat I don't know B if I have this I have this these are team D eigenvalues beam promotes you say well okay what about at time zero at time zero I have this each of the zero is one Oh X equal to B because I presume another state here so this is ax equal to B so I could say Oh B is easy to compute I could compute it there are other ways to do this by the way this is actually let's say fancy stuff that we can improve on but for right now let just say if I have that like a backslash right so it's gonna be a backslash I got B put it in there and now I have X of T now what's the big deal about this the big deal about this is because I have a linear model here's the solution to that linear model I say to you I take snapshots tell me where the solutions going tell me where the solution is going to be in three Delta T's from now just put in the time you wanted it what's the prediction in the solution at time 10 put in T equals 10 there's the solution at time there's my prediction of the solution at time 10 that makes sense I have a way to predict the future without actually running a different equations model I just plug in okay that's very powerful this will only give you usually some kind of short time window into the future of a good prediction why let's talk about eigenvalues what if there is an eigenvalue here's the real part of these things my colleagues Omega imaginary part Omega what if there's an eigenvalue right like they're slightly positive what's it gonna do is you go forward into the future exponentially blow up in fact if any of those eigenvalues just have a slightly tiny bit of real part it's going to blow up the system doesn't do that it's just that when you did the regression remember this is just a regression okay it's a it's a least square fit and that least square fit says here's the best model currently and yes you have eigenvalues that are slightly over there and you're gonna blow up that's okay maybe I don't need to know what the solutions gonna look like in infinity I just need to know five units into the future now why is this important it's important for one really important reason in addition to some others but remember what I just built for you from data without knowing anything about the physics I built for you a model that currently this data seems to be obeying that and allows me to do a future state prediction and a lot of systems especially if you're an MEA a everybody has control courses right I can do the same thing with actuation so the biggest deal about control is you have to save my systems going over here I have to actuate it to make it go where I want so you can set this thing up control theory works in linear systems that's where we have all the guarantees soon as it's nonlinear all bets are off but it's linear it works and what I've just built for you is the best linear model right now which means I could potentially control it

Info

Channel: Nathan Kutz

Views: 25,416

Rating: undefined out of 5

Keywords: Kutz, Nathan Kutz, Dynamic Mode Decomposition, DMD, Koopman, linear dynamical systems, dynamical systems

Id: bYfGVQ1Sg98

Channel Id: undefined

Length: 43min 29sec (2609 seconds)

Published: Sat Apr 14 2018