Model Discovery for Dynamical Systems

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
so I'd like to start a series of lectures on model discovery so the idea here is to actually measure a system and try to in back out or infer what the governing equations were that produced that time series data we're going to start off with series of lectures first we're gonna look at thinking about model discovery for differential equation systems then we'll move it to partial differential equations so full spatial temporal systems and then finally we will look at alternatives when in fact we don't measure the full state of the system and we start relying on what's called time delay embeddings to discover this and discover latent variable spaces so the idea is pretty simple and kind of from a high level we just want to think about a differential equation and so that in general I have at some the only differential equation and I'm also going to collect data from that differential equation it could be noisy measurements and so forth so but either way I'm in a time series data for some vector Y as a function of time and the whole idea here is I don't know what F is so what I would like to do is come up with an algorithmic way to figure out can I do something about a discovery process where I can determine or back out the most likely f possible that would come from this from the data itself so of course we can start thinking about this and think about all the equations we typically write down in our classes for modeling different systems and we always prescribe an F so for instance if you do some dynamics of electromagnetic waves you might write down Maxwell's equations or if you're looking at quantum mechanics you write those in the Schrodinger equation so all these scenarios I tell you what F is so that you can run a forward model maybe with some per Merit parameter variations to discover the dynamics of that system okay but here are the ideas I now have maybe a Bank of sensors to measure a system and from those measurements alone I would like to determine the f where this is really practically important as things like by a lot biology where oftentimes we don't necessarily know the governing equations at some macro scale but we have a lot of recordings what we'd like to do is infer some kind of governing equations or the dynamics of that system so we're gonna do here start with a very simple example and the architecture and from there we're going to build forward into discovering differential equations so the math behind this is that we're gonna basically take this and simply turn it into X equal to B that's going to be the goal and then we're gonna solve ax equal to B okay and there are certain things we want to do to regularize the solution to X equal to B to make this work so that's we're going to talk about in this lecture so first of all what we can start to ask questions about is what are the potential forms of f but it could have so for instance if this was a single degree of freedom system why is just a state of the system it's not a it's not so for instance a vector system where you have multiple components of the state or n dimensional system just Y you might say well F could be you know what are the possibilities F might be contained in something you know it could be Y Y squared white cubed so polynomials those are always good guesses we always have a lot of models that have polynomial dependence you don't have things like sine Y cosine Y so forth so these are all candidates sorry let's get this rid of this these are all candidate functions I don't know which ones but could be but this is only limited by my imagination right so I'm going to propose that this is made up of some potential library of functions okay so there's my library I construct it the main thing you're going to assume here is that for any model I measure there's only a couple of these terms that matter consider the pendulum for instance the pendulum y double prime equals sine Y right so there's one right hand side term and of course we break that down to a system one by once and the idea is that the right-hand side is made of a sparse number of terms from this library now this idea of sparsity is really important and it's going to start framing what we do at ax equal to B in other words when I do this what I'm gonna do is I'm a solve ax equal to B I'm gonna regular eyes by promoting sparsity and when I regular eyes by promoting sparsity it's gonna sub select a small number of the terms in my library to construct the dynamical system okay so let's make this more formal and let's see how we can turn dy/dt into X equal to B so first of all I have the library I have time series measurements of the system and so I'm going to start framing what I can do with this so here's what I'm given now let's call it T of n and I have a bunch of time measurements so I'm given a bunch of time measurements of the state of the system okay and so what I could do with this is I can say okay given that then what I'm going to do with it is I can compute dy DT right here I need this right I have I with Y as a function of T I can now approximate dy DT so in fact let's go ahead and do this so you can say compute Y dot now differentiation is an important process here what we're going to do in the example that follows is we'll just do finite differences but really one of the keys to success is of this method is to use much better techniques or differentiation than finite differences so you can use friends in total there's some derivatives and at the end of the lecture or at the end of this thing there's a there's a set of code base that will allow you to download to do a more robust job with this okay so given Y dot that means I have this here okay I have the left side of the equation so I'm gonna call white dot my back to be okay and once I have Y I can also now construct libraries of the dynamics so let's make a matrix theta of my libraries which contained in this are all the possible functions that I have for it this is a single degree of freedom system I could put Y in here Y squared Y cubed sine Y something like this so this is a matrix and each column of that matrix is a unique term that could be part of my right-hand side I could have Y y squared with Q limited by your imagination so you could put in a lot of different things there I could put a thousand library elements in there if I wanted to okay so those are all the the ideas that I'm going to put in the library all the potential right-hand sides I think could be making up F okay polymers are always good choices part of the reason we think about polynomials if you look at a lot of the governing equations we have in dynamical systems that pull on the right-hand sides okay but also sines cosines Exponential's you could put in things that you like in there okay this is going to be my matrix a so each column is a distinct potential right hand side term and each row is a time snapshot so Row one is at time zero wrote or at time N equals 1 Row 2 is a timing equal to time n equal 3 all the way down so generally I've take a lot of measurements ok so generally this is going to be a tall skinny matrix because I'm going to take maybe 10,000 measurements in time and maybe I have a hundred library elements okay so this would be a hundred by 10,000 okay tall skinny matrix the only thing left then is I've a I have B is I want to say that my right hand side then Y dot is equal to theta times some vector C so I compute y dot here it is so it's a vector B right so that's my vector B this matrix here a and then this would be the X so ax equal to B and what C is here C is the weight of each single one of these terms it's unknown I have this I have this so one easy thing to do is just simply oh well see is easy to determine it just the library backslash y dot ax equal to B and I could do a pseudo-inverse okay now remember this is a very large over determined system because it's a tall skinny matrix the other thing I could do here is I know one thing about C out of all these libraries only a few should matter so I may want to solve this X with B problem by promoting sparsity and using lasso for instance I could do a sparse regression with l1 and just solve X it could be with lasso this is the entire structure it's actually pretty simple right you just say I have a dynamical system measuring I put in some potential library of candidate functions and so if you give me the time series measurements I can compute their time derivatives I can compute a library of potential functions ax equal to B it's that simple okay a couple keys to this there'll be some references at the end of the lecture or at the end of the on the website that will direct you to some more sophisticated things you do with differentiation but you do need good differentials good differentiation schemes so again something total varial derivative is a good way to go and you need a sufficiently robust set of library elements and also hopefully lots of lots of measurements in time but if you had that you can do a regression and get this to work okay or I'll show you how it works so that's the basic mathematical architecture model discovery using ax equal to B and it really is that simple I'm gonna show you example here on the van der Pol equation and you'll see how this thing all works out and how you would construct this in practice okay so a particular example we want to consider to just execute this as we need one good example so I'm going to give you the van der Pol oscillator which is a non linear oscillator and in particular I'm just going to give you the time series measurements of the state of the system and your job then is to back out what was the differential equation that created those times responsable of creating those time series measurements okay let me finish erasing here and we can go to MATLAB so erasing is bit challenging in this board alright so the van der Pol oscillator let me write down for you so what it is it's an oscillator with a non linear damping there it is so that's a van der Pol equation and really quickly make spray it to the damping depends upon the amplitude so if the amplitude is above 1 then this is positive this acts like a damping if it's below 1 then this is negative and it acts like a gain so what you find in this situation is if your if your amplitudes large it decreases to a limit cycle if it's low it increases to a limit cycle and it comes on to some stable limit cycle behavior I'll show you this and in fact as you crank up mu it becomes highly nonlinear and as mu is small it looks like a perturbation of a regular linear oscillator now we can always rewrite this as a two variable set which is if you say x1 is equal to Y x2 is equal to y Prime then what you have is x1 prime is equal to x2 x2 prime okay is equal to mu1 minus x1 squared x2 minus x1 okay so that's our system so we're going to measure X 1 and X 2 and time and we're going to see if we can recover this set of equations here ok so that's the goal it's a simple problem we'll run a simulation of it for you to look at what what this thing looks like and once we have that we're going to think about now if I just gave you the data would you be able to recover this ok that's our first intro to this kind of model discovery platform ok so that's the that's going to be the goal let me see maybe I'll leave that up there and I will bring up MATLAB ok so we're gonna start off with some code up that's gonna cover it up so let me just race and in this code what we're gonna do is first build our data set so of course this is gonna be synthetic data but really what we're just trying to show in this lecture is a proof-of-concept can this thing actually work then will noise up our data see how well it works and see how it breaks down okay as with anything if you add enough noise nothing will ever work so that's just a general principle in life right just keep adding noise things break and that's no different here but there's a range in which you can tolerate some noise and it will still work quite nicely for you so first of all let me show you the right-hand side here it is oops it is okay this is the right-hand side function for the Lorenz equation I'm going to show it to you now and then we'll just work on the equation so the right-hand side is X 1 dot was equal to X 2 X 2 dot was equal to mu 1 minus X 1 squared times X 2 minus X 1 so it's exactly what we just broke down there so that is the right-hand side for my OD 4/5 solver so now you've taken a look at it you can see okay there's van der Pol parameterize by Mew which I send in here okay and so now let's go ahead and build Billa code that's going to call on that alright so first we're going to do is we're going to define some time points so I'm going to say DT is going to be 0.1 and time is going to go from 0 steps of DT to 50 so what I'm going to do is tell it what kind of snapshots I want so if time is point 1 who I'm going to get out as a time series with 5000 points okay and those 5,000 points is what I'm going to use as a time series measurements to try to see if I can back back out that van der Pol oscillator my initial condition let's make something up 0.15 and mu which is the parameter telling you how nonlinear it is effectively lets me at one point to have a time I have an initial condition I have this parametrizations equations now let's go solve it so I'm going to run OD for five to generate the data and so I'm just going to give you the time series and see if you can recover things okay so T Y is equal to OD for five recall right hand side dynamics send in our time send in our initial data then here would be setting in our parameters for for the how accurate with the OD for five to be but we're it's gonna leave the default and then mu so that's a no default solver you're gonna go call this call the right hand side step it forward in time give you back the data at every point a1 all the way to time fifty so that's our what we have and let me just show you what this looks like plot ty4 component second component line with to you oh gosh okay okay so let's run this down we run this oh I think I have a missing something let's check what MATLAB is complaining about oh not that one sorry here okay what did it say oh did I sing it here let me just double-check my right-hand side actually I might be in the wrong directory oh I'm in the wrong directory okay so let's make a new right-hand side function so new so this is going to be the right hand side function that was in a different directory function right inside equal to right hand side dine T X dummy mu that's what I sent in and my right hand side is equal to X 2 mu times X sorry mu times 1 minus X 1 squared times X 2 minus X 1 and then close that off and there we go save save okay now I should be able to go back here and run this okay so now we're running this code and we run it out from time zero to fifty and here you go so this model is something we're going to use quite often throughout these examples because it's it's sufficiently simple yet it kind of gives you something interesting in terms of dynamic so here's the Vanderpol oscillator so you can see that it has sort of this exotic time profile right but it's clearly periodic so it's a nonlinear oscillator and so here's the challenge if you did not know this came from Vanderpol and I just gave you this time series data would you be able to recover the Vanderpol okay so this is a model discovery to the regression now I've made the comment this is just X equal to B and that's exactly what it's gonna be when we do the discovery so that is our data that's what we're given can we back out the actual model so let's go ahead and try to do this all right I'm gonna kill that window and here we go so so put some of these down here alright so what we want to do now is we have these two variables x1 is equal to the first column of this thing we have X 2 is the second column of Y and I label them particular just so we have X 1 and X 2 in our mine row or trying to cover 2 X 1 dot equals next to dot equals that's the frame when we have it's two degree of freedom system and we're trying to discover the time with the time series measures from each what the equation was now by the way I'll make a comment that in the third lecture of this series we will address the issue of what if I only have X 1 and there was an X 2 but I don't have measurements of it in other words the latent variable what would I do then and we'll talk about how to handle that then and again this is all on a math 563 website so if you want to get there there's a link on the YouTube video towards that website which has all that information ok so there you go X 1 X 2 let's go discover some equations so that's what I start with data points and now what I need with this data is I need time derivatives right so that was part of the issue as like it's y dot equals some library times C so I need to compute the derivatives of both these time series data so a simple way to do this is say well what's the length of these vectors with same length as T and for J equals to 2 n minus 1 what I'm gonna do is use a center difference formula just I'm just gonna do rise over run simplest thing you can think of for computing a time derivative right which is just slope rise over run I'm gonna do that and what I'm doing is for all the data in that time series except for the first and last point so when use the center difference formula which says it's the point in front - point behind or - delta T rise over run that's going to be my very simple and rough approximation for a derivative ok but on my 10 steps 4.01 so I have a very fine resolve to have no noise so that helps a lot later we'll add noise up here and see what happens ok so what I'm going to do is compute these derivatives so X 1 dot is equal to what you do in fact this is the J - first point notice I'm starting at 2 so I'm starting at the second point so that's why I subtract 1 here so the first point of this subtract 1 is at 1 and then 2 all the way up so I I differentiate all the interior points and what this is is point in front X 1 J plus 1 - point behind X J minus 1 right divided by 2 times DT X 2 dot s JM s 1 equals x 2 J plus 1 minus X J minus 1 over 2 times DT there you go first turbit of reach so now what I have is the time series data ok so I'm ready I have that so that's that's my in ax equal to B I just computed B now I have to compute the matrix a the matrix a is my library of potential right hand side functions ok so let's go make some alright so what I want to do is remember I'm now throwing away the first and last point right because I did that with differentiation so I want to make two new variables X 1 s which is X 1 the second to the in line minus 1 point and X 2 s which is the second 2 and minus 1 so I throw away the end points because I don't have derivatives there I mean it could easily get them but it's not worth the time to write it just just take this internal states of the system there are my time series and let's not make a library now in the library of possible functions a is equal to well what could it be well how about it depends upon X 1 what it depends on X 2 what happens if it depends on X 1 squared or X 1 s times X 2 or X 2 squared okay or what about the cube in fact I'm going to go and do all polynomials to degree 3 so X 2's squared times X 1 yes and then X 1 s squared times X 2's and an X 2's cubed so in fact let me just do it continue here with a dot dot dot so you can see it all know like I said they didn't do a very good job cutting it off how about here I'll do a dot dot dot there okay I should get all the terms so almost all the terms are there okay I mean make it just slightly bigger okay so here's my library and in my library I have pollen oils up to polynomials up to degree three I have depends on x1 or x2 depends on X 1 squared or X 1 times X 2 or X 2 squared or X 1 cubed X 1 X 2 squared X 1 X 1 squared X 2 or X 2 cubed so again this is just illustrative for us right I just said what about all these polynomials up to degree 3 and what you can see here is I have quite a number of terms like some of my 9 terms right so 19 possible terms you can even put a constant in one right so those are possible libraries of 10 terms and what I believe is that only a few matter so the idea is to do a regression now and we can just and by the way there are 5000 points so this matrix has 5000 this way by 10 this way okay it is a very tall skinny matrix and I can easily solve it with the pseudo inverse or the backslash so let's just do the backslash so x1 then my discovery of of x1 is equal to a backslash x1 dot my discovery x2 if you click a backslash x2 dot oh and by the way I have to lay the x1 x2 to make um column vectors so I just did that there there you go I solved it I told you is as simple as x equals B there's the backslash say x equals B as I promised okay now we're going to do with this x1 and x2 is we're actually going to do look at what happened with the solutions to this so right so we're going to do a plot here Bar x1i xi1 and subplot 2 1 2 bar X I - okay so what I would do here I'm going to go through this build my library solve the equations take the derivative build a library backslash simple as that all I'm doing is X equal to B and the question is what comes out of this what comes out of this the bar will tell me which ones are nonzero I'm going to look at what we get okay so let's run this thing so there we go Oh something crashed let's just double-check all right oh there we go I made a mistake here there that's not x1 and that's an x2 okay right again click on it no it still sorry apologies for the errors but you're seeing live coding okay oh yeah so I have to differentiate this and I think right now so let's see here so x1 dot okay after I have to take these and I have to make them oh they already column vectors right I think I think hold on one second here so x1 dot got a column vector I already know it's a row so I have to lay it on side row but x1 okay so x1 and x2 I think up here I have to do no those are column vectors let's see what's it complaining about on oh the one okay all right for the moment you know the I this this is not the right size so I'm just gonna take out the constant you could put a constant in but you have to put it in as a set of ones okay apologies now let's run it next to dat who don't know gosh okay lots of mistakes here we go and there you go alright so here is the result and this is the fairly remarkable thing in fact let me make about this size and I've come here what I'm gonna do is scroll this up so that we can see what the a was here while we look at the picture here okay there we go so here is the result I did the backslash what did it say what x1 dot should be x1 dot is equal to here's the value of the coefficients and they look to be all zero except for this one and what is this one well it's the second one in which is x2 so X 1 dot is equal to x2 it got that one right right it clearly says x1 dot equal to X 2 then we can say well ok let's what about the next one let's what what this next equation say it says X 2 dot is equal to this is X 1 X 2 and then this term over here right is it's the eighth term and the eighth term is x1 squared x2 okay and if you remember that's exactly the Lorentz oscillator the Lorentz lost later let me write it down just so we have it was X 1 dot 6 2 X 2 prime is equal to mu1 minus x1 squared x2 minus x1 okay so that's what we had that was our exact system so the first one gets this the second one let me just fold this up a little bit so it's out of the way so the second term you have a mu you have a x1 squared x2 that's exactly that term here you have a minus you have a Mew x2 which is this one here and then you have a minus x1 which is this one there got it just with a backslash and by the way the suit that the backslash when you have large overdetermined systems automatically produce most smartly to its qrd competent as this so we recovered this not only do recover it let's take it a little bit closer look at this thing in fact we recovered it but there's a little bit of something here that you should see in particular let's zoom in if i zoom in here okay these are not 0 Museum a little bit more see it there they're tiny but they're not 0 okay so oftentimes as you can do is something like this is just threshold amount you say if you're if you're below a certain level you throw it away so one of the ways to do something like this is to do Li square fitting or backslash and throw away small terms you do what the backslash is easy and then you can say oh I'm a threshold so those out because what dominates is this here and this is in fact if you see here that's one it's at a value of one which is exactly what it should be okay now the other terms same thing let's go zoom in if i zoom in down here you'll see none of these are actually zero they're just mostly zero oops there we go see if i zoom in further there see and then are actually zero but you have three dominant terms which are exactly corresponding to these terms here okay so in fact let's look at the coefficients we got so for this one here if I look at what this is if you look over there this is one point two it's right about one point two which is exactly what I picked me to be so as mu X 2 which is 1 point 2 so it nails that term here it's a negative one you can see this negative point that's exactly what the value should be and for the other term here it is and again same thing it's that negative one point two which is exactly what it should be here so it's actually quite remarkable all we did was took the time series data build a library of potential terms which is a took the derivative which is B we said ax equal to B backslash you can also if you want to promote more sparsity you can just do lasso directly on this and that will also zero out most of the terms in fact in that case you can actually make it so that all those terms instead of being really small are exactly zero okay so that's that's model discovery for you I just discover now D by doing time series measurements and doing it and setting this up as a framework of ax equal to B to do this now let's see what happens when we add noise and I'll just do this so what we'll do here is if you look at x1 right the X 1 variable let's go ahead and so x1 itself right is a five thousand one by one so what I'm gonna do is I'm gonna ice to it okay so here it is OOP I'm gonna say that my noise level we're gonna play around with the noise level and see how this impacts things so let's make it one for now and plus Rand and actually put noise in front of it noise times R and N and this is a five thousand one by one right same thing with x2 add noise to that plus noise times R and and 5,000 one day one I don't know how big this noise is let's just run it and see what happens when you run this right okay so first of all you see what happened here looks looks kind of uh looks kind of bad okay in fact I should make this figure two so we can see what what the time series data looks like itself so this will be figure two so we can look at how much noise we actually put on this system from running the simulation actually plot it before I put noise on right so what I really should plot here is let's make figure figure three and we apply T versus X 1 T versus X 1 T versus X 2 line with two okay let's run that okay so figure three that's see that's a lot of noise way too much so let's bring the noise down and you can see if you have a lot of noise pretty tough to pick it out in fact here you can still see that the x1x2 give you good signatures but here look at that x1 really comes on you can still get a good signature here but everybody's being lifted up so when you have a lot of noise it's really hard to spar safai and select okay that's just thing but let's just go ahead and take less noise than this let's bring the noise down to 0.2 here you go so even with the point two here's the data so it looks noisy you know but you can still see at least roughly the oscillation pattern and here your results so if you're gonna threshold and say which terms dominate if you say which terms dominate you could see clearly you got the right term here and you got those three terms there but everybody else is slowly coming up okay so noise will have an impact but you can still see you don't do too poorly okay you can actually still threshold but your thresholds gonna get a little bit bigger right cuz everybody's getting lifted off by the noise so ultimately what I've showed you is a principle technique right for discovering models right so you say I have X dot equals some unknown F if you give me the time series data my job is simply to construct ax equal to B and solve it from there that's how in some sense how simple it is and again if you have pretty clean data you have a good way to differentiate you can get pretty good models okay if you have enough noise a lot of noise or if you have latent variables there's some other techniques to handle this the next lecture is going to handle how does go P des it's going to be very similar structure just a much bigger ax equal to B problem
Info
Channel: Nathan Kutz
Views: 6,455
Rating: undefined out of 5
Keywords: Nathan Kutz, Kutz, model discovery, SINDy, sparse regression
Id: rPWUvigYXVU
Channel Id: undefined
Length: 40min 36sec (2436 seconds)
Published: Fri Apr 27 2018
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.