Learning flow maps with neural networks

Captions
We're returning now to the concept of flow maps. This is the second lecture of three on time stepping and on using neural networks to help us build time-stepping algorithms. What I showed you last time is that flow maps are nothing new: when we have a differential equation and we build something like a Runge-Kutta stepping scheme, that is essentially a flow map from time t to t + Δt. If I know the governing equations and Δt is small, I have an accurate representation of the flow map. What we want to start doing is thinking about how we might use neural networks to learn this flow map from t to t + Δt, where now we can perhaps even unhinge ourselves from the constraint that Δt be small. That's today's lecture, and we're going to introduce a Python Jupyter notebook to walk us through the example.

We're going to learn these neural networks directly from data. The advantage you have in this neural network representation is that you don't need to know the governing equations. Of course, you need a rich set of data to train it; however, if you're measuring a system in which you can observe time dynamics but you do not have governing equations, the neural network allows you to learn directly from the data. What we'll see in this lecture, and especially in the next one, is that we can learn flow maps for different time-scale physics by building these networks in clever ways.

Here's the generic idea of a neural network: it maps an input to an output. Sometimes the output is a classification of the data; sometimes it is something else, like the future state of the data. For instance, the input layer of the network here will be my solution, or my measurement, at time t, and I'm going to map that over a certain number of layers of certain sizes, with certain activation functions, to an output layer which will be the solution at time t + Δt. That's the idea of learning a flow map: go from t to t + Δt, using the full power of the universal approximation property of the neural network to learn this mapping to a future state. The input is the solution at time t; the output of the network is the solution at time t + Δt. Again, Δt won't necessarily have to be small for us, although the training I show here is a very standard example; in the third lecture of the series we'll walk away from Δt being small. Right now I just want to compare something like a Runge-Kutta stepper with learning a flow map of this sort by taking data and learning how to go from t to t + Δt with some neural network architecture.

The system I'm going to demonstrate this on is the Lorenz 63 system. It's a very simple three-dimensional system of differential equations with three parameters, σ, ρ, and β, and as you change those parameters you get some very interesting dynamics, including the chaotic butterfly attractor that is famous for this system. This is a very easy system to put into something like ode45, a fourth-order Runge-Kutta time stepper: you just march it forward into the future, and as long as Δt is small you get a very accurate representation of the future state of the system.
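For reference, the standard textbook form of the Lorenz 63 system (consistent with the right-hand side defined later in the notebook) is:

```latex
\begin{aligned}
\dot{x} &= \sigma\,(y - x),\\
\dot{y} &= x\,(\rho - z) - y,\\
\dot{z} &= x y - \beta z,
\end{aligned}
```

with the classic chaotic-regime parameters σ = 10, ρ = 28, β = 8/3 used below.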
In fact, you can even compute explicitly what the accuracy of a time-stepping algorithm is, because most of these schemes come with an analysis of their accuracy, error, and stability, so we can use one of them to generate data for us. What we're going to try to do is move away from something like Runge-Kutta or Euler and ask: could we just learn the mapping directly from data of this system?

The idea is to take a set of training data. The neural network does not, of course, have access to your governing equation f; what it has access to is training trajectories. Each of these balls is a different initial condition; a hundred of them have been randomly picked, and then I just watch the trajectories. All these differently colored trajectories are in fact those 100 trajectories with different initializations, and of course they collapse down to the butterfly attractor, this strange attractor, that you see here. This is going to be my training data. I run the system from time 0 to time 8 with a Δt of 0.01, so each trajectory has about 800 points. With those trajectories I have a lot of data telling me how the state goes from t to t + Δt for many different initializations: lots of different trajectories, lots of different mappings from t to t + Δt. From that data I'd like to train a neural network whose input is the state at time t, whose output is the state at time t + Δt, and I want that to accurately represent what actually happens in this dynamical system.

Notice that when I use the neural network, it has no access to the actual governing equations. It doesn't know about the Lorenz 63 model. All it sees is discrete time points, where the solution is at those time points, and how the solution goes from t to t + Δt, and it's going to build a model for that based on how I construct the neural network. So that's the idea.

Now I'm going to walk through a Jupyter notebook. The notebook will be linked directly on the website, so you can download it, play with it, and start seeing the kind of structure that's there. So let's go to it.

Here's my Jupyter notebook, and hopefully you can see most of this; it's always hard to get a good color scheme with a black background. You do some standard loading: NumPy as np, the Matplotlib libraries, SciPy, and so forth. What we're also going to bring in is Keras from TensorFlow. The assumption here is that you have TensorFlow installed, with Keras built in; we're going to build all of this out from those models. I'm not going to run this live, because it takes quite a bit of time to train, but I'll show you the results as we go through. Here are the various pieces of the Keras model that are loaded directly into the Jupyter notebook so that we can access them as we run the code.
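A sketch of the imports just described (the exact submodules used in the notebook may differ):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import integrate

# Keras via TensorFlow; assumes TensorFlow is installed
from tensorflow import keras
from tensorflow.keras import layers
```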
Again, this Jupyter notebook and all of these files will be available to you, so you can look at them in detail.

The main thing I want to do is start building the simulator, so let's simulate the Lorenz system; that's what this part is aimed at. We're going to produce our training data: we simulate the system for 100 trajectories, and that's the training data we'll use to learn a neural network mapping from time t to t + Δt. Here's our dt, 0.01, and we run all the way to time 8, so we make a time vector that goes from time zero to time eight in steps of 0.01. That is where we mark time, and what we're looking for is a mapping that takes me from some time t to time t + Δt; that's the flow map we're going to build with a neural network architecture. For the training data, though, let me assume I have the governing equations, which we do (the Lorenz system), and run those. The parameters we're going to use are β = 8/3, σ = 10, ρ = 28; this is the regime where you get the nice butterfly attractor. You can print out the time vector if you want to see how big it is, and then we basically start building the data for training.

There are two arrays I want to bring in here: nn_input and nn_output. nn_input is the neural network input, all the data at time t, and nn_output is that data advanced Δt into the future. So if I have the solution at time t, that's part of nn_input, and the solution at time t + Δt goes into nn_output. The first thing I do is fill these with zeros: you have to pre-allocate, so I pre-allocate arrays whose size is set by the length of the time vector t. Then, when we run the simulations, I make a figure of all the dynamics (you'll see it in a minute), I define the Lorenz equations, I run them a hundred times in a loop, and as I take each entire time series I put the solution at time t into nn_input and the corresponding solution Δt later into nn_output. So I end up with input/output pairs, which is what I need to train. With a hundred trajectories and about 800 points per trajectory, you can see there's a lot of data here for training: something like 80,000 pairs of points go into these matrices (800 per trajectory times 100 realizations).
Okay, so here's the definition of the Lorenz model, solved with a standard ODE solver. This right-hand side prescribes the Lorenz equations; I bring in the initial condition and the values of the parameters σ, β, and ρ, and I run the model. So first I define the right-hand side I'm going to solve, and then, right down here, we actually solve it. First I make some random initial data: I set a random seed (just some large odd number) and make 100 random initial conditions; they're not going to be too crazy, just random values. Those are the balls I showed you, and you'll see them in the plot in a moment. Then, right here, we integrate those ODEs: np.asarray around integrate.odeint, with the Lorenz right-hand side and x0_j. It pulls off an initial condition, runs it, and does this for all the x0's, so for each of those 100 initial conditions we run a trajectory.

Once we've run those 100 trajectories, we have to stack our data, and this is what we do here to build nn_input and nn_output. For every single point in a trajectory, I take that point and put it into nn_input, then take the next point, Δt later, and put it into nn_output. By doing this I've said: here is the solution at time t, and here is its corresponding solution Δt later, for all 100 trajectories and all the points in each trajectory. You can follow this through; it tells you exactly how to stack the data together, and then you're set to go. This gives you your training data.

We can also plot the trajectories, and for the initial condition of each trajectory we put a little ball there to see what it looks like. Again, all this code will be available to you, and here's what you get out of the Jupyter notebook (this part runs very quickly). Every red dot is an initial condition, and there are a hundred trajectories all collapsing down, just like I showed you in the slides. These look like nice continuous trajectories, and I have all of them at every 0.01 units of time. All of these are run from time 0 to time 8 with steps of 0.01, so each trajectory has 800 training data points (not 8,000, as I said earlier), and from these long trajectories you can build the pairs (t, t + Δt). So that's my training data.
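As a rough sketch of this data-generation step (not the notebook verbatim; the seed, initial-condition range, and array names here are illustrative), the code might look like:

```python
import numpy as np
from scipy import integrate

# Lorenz 63 right-hand side: returns [dx/dt, dy/dt, dz/dt]
def lorenz(x, t, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    return [sigma * (x[1] - x[0]),
            x[0] * (rho - x[2]) - x[1],
            x[0] * x[1] - beta * x[2]]

dt = 0.01
T = 8.0
t = np.arange(0.0, T + dt, dt)   # time vector: 0 to 8 in steps of 0.01
n_traj = 100

np.random.seed(20210425)                           # illustrative seed
x0 = -15.0 + 30.0 * np.random.random((n_traj, 3))  # illustrative IC range

# one trajectory per initial condition: X has shape (n_traj, len(t), 3)
X = np.asarray([integrate.odeint(lorenz, x0_j, t) for x0_j in x0])

# stack (state at t, state at t + dt) training pairs across all trajectories
nn_input = np.vstack([X[j, :-1, :] for j in range(n_traj)])   # ~80,000 x 3
nn_output = np.vstack([X[j, 1:, :] for j in range(n_traj)])
```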
Now what we're going to do is see whether we can train a neural network: after it has seen all this data, all of this dynamics, can it simply learn a map from t to t + Δt? This is a nonlinear process; the Lorenz equations themselves are nonlinear, and when you set up the Runge-Kutta method to run this, it explicitly accounts for the form of the function f that prescribes the Lorenz model. The network has no information about that. It simply has data pairs, and we want to see whether, from those data pairs, it can learn a representation of how to walk forward into the future.

That's what we do next: this next part of the code constructs a neural network model, and the beauty of using neural networks, and in particular Keras, is that it's very simple to do these things. I'm going to call this thing deep_approx, my deep approximation of the neural network map from t to t + Δt; in other words, from the state of the system, which is three-dimensional (x, y, z), at time t to (x, y, z) at time t + Δt. It's going to be a Keras Sequential model, some sequential set of layers, and here's how it's constructed. I'm going to put three layers in. The first layer is a dense layer of size 10, so it goes from 3 to 10; the activation functions are sigmoids, and the input dimension, of course, is 3. So you come in with 3 and go up to 10, and that first layer's activations are sigmoids. I just made that choice up to show you the options: you could use sigmoid, ReLU, linear, and it turns out it doesn't really matter much which one you use, at least for this simple example; train long enough and you get very nice results. I just wanted to show you the diversity of ways to specify the activation functions, the number of layers, and how big the layers are. The next layer, added with .add, is also dense with 10 units, and its activations are ReLU functions. So I have a sigmoid layer that takes me to 10, another 10-unit layer going through a ReLU, and then a third dense layer that takes me down to 3 with a linear activation. So you go up, over, and down: 3 to 10, 10 to 10, 10 to 3; sigmoid, ReLU, linear. It's very easy to play around with this, and you can see how simple it is to code up whatever structure you want: want more layers, just add more lines; want 30 nodes in a layer, just put 30 there. It's that easy once you're in the Keras framework.

So this specifies the architecture of my neural network from input to output. My input comes in at size 3, and my output has to come back out at size 3 as well, because I'm looking at the solution at time t, and what comes out should be the solution at time t + Δt. Remember, my inputs are the solutions at time t, and the outputs it's trying to match are the solutions at time t + Δt, so it's going to take these layers and try to learn the best mapping from time t to t + Δt.
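In Keras, that architecture is only a few lines. A sketch consistent with the layer sizes and activations described in the lecture:

```python
from tensorflow import keras
from tensorflow.keras import layers

# flow map network: (x, y, z) at time t  ->  (x, y, z) at time t + dt
deep_approx = keras.Sequential([
    layers.Dense(10, activation='sigmoid', input_dim=3),  # 3 -> 10, sigmoid
    layers.Dense(10, activation='relu'),                  # 10 -> 10, ReLU
    layers.Dense(3, activation='linear'),                 # 10 -> 3, linear
])
```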
How is it going to do that? deep_approx.compile: this specifies how we're going to train. First you specify your loss function, 'mse', the mean squared error; there are different losses you can specify, but this is a very simple one. The optimizer is Adam; there are various others, such as stochastic gradient descent and other optimization algorithms, but we picked these two because they're standard and widely used. So you've specified not only the architecture but also the error metric and the kind of optimization you want to use, and now you can train it: deep_approx.fit. What is the data you're trying to fit? From nn_input to nn_output, and I'm going to let it run for 1,000 epochs. There are some commented-out lines here where you can specify more or fewer epochs, the batch size, and many other training options; I wanted to keep this as simple as possible. So this fits the model using those pairs of inputs and outputs, time t to t + Δt, with the neural network architecture we just built: learn that map, go. Once you execute this, it will sit there and run for the 1,000 epochs I told it to do, and there are the results of that training.

The question is, after it's trained, how do I evaluate how good this neural network model is? Here's what we do: after training, we execute the model. This is what happens right here with deep_approx.predict (remember, deep_approx is what I called my model): it takes in an input, which is a solution at time t, and maps it to an output, which is the solution at t + Δt. This object is the model I learned, the representation of this neural network. Now I can put in the solution at time t and it gives me the solution at time t + Δt, and if I want to create a trajectory, I can start with an initial condition, put it into the model, which walks it Δt into the future, take the output, put it back in to get to 2Δt, put it back in to get to 3Δt, and so forth. You're just iterating on this model to produce a trajectory.

So, again, I make some random initial conditions with a specified random seed, and we run three trajectories; you can run more if you like, but the screen gets very crowded. We're going to do two things. First we take the model we trained, take these three random initial conditions, and run them through the neural network dynamics to see what it does. Here's the process: it goes through a loop producing three trajectories based not on integrating the Lorenz 63 ODE with a Runge-Kutta solver like ode45, but on this flow map that it learned.
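Training and rollout, sketched with the same hypothetical names as above (rollout is an illustrative helper, not the notebook's own function):

```python
import numpy as np

# mean squared error loss, Adam optimizer, 1000 epochs on the stacked pairs
deep_approx.compile(loss='mse', optimizer='adam')
deep_approx.fit(nn_input, nn_output, epochs=1000)

# iterate the learned flow map: feed each prediction back in as the next input
def rollout(model, x0, n_steps):
    traj = np.zeros((n_steps + 1, 3))
    traj[0] = x0
    for k in range(n_steps):
        traj[k + 1] = model.predict(traj[k].reshape(1, 3), verbose=0)[0]
    return traj
```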
We compare that to this part here, where I take those same three initial conditions and run the solution through ode45, and then all the plotting code down here compares them: here is my ode45 prediction, here is my neural network prediction, and the rest just makes the plots nice. And here is the result. This is a very simple architecture, and what you're seeing now are three red balls (one right there, a second, a third), which are three new initial conditions that were not in the training data. This is my test data, and I'm asking: how well does the model I trained work relative to a standard time stepper? That's what you're seeing here. The solid lines are the numerical integration, something like Runge-Kutta, and the dotted lines are the neural network prediction. Notice, for instance, that on this trajectory the two stick right next to each other: they behave very similarly and only eventually fall away from each other. Same with this one; they fall away a bit on this big excursion, but they have very similar long-time dynamics. And the third one is pretty well behaved. So what this shows (and I'll show you a couple more slides in a second) is that I've learned a proxy for the Runge-Kutta stepper where I knew the dynamics, and all I needed were training trajectories.

Why is this important in the context of fluids, which is often what we're looking at in this class? Because in many of the models you might have, the model doesn't accurately capture some of the physics you're seeing: there's a discrepancy between your model and reality, and you're missing physics effects that you might not be able to coarse-grain well or see in the dynamics. This approach is model-free: if I can see enough training trajectories, I will build a model of the map from t to t + Δt, which gives you a predictive engine, so you could take this and forecast the future state. That's the power of these neural nets: they can learn this in a model-free way, assuming, of course, that you have enough training data.

So that was the code, and again the Jupyter notebook will be available to you; it's a really nice playground for kicking around some code, learning how to map trajectories, and learning how to build a flow map for dynamics. And by the way, just to highlight another run: here are two initial conditions, and I want to show you what this looks like in a bit more detail now that I've built this neural network architecture to learn the flow map. Again, two initial points; track the trajectories over time, and you can see the dotted and solid lines stay right next to each other for quite a long time. Now, the Lorenz 63 is really interesting: if you're off by even a little bit, the trajectories eventually diverge. But notice that the attractor structure is preserved for both the neural net and the time stepper, even though they will eventually diverge from each other.
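The side-by-side comparison is then just two trajectories from the same test initial condition. A sketch reusing lorenz, rollout, and deep_approx from the snippets above (x0_test is an illustrative value):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import integrate

t = np.arange(0.0, 8.0 + 0.01, 0.01)
x0_test = np.array([5.0, -5.0, 20.0])   # illustrative test initial condition

true_traj = integrate.odeint(lorenz, x0_test, t)      # reference ODE solve
nn_traj = rollout(deep_approx, x0_test, len(t) - 1)   # iterated flow map

ax = plt.figure().add_subplot(projection='3d')
ax.plot(*true_traj.T, label='numerical integration')
ax.plot(*nn_traj.T, ':', label='neural network flow map')
ax.legend()
plt.show()
```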
In fact, here are the results from two different simulations. What you're looking at is the Lorenz system and the neural network: the true model, taking Runge-Kutta steps with a very accurate solver, is the orange curve, and the neural network is the blue. Notice that in this simulation they stay on top of each other for quite a while until they eventually fall off; this is where they diverge, which is exactly what you expect. There's sensitivity to initial conditions here, so even if you're off by a little bit, you eventually diverge. But what's remarkable is that at the time step of 0.01 we trained for, these two stay much closer to each other than two different Runge-Kutta runs at time step 0.01 would; those would diverge even sooner. That tells you something important about what this neural network architecture allows you to do. In the second simulation, by the way, they're just dead on, at least for that time frame; go further out and they'll eventually diverge as well. This shows that you can learn a flow map for some Δt from data alone, with no idea of what the physics is: the network just takes the training data and learns some nonlinear mapping from t to t + Δt. And you saw how simple my architecture was, just three layers with some generic activation functions; you could make it much more sophisticated to try to improve these results.

So we're going to end there, with this first step in neural network training. Notice one thing, though: the Δt I picked is still relatively small. I've shown you how to build a neural network that maps me from t to t + Δt, but what I want to do now is walk away from the requirement that Δt be small. Can we do that? The next lecture asks: now that we have an architecture for training neural networks to build flow maps, can we back up and train the flow map to take much bigger Δt steps in the dynamics? That's what would be really valuable: to put yourself in a situation where you can build improved simulation engines, because you can take Δt's that are much bigger than the constraint of something like a Runge-Kutta stepper allows. Thank you.
Info
Channel: Nathan Kutz
Views: 1,772
Rating: 5 out of 5
Keywords: time-stepping, flow maps, lorenz, kutz, data-driven science, machine learning, neural networks
Id: WE092-MSGeE
Length: 29min 58sec (1798 seconds)
Published: Sun Apr 25 2021