Charbel Farhat - Probabilistic Physics-Based Machine Learning for Digital Twins

Video Statistics and Information

Captions
Our first speaker, our plenary speaker today, is Charbel Farhat from Stanford, and I don't want to eat too much of his time, so I'm just going to hand it over.

Thank you. What I am going to present is really a combination of work that we have done and contributed to, put together now to define a framework for digital twins. So I am going to talk about that, and also about the backdrop of physics-based machine learning.

I live in the Bay Area, where if you haven't heard of machine learning, big data, IoT, cyber-physical systems, or Industry 4.0, it looks like you have been sleeping all the time. Now, hearing about it is one thing; really understanding what it is good for is a different story. I like to see all of this from an engineering perspective, and what it can do for engineering is usher in the era of customized decisions — I'll tell you in a moment what I mean by customized decisions. This brings me to the digital twin, which is not a new concept by today's standards; it goes back roughly a decade. This is the earliest paper I could trace that coined the term "digital twin" — I was told there was earlier work in Europe, but anyway. It is a very interesting paper; it comes from people at AFRL and collaborators, including Ingraffea at Cornell, in 2011. A digital twin is the digital replica of either a physical asset, which is primarily what we would be interested in, or of a process, for example for certification. Its significance is that, in principle, you can make it reproduce the state and the behavior of the system. What this talk is going to be about, at least in the beginning, is how you simulate these things, because there are as many ideas as there are people. The characterization in that paper was a forecast that by 2025 we should be able to solve trillions of degrees of freedom, with nonlinear behavior, fine geometric details, multiscale effects — essentially throw the kitchen sink at the whole thing — and you want all of this to be informed by sensors and to reproduce the state of the asset once you update it from sensor information. No one really said how you would do it, but there was a paragraph at the end saying: well, it is ambitious, but if you achieve even one percent of this, that is already a great achievement. I would agree with that. The perceived enablers were machine learning, software analytics, data-centric systems — all the buzzwords that make lots of people happy these days — but again it was not very clear how this would happen. There is a lot of behavior here that, in my opinion, is similar to what you see with startups: the fear of missing out, FOMO; a lot of that is driving things right now. Real-time processing was clearly identified as something that is needed.

So let's fast forward from 2011 to 2019. What has really changed? In engineering, a lot of the focus seems to be on quantities of interest — I am not saying that is exclusive, but there is a lot of focus on quantities of interest — and machine learning has been upgraded from a perceived enabler to an evangelized enabler. If you want to see how it is evangelized: I work with students, and the minute they solve a least-squares problem it is machine learning, the minute they do a regression it is machine learning. You read these
papers — irritating, to me — where if you are solving a system of equations to determine coefficients, you are told that you are "learning" the coefficients. The word learning has been degraded to a level that is really substandard. But maybe I am just getting older.

The Germans came up with something very interesting, Industry 4.0, which is in a way another form of digital twin. The idea is that with all this data-centric work you can work with data, in real time, in order to customize decisions. That is why I said this is the era of customized decisions: you could decide on maintenance no longer based on how many miles a car has driven or how many months have passed; it is really customized — it depends on your vehicle, on your asset, on your aircraft. It is driven by a digital twin that collects data about the system — think of health monitoring — and based on what is happening, and whatever you can infer from that data (which is another topic), you can decide on scheduled maintenance. This is called predictive maintenance. That is Industry 4.0 in a nutshell.

Here is something we submitted to DARPA eighteen years ago — this slide is eighteen years old. I met last week with the program manager to whom I pitched it at DARPA back then; he is today at Boeing, and he recognized the slide. The idea was there: if you have the physical asset, you can build a model of it — never mind the coarse resolution of the mesh, no one would accept such a coarse resolution today — and then you can flow in the sensor data and update the model, to the point where, in some metric (feel free to define any metric you want, but it had better be relevant), you are reproducing the state of the aircraft. Then, if later on, after that critical time where you started reproducing the state of the aircraft, you hit a point where there is a discrepancy, you are so confident in your model that you don't question the model: you say something went wrong in the physical asset itself. That is the idea, and then you can schedule customized maintenance. You can also use it for crisis management — we wanted to pitch it for prognosis, for predictive maintenance, quality control, warranty optimization; these are all great applications. The hardest part is deciding what exactly you monitor and how you turn it into decisions.

There is also the digital twin for medicine, where again the idea is that in medicine there are tons of papers noting that treatments tend to be universal — there is one treatment, one weight-loss program, one formula that you follow — and you see that it is not effective for every individual. The overall picture is the same, but the details vary, and if you account for that you can achieve better efficacy, in this case in terms of people not regaining the weight years later. You have a digital twin to track and follow the patient, and you get customized medicine.

Now here is current practice. This is from last June, from ANSYS and SAP. You would think: what would ANSYS, which is a physics-based software company — probably the biggest — and SAP, which lives in a completely different world, have in common? So I called and I asked.
Well, you would be surprised: SAP, I was told, has figured it all out; they can do all these things, they can predict this and predict that. I said, hold on a second, I am more than happy to believe all this, but if they can do it all, why are they partnering with ANSYS? So something is not entirely kosher in that story. But I give them kudos for pushing forward with this technology; I don't know exactly why they do it, I have some ideas, but there is a lot of interest. Potential customers tell me: you monitor quantities that you believe are associated with failure modes you are familiar with. That makes perfect sense, but there are many methods other than machine learning, far older, that definitely do this — and I kind of like them, because I have learned something about them recently. You can look at response surfaces, kriging, Gaussian processes — I think you are all familiar with these — and a lot of machine learning algorithms are being proposed everywhere, for anything, for example for minimizing the uncertainty with respect to the choice of points at which you sample the process in order to fit it. When I ask, did you get something that is really accurate — I am talking about engineering — the answer is: well, everybody knows that all models are wrong but some are useful. I have been quoting that for twenty years, and finally I decided to search for who said it: it was George Box, in 1976. So far, nothing new. But what I found very interesting is what he also wrote in that paper, which I have not seen anyone quote — if you have, please raise your hand and tell me. What he also said, a page or two later, is that overelaboration and overparameterization is often the mark of mediocrity. That is pretty much how I feel when I see everybody trying to bypass the physics by overparameterizing something in order to recover what is already known. And if you have not seen that happen, good for you, because I have: I work with engineers who essentially try to rediscover F = ma by doing that kind of work.

The other question is how you would pick these quantities of interest. Let me give you an example. One thing we are all interested in as an aircraft community is whether you are going to have a crack and what is going to happen to that crack. Cracks nucleate at the micro scale, if not lower. At the atomistic scale, what is going to be your quantity of interest? Are you going to try to observe something under a microscope? Most likely you are going to observe something at the macro scale, and then you have to figure out how to infer the micro-scale behavior from observations you can actually make, probably in the lab. My point is that there are tons of issues here. The dominating example seems to be temperature in engines, because there is a technology to measure thousands of temperature points and collect the data; that is a good example, and probably that is the way to do it there. But I am a bit skeptical that you can do everything this way, unless we are really talking about failure modes that everybody has experience with. When you are building a new system for which you don't even know the failure modes, it is going to be very hard.

Now let me give you an example, besides the fact that it has a cool movie. Colleagues from JPL came
to visit me three years ago to talk about this problem. These are the parachutes you open up in order to decelerate the capsule that contains the rover — to bring it down to roughly ten times lower speed in about two minutes — before you fire the thrusters and go through two more stages of landing. And as you can see, they were failing all the time in the tests. The design goes back to the Viking program, 1976; the only thing that has changed is the scaling. The rovers used to be small; now they are the size of a Mini Cooper, and ten years from now they are going to be the size of a minibus, so you need bigger and bigger parachutes. People are doing the tests and the parachutes are failing, so something needs to be done; so they said, maybe we can do computational modeling to try to understand this. This is partly for bragging: we can simulate this today. You can see here CFD with AMR, fluid-structure interaction, the parachute opening, and it was well received. We compared the predicted results with the Curiosity data for the inflation time, it matches pretty well, and everybody was happy. Then they said, okay, let's try to understand why they fail, and I said, great, what is your quantity of interest, what do you want to look at? And they said: how do we know? You built the model, you should tell us what the quantity of interest is. This is probably where machine learning can really help: in data mining, in trying to understand what we should be looking at, rather than just solving the direct problem.

That brings me to machine learning, and I would like to start with a definition, to set the stage for the rest of the talk. What do things look like these days when I read papers on machine learning? It looks like machine learning is everything, and pretty much everything is machine learning: if you are solving Ax = b it is machine learning the minute A is not square; if you are minimizing something it is machine learning; if you are addressing some uncertainties it is machine learning. It is becoming a little bit of everything. I went to Wikipedia and found that there is actually a sane definition of what machine learning is: it uses computational methods to learn information directly from data, without relying on a predetermined equation as a model. That is probably the best definition anyone could come up with, but given the context of what I am going to be talking about, it is a little misleading — not intentionally. I would simply change it: not "without relying on a predetermined equation as a model", but rather "it was developed historically for problems for which there is no predetermined equation". That is what made it so valuable and gave it so much impact; that is why it is called artificial intelligence — because there is no model. If you already know what equilibrium is, what Newton's law is, what the second principle of thermodynamics is, why would you want to reinvent it with a data regression? If you look at it from approximation theory, that is a different story, because it is quite possible that one approach for approximating something is better than another. But even there, for solving PDEs, there is already so much history — worrying about consistency, stability, all
these things. But again, it is possible that neural networks make better shape functions; we may really find that. Historically, though, that is not where this came from.

Now a small dose of criticism. This happened at NIPS in 2017 — I don't know how many of you were at NIPS. Ali Rahimi, who is at Google, gave the Test of Time Award presentation, an award given for work done ten years earlier that has really had impact. He quoted one of my colleagues, Andrew Ng, who had said that AI is the new electricity. But Ali Rahimi had a different opinion: he said that machine learning has become alchemy, because everybody is trying to use it as the tool that is going to solve every problem, without necessarily understanding why and how. Obviously not everybody agreed with what he said — there were many blog responses immediately — but you could see that he was saying it very loudly, in front of many people.

Now let me tell you my experience as chair of the department. We field a course called Decision Making under Uncertainty; it has a lot of elements of machine learning and a lot of elements of Markov processes, and it is fairly mathematics and computer science oriented for an engineering course — for an aerospace engineering course. To give you a reference: until two years ago we were only a graduate department; we typically admit about fifty people a year, and at any point in time we have about 230 graduate students. We fielded this course in 2014 and got 15 students; in 2018 we had 400, which means clearly they cannot all be from the department — even if I counted every generation of students in the department we are only 200-something, but we have 400 students interested in taking it. Great. Then if you go to CS 229, which is called Machine Learning, it has an enrollment higher than 700, in a School of Engineering that has about 3,500 graduate students — and CS 229 is a graduate course. Clearly it is an interesting topic, but there is no doubt there is also the perception that when you put it on your CV in the Bay Area, your CV is going to do a much better marketing job — and probably for good reasons.

The problem I have with this is the following. As is typical of many graduate courses, the faculty like to give a take-home project as a final. The people who teach this course — because they have 700 students and are catering to many departments, particularly engineering — send emails to the faculty, to simple people like me, saying: could you please help us come up with some projects where the students can work on machine learning. One day I took issue with this: what are we teaching them? Are we teaching them that when you have a hammer everything looks like a nail? Why isn't the project to explore whether there is a need for it in your problem, rather than just going and hitting? Maybe that is a bit of an overreaction, but there is certainly some truth to it. I am working with students who, as I speak, fit something and say "I am learning the coefficients". There are people who think they are learning fundamental constitutive equations in solid mechanics because they are fitting a stress-
strain curve, having never heard about contravariant and covariant quantities, never heard about conjugate pairs that preserve energy. Everything is becoming a black box, and that, I think, is a problem — unless I am so conservative and old-minded that I cannot see that this is the future; we'll see.

Here is where I see the limitations and the barriers. Machine learning has had tremendous, remarkable success — I know many of you use Alexa — on benchmark problems in reading comprehension, speech recognition, face recognition. Kudos; it is a phenomenal achievement. But these are problems where — you may not agree with every detail, but look at them as a whole — you have large amounts of data that are essentially free: free of charge to the analysts, who never paid for that data. When you are sick you go to the hospital, the hospital collects data on you, and when the machine learning analyst comes in, that data is available. And then there is the concept of error and accuracy — I am sorry I missed your talk, so I am sure that if you are doing machine learning for PDEs you have to define error and accuracy — but when you are talking about reading comprehension, speech recognition, or face recognition, the concept of error and accuracy is not the same as the one we use in physics-based modeling and simulation. There, we know what the error is and we know what accuracy means. Here you are talking about false positives, about being able to jump very quickly to useful yes-or-no answers — logistic regression, yes or no, with probabilities. They are all very useful, but it is not the same thing. And the concept of prediction is really more qualitative than quantitative: have you recognized the face, yes or no? No one answers "I recognized it to 99.96 percent" as opposed to 99.01 percent. Try to sell a simulation or a computational technology if you don't achieve a certain predetermined level of accuracy with respect to the experiments — and even when you do, they will tell you the experiment comes first. So these are two different worlds. If you come to physical systems — well, there are lots of different physical systems, but for many that I am interested in — only small amounts of data are available. If you are an aerospace engineer and you want to work on hypersonic systems, you will be lucky if in your entire career you can put your hands on a couple of bits of data; there is just not that much. The experiments are very complicated, very difficult and very expensive, and you have to pay for them; every time a funding agency funds computational work, it also has to fund experimental work to provide a basis for verification and validation. And the concept of prediction is very quantitative: it is not about false positives, it is really about what you can reproduce of the experiment.

This brings me to a little test. Suppose we are going to work with a limited amount of data, and suppose I give you a function f that I know at two points, one and two. Can you predict the function at three? If that is the only thing we know, and we are not trying to be overly smart, we draw the straight line. Now, if I tell you that somebody called me to say that, among the infinite number of circles that pass through these two points, there is one
particular circle — this one — that goes through these two points, would you like to reassess your evaluation of f(3)? Typically I get answers of the form: well, maybe this point, or that one; not everyone wants to stick with the linear theory. And the real issue here is not whether it is the point on the straight line, the one on the upper circle, or the one on the lower circle. The real question is: who is going to call me to tell me that there is this particular circle, of great relevance, that I should pay attention to? That is what we go to physics for: if I know something about the physics that governs this function, it will tell me. I have listened to machine-learned models of aircraft takeoff, where the conclusion was that there are so many parameters that the curse of dimensionality makes it very hard to get results in any meaningful real time — and then I see among the model inputs the altitude, the velocity, and the acceleration. But the acceleration is the derivative of the velocity, and the forces satisfy F = ma; why learn what you already know? I am in front of the video, I am not exaggerating anything; I can show you these slides. So that is where we are with machine learning.

Here is an example, and we are getting closer to where I want to go. This is something from 2008, an exercise we revisited years later: predicting the flutter of an aircraft. You have the aerodynamics, which you cannot see here because it would mess up the picture, you have the structural vibration of the wing, and you have the fuel inside that is sloshing, and we are trying to compute the flutter speed index — that is, for which quantity of fuel and at which speed the system becomes unstable — because you have to make sure you stay away from that. We have a hidden truth here, generated with a highly sophisticated computational model, which gives the flutter speed index as a function of the fill level and the Mach number. But we also have a database, which is nothing other than the extraction of some very specific points from what you see here, and it was extracted in such a way that there is no way you can recover this secondary peak, which is not smooth. If you do a cubic regression, you do okay over here, but as you can see you cannot recover the peak, because you have no information about what is there, and you would extrapolate to "stable". But if you do an interpolation on the manifold, amazingly, you get everything you want — and that is linear interpolation, not even cubic. Of course the real question is: what is that manifold? Guess what: that manifold is a model, one where you have used your engineering judgment and your knowledge of the physics to model the whole process, and which you then inform with this database — because we all know that all models are wrong but some are useful — and you use the data to correct it. As a result, you can obtain things that you would otherwise have considered extrapolation; but once you have a model it is no longer extrapolation, the model is generating them: it is a prediction.

This brings us to a vision that is not necessarily driven by standard machine learning in the sense of neural networks, and not even driven by quantities of interest. It is going to bring me to model reduction, because that is what is going to give me speed. My model will never be perfect — there will be a certain discrepancy — so if I
want to use it for a digital twin, I need to account for these uncertainties in a way that is really effective. That is what quantifies the risk, and the data is what closes the loop and makes this whole thing a digital twin, in the sense that I need to keep updating the model until, by whatever metric you decide, it reproduces the data — it reproduces the state — and then I can customize decisions.

Now, what is model reduction? Amazingly, I find that it is not really that different from machine learning; every piece of it looks like machine learning, except that it was never called machine learning, because not everybody who solves an optimization problem calls it learning. Here it is in one slide — I learned from Nathan, in one of your talks, that you can summarize model reduction in one slide, so I have done exactly that: model reduction in one slide. You start with something that is different from elsewhere: you pick a model you believe in, and let's put it in the fridge for the moment. First, formulate a hypothesis: my quantity of interest, my field, my solution — whatever you are looking at — can be modeled with an affine representation in a low-dimensional subspace, the reduced basis. Well, that is pretty much what you do in the first step of machine learning when you say: I am going to model whatever I am interested in with sigmoid functions — or now ReLU functions, because they figured out those run faster. The other thing is that we can be smart and say: I don't necessarily need all these layers of encoders and perceptrons; there have been other approaches — I don't know whether they are better or not — to building that reduced basis, and clearly it comes from training, from something very well known in engineering, namely design of experiments. That is the same training. Then formulate a loss function: my loss function says that whatever approximation I get out of this assumption had better satisfy my model, because I want it to satisfy the physics. So I take the assumption, with the reduced basis obtained from training, substitute it into the residual, and say I want to minimize it. That is the same recipe as machine learning, with one difference: I have a physics model that defines the loss function, instead of a loss function defined on purely data-driven grounds. That gives you a physics-based machine learning approach that is actually governed by equations, because I picked my model at the beginning — and that model can be anything you believe in. What you gain out of this is speed: we are not trying to learn anything the model could not teach us, except that now we have a parametric space mu that we can explore much faster, and therefore we can learn things much faster. Fundamentally it does the same thing the model does, only faster.
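To make that "model reduction in one slide" recipe concrete, here is a minimal sketch, not taken from the talk: a hypothetical parameterized linear model A(mu) u = f plays the role of the physics, a reduced basis is built by POD from a few training solutions (the design of experiments), and the reduced solution is obtained by minimizing the physics residual restricted to that subspace. The 1D diffusion-reaction setup and all names are illustrative assumptions.

```python
# Minimal sketch: model reduction cast as "physics-based machine learning"
# for a hypothetical parameterized linear model A(mu) u = f(mu).
import numpy as np

def full_order_operator(mu, n=200):
    """Assemble a toy parameterized diffusion-reaction operator and load (illustrative only)."""
    h = 1.0 / (n + 1)
    main = (2.0 * mu / h**2 + 1.0) * np.ones(n)
    off = (-mu / h**2) * np.ones(n - 1)
    A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    f = np.ones(n)
    return A, f

# "Training": solve the full model at a few sampled parameters (design of experiments).
train_mus = [0.5, 1.0, 2.0]
snapshots = np.column_stack([np.linalg.solve(*full_order_operator(mu)) for mu in train_mus])

# Hypothesis: the solution lives in a low-dimensional subspace spanned by V (POD via SVD).
V, _, _ = np.linalg.svd(snapshots, full_matrices=False)
V = V[:, :2]                                         # reduced basis of dimension 2

# "Loss function": the residual of the physics model, minimized over reduced coordinates q.
def reduced_solve(mu):
    A, f = full_order_operator(mu)
    q, *_ = np.linalg.lstsq(A @ V, f, rcond=None)    # minimize ||A(mu) V q - f||
    return V @ q

u_rom = reduced_solve(1.3)                           # fast prediction at an unseen parameter
```

The residual norm ||A(mu) V q - f|| plays exactly the role of the loss function described above, with the physics model supplying it instead of a purely data-driven objective.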
We have recently been interested in how to ship information from model reduction to machine learning — to this physics-based machine learning — and also in the other direction, because there is a lot that people have done with neural networks that can travel the other way. One example on the model-reduction side, something fairly old by now, is the concept of local reduced-order bases. The idea is that we have shape functions that are global in a certain sense, but local with respect to the manifold of the solution they are trying to approximate — they are still global in space. Here is a trajectory of the solution of a PDE: the solution is u and t is time, so this is the time trajectory of the state for some parameter configuration; mu is a vector of parameters. I mark each state vector along this trajectory with a symbol; then I change mu and get another trajectory, then a third one; and then I erase the trajectories just to highlight that this is now big data — all these state vectors are sitting there. What do you do with it? Well, if you observe that in this region you have circles, over there triangles, and over here squares, it means that no matter how you change the parameters the solution has a tendency — and this is true data, not manufactured — to come back to certain regions; it has some notion of attractors, as in chaos theory, and it does not visit this other area. That is an ideal problem for classification, so we can run a clustering or classification algorithm — machine learning techniques used as part of the solution process — to determine how many regions are really, fundamentally different. Then you compress the data in each region separately, and instead of one global reduced basis you have a catalog of local reduced bases like this. When you solve the problem online, at every time step you identify which region you are closest to, use the local reduced basis associated with it, and update as you go along.
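As an illustration of that offline/online split — again a sketch under stated assumptions, not the speaker's code — the offline phase can be mimicked by clustering synthetic snapshots and building one POD basis per cluster, and the online phase by a nearest-cluster selection. The random snapshot matrix stands in for real solution trajectories, and k-means is just one possible choice for the classification step.

```python
# Minimal sketch of local reduced-order bases: cluster state snapshots,
# build one small basis per cluster, and pick the closest cluster online.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_dof, n_snap = 500, 300
snapshots = rng.standard_normal((n_dof, n_snap))       # stand-in for solution snapshots u(t; mu)

# Offline: classify the snapshots into regions of the solution manifold.
n_clusters = 3
km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(snapshots.T)

# Offline: compress each cluster separately into a local reduced basis.
local_bases = []
for c in range(n_clusters):
    S_c = snapshots[:, km.labels_ == c]
    U, _, _ = np.linalg.svd(S_c, full_matrices=False)
    local_bases.append(U[:, :5])                       # a few modes per local basis

# Online: given the current state, select the nearest cluster and use its basis.
def pick_local_basis(u_current):
    dists = np.linalg.norm(km.cluster_centers_ - u_current, axis=1)
    return local_bases[int(np.argmin(dists))]

V_local = pick_local_basis(snapshots[:, 0])
```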
Very recently, a very talented graduate student of mine said we should be able to do the same thing with neural networks, and she came up with an architecture that reproduces the same type of thinking: there is a context network that tries to track the solution of the PDE on the solution manifold, so that you do not use the full, fully-connected network everywhere. These are ideas that went from one side to the other, and there are ideas going the other way too, and they work pretty well. The more important thing is that there is meaning attached to them.

Now, an important difference: in standard machine learning, the data is sitting there; it is up to you to decide how to approach it, but the data exists. Here it is a different story: we have to generate the data ourselves, so it is really design of experiments that tells us which data to generate. And there is another important question: at the end of the day this is also an optimization problem, most of the time — if not all the time — non-convex, and therefore I would like to know how much data is needed so that I can trust what I am getting. We come from a culture of PDEs where we have the concepts of mesh refinement and time-step refinement, where we have theorems that give us error bounds and confidence in the results. If machine learning solves a problem for which there is no other approach, great; but when there are other approaches, I would like to have something specific for anything that involves a loss function and a minimization, because if I cannot control the amount of data and understand how much data is needed, some people — probably not all of us, but some — are going to have a hard time trusting it.

One thing that I am not aware of having been used in the context of machine learning, but that is pretty popular in model reduction, and very popular even in simple PCA-type analysis, is the concept of a greedy procedure, where the greedy procedure tells you which data to generate and how much of it. That makes sense in a context where you are generating the data, not in one where somebody gives you the data — these really are two different situations. So suppose this is my parameter space — I take a two-dimensional one simply to illustrate the concept. You start from any initial point, and all you need is an error indicator, not even an error estimator; your error indicator is the residual — that famous residual of the predetermined, equation-based model that you believe in but whose accuracy and speed you want to improve. You take this initial point, build a reduced-order model there, and then you generate candidate points inside the parameter space — random points, or any sampling design you like — and you use the error indicator to see at which point you are doing the worst, not the best, because where you are doing the worst is where you don't have much information. You take that worst point, generate more training data there, and so on until you converge to a designed accuracy, which means you have picked the data of interest. At the end there is no need to come in afterwards with something like an active subspace to tighten things: the data you generated is, by construction, the data needed for the reduction.
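A minimal sketch of such a greedy procedure, under assumed, illustrative ingredients: a hypothetical two-parameter linear model stands in for the physics, the residual norm of the full equations evaluated at the ROM prediction serves as the error indicator, and random candidates play the role of the sampling design. Nothing here is the speaker's actual workflow; it only shows the loop structure — build a basis, find the worst candidate, enrich the training set.

```python
# Minimal sketch of greedy sampling in parameter space driven by a residual-based error indicator.
import numpy as np

rng = np.random.default_rng(1)
n = 100

def operator(mu):
    """Hypothetical parameterized model A(mu) u = f; stands in for the true physics."""
    A = np.diag(2.0 + mu[0] + np.arange(n) * 1e-2) + np.diag(-0.5 * mu[1] * np.ones(n - 1), 1)
    f = np.sin(np.linspace(0, np.pi, n))
    return A, f

def rom_residual(mu, V):
    """Error indicator: residual of the full equations at the ROM solution."""
    A, f = operator(mu)
    q, *_ = np.linalg.lstsq(A @ V, f, rcond=None)
    return np.linalg.norm(A @ (V @ q) - f)

# Start from one initial parameter point and grow the training set greedily.
mus = [np.array([0.5, 0.5])]
snapshots = [np.linalg.solve(*operator(mus[0]))]
tol = 1e-6
for _ in range(10):
    V, _, _ = np.linalg.svd(np.column_stack(snapshots), full_matrices=False)
    candidates = rng.uniform(0.0, 1.0, size=(50, 2))      # candidate points in the parameter box
    errs = np.array([rom_residual(mu, V) for mu in candidates])
    if errs.max() < tol:                                   # converged to the designed accuracy
        break
    worst = candidates[np.argmax(errs)]                    # sample where the ROM does worst
    mus.append(worst)
    snapshots.append(np.linalg.solve(*operator(worst)))
```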
Now, some results. These were generated by Kyle, and it is a very realistic simulation of a Boeing configuration, which is nice: a complex flow, a CFD model with turbulence modeling and even wall functions, and a small parameter space of dimension four only. We are interested in quantities of interest of the kind: somebody comes and asks, what would happen if I made this part a little bit longer? This is like being inside an optimization loop, and you would like to be able to answer quickly — your performance changes by this much, are you really sure you want to do it? Kyle trained this, and then you want to make sure that after the training, when you go online, everything runs fast. That is where the concept of hyper-reduction comes in, and you don't need much background to see what is happening. When you compute the integral from 0 to 1 of f(x) dx, what is the best technique you know for approximating it? A quadrature rule. And if you apply a quadrature rule, it means that you really don't need to know f over the entire interval [0, 1]; you only need to know it at a few points. The question then becomes: if I have a CFD mesh with all these grid points around the asset, can I really compute what I am interested in without having to use every one of those grid points? You can do this when you are working with reduced quantities, because they are of much lower dimension than the high-dimensional quantities. That is what hyper-reduction is: if you are trying to compute the projection of the full model onto the reduced basis, you can do it by a kind of quadrature rule, and you get the coefficients of that quadrature rule and the subset of the elements you see here essentially by doing machine learning — by building a loss function, minimizing it, and training over a large set of configurations, in order to determine the subset of the mesh on which you compute the flow and the weights with which you reconstruct the reduced quantities from that subset alone. And that was before the days when people were talking about physics-based machine learning — same concept. So this is a minimization problem with the same ingredients, and you get the same kind of thing.
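Here is a minimal sketch of that idea as a training problem, with synthetic numbers standing in for the real per-element contributions: a nonnegative least-squares fit over stacked training targets tends to return a sparse weight vector, so only a small subset of "elements" carries nonzero quadrature weight. This is one possible formulation consistent with the quadrature analogy above, not the specific algorithm used in the talk.

```python
# Minimal sketch of hyper-reduction as a minimization/training problem:
# choose a small subset of elements and nonnegative weights so that weighted sums
# reproduce the full reduced quantities seen during training. The per-element
# contributions G below are synthetic stand-ins.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(2)
n_elements, n_reduced, n_train = 400, 8, 20

# Training data: for each training snapshot, the contribution of every element
# to the reduced (projected) quantity, stacked into one linear system.
G = rng.standard_normal((n_reduced * n_train, n_elements))
b = G @ np.ones(n_elements)                 # target: the exact sums over all elements

# "Loss minimization": nonnegative least squares typically returns a sparse weight
# vector, i.e. only a few elements end up with nonzero quadrature weight.
weights, _ = nnls(G, b)
selected = np.flatnonzero(weights > 1e-10)
print(f"kept {selected.size} of {n_elements} elements")

# Online, reduced quantities are reconstructed from the selected subset only.
approx = G[:, selected] @ weights[selected]
```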
Here is an example of how predictive this can be. The reduced-order model was built in about two hours on 24,000 cores, and it now runs on a laptop in less than three minutes at a sample point, and this is the kind of accuracy you get: the physics is respected, which is what you would expect. I have to say that I have a good friend, for whom I have a lot of respect, George Karniadakis, who has been showing — like many other people — how you can use neural networks as shape functions to solve PDEs, and at the end of one paper there is a paragraph explaining how you make them physics-based: you put them into the residual, you minimize, and you get the coefficients. So I sent him an email saying, great, that is called model reduction; we are essentially rediscovering the same thing. There are similar results in solid mechanics, even with contact; you get speedups by large factors, but it took a decade to get this to work: going from a high-dimensional model of two million degrees of freedom to a reduced model of dimension 31 that reproduces essentially the same accuracy, for a difficult problem.

And then comes the problem of model-form uncertainty, which really brings us to the crux: now we know how to compute fast, but it is not going to be very predictive, because there are uncertainties at every step. There are all kinds of uncertainties, but I would like to focus on some of them. If you have an aircraft like this, there are still — and please check if you think I am exaggerating — 99 percent of papers that look at the parameters of a model like this, claim that the uncertainties are in those parameters, randomize the parameters, and propagate them. Well, if you have ever been in charge of building a finite element model of something this complex, you realize that before you even worry about the parameters — whether you know the Young's modulus of the aluminum correctly, whether you know the thicknesses — there are so many decisions you have to make about what to model and what to include: what the boundary conditions are, what the transmission conditions between the various components are, whether a free play is important or not, whether you are going to get into the elastoplastic regime or remain elastic. There are so many modeling decisions that, at the end of the day, those are the uncertainties you really have to address. In that case I am going to show you, with a simple picture, why I don't believe that randomizing the parameters — which is the dominant approach — really helps much. Suppose I am solving my governing equations, I have my parameters, and I even carefully split them into some that are certain and some that are uncertain, and I have my experimental data sitting here. Let me keep varying this parameter mu; as I vary it, I generate this subspace of solutions, and you can see that it is not going to reach the experimental data. If I now make the parameter random, it is not going to reach it either: it rediscovers essentially the same subspace, with only one extra piece of information, namely the probability of being at this point rather than that one; it gives you nothing else. What we came up with a couple of years ago is an approach where we randomize the subspace in which we compute the solution, and we do it carefully, for two reasons. One is that I will never violate the physics — there is no policeman or policewoman I know of who can come and tell me why this is an admissible shape function and that one is not. The other is that, if you randomize the parameters and find that you get a much better result when the mean value of a parameter is such-and-such, but then you go to the lab, measure that parameter, and find that it is not that value, you have essentially obtained a least-squares solution that compensates for multiple effects: it gives you the right answer, but is it really the right answer? So instead we take the subspace in which we compute the solution, and by changing that subspace randomly I can increase the scope of my solution without increasing its dimension — that is really the trick, because otherwise the computational cost would grow. Since we are working with a reduced basis, with ROMs, anyway — so that the Monte Carlo simulation can be done fast — the idea is to randomize the basis, which means we now work with a family of bases instead of a single basis, and we have a whole paper about how to do this. We have to make sure that when the basis is randomized it remains a valid basis, so we enforce something slightly stronger, which is orthogonality: we randomize the basis on the manifold of orthonormal bases. And in order to be efficient — here comes another machine learning problem — we parameterize that randomization: we are not going to randomize every coefficient of the basis, because the problem would become intractable. Instead, if this is the manifold on which the deterministic reduced basis sits, with its tangent space here, I start from a small set of hyperparameters that control the size and spread of the randomness — just as you use control points for a spline to generate a curve — I apply a simple transformation, written here, that automatically places the perturbation in the tangent plane to the manifold, and then I do a polar decomposition to bring the matrix back onto the manifold. Now I have a randomized basis that still satisfies the constraints I want, and therefore a stochastic model.
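A minimal sketch of that perturb-and-retract pattern, with an illustrative random basis in place of a real one: the deterministic orthonormal basis V is perturbed in a direction lying in its tangent space, with a single scalar alpha standing in for the hyperparameters that control the spread, and a polar decomposition retracts the result back onto the set of matrices with orthonormal columns. The specific transformation used in the talk is not reproduced here.

```python
# Minimal sketch: randomize a reduced-order basis while preserving orthonormality.
import numpy as np
from scipy.linalg import polar

rng = np.random.default_rng(3)
n, k = 200, 6
V, _ = np.linalg.qr(rng.standard_normal((n, k)))       # illustrative deterministic basis (orthonormal)

def sample_random_basis(V, alpha, rng):
    """alpha controls the spread of the randomization (the hyperparameter to be calibrated)."""
    Z = alpha * rng.standard_normal(V.shape)            # random perturbation direction
    T = Z - V @ (V.T @ Z)                               # place it in the tangent space at V
    U, _ = polar(V + T)                                 # retract: closest matrix with orthonormal columns
    return U

V_rand = sample_random_basis(V, alpha=0.1, rng=rng)
print(np.allclose(V_rand.T @ V_rand, np.eye(k)))        # orthonormality is preserved
```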
Let me skip the mathematical details; this is where another learning process comes in. I now have a model whose reduced basis is stochastic, so it gives me families of solutions, and I can do real-time Monte Carlo simulation. For a given hyperparameter alpha, the stochastic solutions have a mean, a standard deviation, higher moments. Then I go to my data: if my data is deterministic — assuming data is available — its mean is itself, or I can design a confidence band around it; if it is statistical, it has its own statistics. And I formulate a cost function: alpha should be such that the mean and standard deviation of my stochastic reduced-order model — that is, of the solutions it produces — match the ones available in the data. So I have to solve this optimization problem, and here we have even been looking at machine learning ways to solve it, for example using diffusion maps, or simple stochastic optimization. The point is that once you determine alpha, you have a random matrix — a random basis — whose randomness is controlled by alpha and which gives you a family of solutions. So this is another physics-based machine learning approach: it extracts information from the data and inserts it into the model, because it is inserted into the reduced basis from which the model is built.
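A minimal sketch of that calibration step, matching the first two moments of the stochastic ROM output to the data; `rom_output_samples` is a hypothetical stand-in for "run the ROM with a basis randomized at spread alpha", and the data values are synthetic.

```python
# Minimal sketch: calibrate the randomization hyperparameter alpha so that the
# mean and standard deviation of the stochastic ROM predictions match the data.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(4)
data = np.array([9.7, 10.4, 10.1])            # synthetic stand-in for measured quantities
data_mean, data_std = data.mean(), data.std()

# Fixed Monte Carlo draws (common random numbers) so the cost is smooth in alpha.
base_draws = rng.standard_normal(500)

def rom_output_samples(alpha):
    """Hypothetical stand-in: each sample represents one run of the ROM with a
    basis randomized at spread alpha (here a simple location/scale surrogate)."""
    return 10.0 + alpha * base_draws

def cost(alpha):
    out = rom_output_samples(alpha)
    return (out.mean() - data_mean) ** 2 + (out.std() - data_std) ** 2

res = minimize_scalar(cost, bounds=(1e-3, 5.0), method="bounded")
alpha_star = res.x                            # hyperparameter whose statistics best match the data
```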
Now a couple of examples. This is a wing that was built at the University of Minnesota, and two teams were asked to identify its natural frequencies so that a controller could be designed to fly it. Call them team one and team two; this is an eigenvalue problem, and each team was given two different methods — the same software packages were used by both teams — to determine the natural frequencies and the eigenvectors. What is striking: it is supposed to be the same aircraft — we all know that in practice it cannot be exactly the same, because it was built twice, but it is supposed to be overall the same — and these are the same methods used by both teams. You can see that for the first frequency everybody found more or less the same thing; for the second frequency one team could not identify it; for the third frequency only the second team was able to identify it; for another one everybody agreed; another one everybody missed; and for this frequency the differences are over 25 percent. Already with more than about 5 percent discrepancy you cannot design the controller; at 25 percent you simply do not know which one to believe. So when people say data is more important than the model, maybe this is a counterexample — and, as the French say, the exception that confirms the rule. We then built a simplified finite element model, keeping all these sources of uncertainty in mind: a stick model, meaning only beams and bars; the wing is made of composite, so we generated an equivalent mass distribution — no one had time to do the detailed mass distribution, only the total mass assigned at various points; and there is damping, but no one has any idea of the damping, so it is unknown. We built all of this knowing we are going to have uncertainties, and then we use the data — the measured data, with the caveat that even the people who produced it do not agree among themselves on it. This is the result we get. The black dots are the points where you have data, and you can see that for some frequencies we have two data points, which can be close to each other or pretty far apart. The deterministic model, the line in gold, occasionally matches the data, and most of the time it really doesn't; it can be close to one point and far from the other, so it is very difficult. But the stochastic model, which has extracted information from this data to reshape its own basis — and this is where we come back to the digital twin concept — gives you, in all cases, a range: a confidence interval at 98 percent that brackets the data. It brackets the data in all cases, which means it has learned from it, and now you can trust it in that range. This was my last example — time is up — and you can see that in all cases the stochastic model brackets the experimental data. Once you have achieved this kind of result for a larger-dimensional parameter space, the question is what the real usefulness of all of this is.

So here is my view of the pyramid of modeling and simulation. I view it as a pyramid with three layers. The bottom layer is modeling and simulation for prediction — quantitative prediction — and that is physics-based. On top of it sits a narrower layer, modeling and simulation for decision making about the system; these two are physics-based. And at the top is modeling and simulation for enterprise-level decision making, which is where everybody talks about data analytics and everything you want. If you are looking at an airline operating aircraft and trying to optimize the bottom line of the corporation: the physics-based people worry about the performance of an aircraft in terms we all understand — drag, skin friction coefficients, turbulence, laminar flow — but the people making the decisions live in a different world; they care about only one thing, there is only one objective function, and it is called the cost function; that is the one that really matters. So they are looking for data-analytics-based models. In the military world, what they talk about are campaign-level simulations, which means no one is interested in the performance of an F-18 or an F-35 per se; if you are strategizing, the question is how many of them you would use, knowing what your adversary has. At the bottom, all of these things can be traced back to the physics, because these are physical systems, but no one is interested in going down to that level. I really find that the missing piece — the one you see here — is where physics-informed machine learning could come in: not necessarily to build our models, even though that would be great, but to do something we still have not figured out; data mining would be a good exercise to build on. A lot of this is for problems where the failure modes are not known a priori, where I do not know which quantity of interest I should be monitoring. I will give you a good example: if I am interested in health monitoring of an aircraft, and cracks are the first thing that comes to mind, what you are going to be measuring in real time while the aircraft
is flying are quantities at the macro scale, most likely, but failure nucleates at the micro scale — so how do you connect the two? And if you are putting out a brand-new system that you have just built, you do not yet know its failure modes, so you cannot rely on them. For all these families of problems you still need a model, and I believe this is one reasonable approach for them. Where machine learning techniques have been used here, it is not to teach us something about the physics; it is to teach us how to build an alternative model that is much faster, once we have invested in training. So it really does not teach us anything new about the physics; what we have learned is how to build a better, faster model once we have generated training data. What kind of problems are these? The kind for which you do not have data and you have to generate it yourself, whether experimentally or numerically. But there are many problems that are not like that: today, engines on an aircraft are instrumented with up to 4,000 sensors that can generate lots of temperature data, and I am sure there are failure modes in an engine that can be tracked by temperature at a few points. But an aircraft is a multidisciplinary system with millions of different pieces and devices in it; not everything is thermal, and those sensors will not necessarily give me a crystal-clear idea of what is happening with respect to cracks. I hope I answered your question: it is really for all these problems that are governed by physics, for which there are PDEs that are not perfect but not really questionable — they may contain questionable elements, like a turbulence model, but no one is going to question the second principle of thermodynamics.

[Audience question: this gives you an opportunity to weigh in on teaching, at the graduate or undergraduate level. How would you, teaching in your department, teach that decision-making class, as opposed to how someone in the computer science department would teach a class with the same title — which one is the real decision-making class?]

For the moment it is taught by somebody for whom I have the utmost respect, who uses an approach that is halfway in between: he uses Markov processes, so it is not purely statistical, and he tries to be a little informed by the physics of the problem. But the class draws its examples — and I hope my point is clear — from problems where the models are not complicated, meaning the governing model is by definition low-dimensional: say a rigid body, or the problem of collisions between aircraft; if you write a model for the collision, you will not end up with something like a PDE. So the real answer is: I don't know. I am still trying to figure it out, and when I figure out how to do it, I will offer a class on exactly that, but I don't know how to do it yet. I am admitting that it is easier to criticize than to do something better; but I think we start by first realizing that what we have is not the best thing to do, and we have to keep figuring out how to do a better one, and I don't have an answer right now. Part of it — and I have some ideas here — is that the bar becomes so high: if you wanted to teach a lot of these things in one class, people walking in would need to know a lot — about PDEs, about model reduction, about randomization, about UQ — and it is just
not clear to me how I would put all of this into one class.

[Audience question, partly inaudible, about machine learning for cyber-physical systems.]

That is the best topic for an ERC, for NSF, because everybody is banking on these advances in the Bay Area and on the success of machine learning, and everybody would like to see how it can be transported to other fields. I know one thing: if it is transportable, it is not going to be simply by using it as a hammer that makes everything look like a nail; it is by getting inspiration from it, taking some good concepts and migrating them — and maybe migrating concepts the other way as well. That is what I think is really worthwhile, rather than downloading TensorFlow or PyTorch and generating training data. My own students are doing this right now, and it drives me nuts, because if I don't help them they tell me that I am selfish and obtuse and that only my opinion counts; and if I do what they want me to do... well. [Applause]
Info
Channel: Physics Informed Machine Learning
Views: 1,838
Keywords: machine learning, physics, reduced order models, data-driven discovery, deep learning, artificial intelligence, neural networks, dynamical systems
Id: GA0-sHSgaVs
Length: 69min 30sec (4170 seconds)
Published: Wed Jul 17 2019