Liquid Neural Networks, A New Idea That Allows AI To Learn Even After Training

Video Statistics and Information

Captions
AI offers boundless potential, from enhancing health care diagnostics to making our cities smarter to empowering us through education. The potential is immense. But while AI can revolutionize so much, realizing this potential also rests on our ability to solve some really important technical and societal challenges. So I would like to talk with you about a new idea for machine learning that we termed liquid networks. We began to develop this work as a way of addressing some of the challenges that we have with today's AI solutions, because despite the great opportunities, we also have plenty of technical challenges that remain to be solved.

First among the AI challenges is the data itself. We require huge amounts of data that get fed into immense models, and these models have huge computational and environmental costs. We also have an issue with data quality, because if the data quality is not high, the performance of the model will not be good: bad data means bad performance. Furthermore, we have black-box systems where it is really impossible to find out how the system makes decisions, and this is really problematic, especially for safety-critical applications.

So let me show you the essential idea behind liquid AI. I will show it to you in the context of an autonomous driving application, and then we can generalize to others. Here is a self-driving car, built by our students at MIT using traditional deep neural networks, and it does pretty well. It was trained in the city, it drives really well in a completely different environment, it can make decisions at intersections, and it can recognize the goal. So it's pretty good, right? But let me open the hood to show you how this vehicle makes decisions. You will see the map in the right-hand corner and the camera input stream in the upper left corner. The decision-making engine is the big rectangular box in the middle with blue and yellow blinking lights. There are about 100,000 artificial neurons working together to tell this car what to do, and it is absolutely impossible to correlate how the neurons activate with what the vehicle does, because there are too many of them. There are also half a million parameters. Take a look at the lower left-hand corner, where we see the attention map. This is where, in the image, the vehicle looks in order to make decisions. You see how noisy it is? You see how this vehicle is looking at the bushes and at the trees on the side of the road? So this is a bit of a problem.

Well, I would like to do better. I would like a vehicle whose decisions I can understand. And in fact, with liquid networks, we have a new class of models. Here, you can see the liquid network solution for the same problem. Now you will see the entire model, consisting of 19 artificial neurons, liquid neurons. And look at the attention map: look how clean it is and how focused it is on the road horizon and on the sides of the road, which is how I drive. So liquid networks seem to understand their task better than deep networks. And because they are so compact, they have many other properties. In particular, we can take the output of the 19 neurons and turn it into a decision tree that can show humans how these networks decide. So they are much closer to a world where we can have machine learning that is understandable. We can apply liquid networks to many other applications. Here is a solution consisting of 11 neurons, and this is flying a plane in a canyon of unknown geometry. The plane has to hit these points at unknown locations, and it's really extraordinary that all you need is 11 artificial neurons, liquid network neurons, in order to solve this problem.
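[Editor's aside: the talk does not spell out how the 19-neuron output is turned into a decision tree. As a purely illustrative sketch, one generic way to get a human-readable surrogate for such a compact policy is to log the neurons' activations together with the control command they produce, and then fit a shallow decision tree to that data. The snippet below uses scikit-learn and made-up placeholder data; it is an assumption-laden illustration, not the authors' published procedure.]

```python
# Hypothetical sketch: distill the behavior of a small (e.g. 19-neuron) driving policy
# into a decision tree so its decisions can be read by a human.
# Placeholder data throughout -- this is NOT the authors' actual pipeline.

import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)

# Stand-in for logged data: per-timestep activations of the 19 liquid neurons
# and the steering command the network produced at each timestep.
activations = rng.normal(size=(5000, 19))                          # logged neuron states (fake)
steering = np.tanh(activations[:, 0] - 0.5 * activations[:, 3])    # logged steering targets (fake)

# Fit a shallow surrogate tree: each split is a readable rule over one neuron's activity.
surrogate = DecisionTreeRegressor(max_depth=4)
surrogate.fit(activations, steering)

# Print the tree as nested if/else rules, e.g. "neuron_3 <= 0.12 -> steer value ...".
feature_names = [f"neuron_{i}" for i in range(19)]
print(export_text(surrogate, feature_names=feature_names))
```

Because there are only 19 features, even a shallow tree like this can summarize much of the policy with a handful of readable rules, which is one reason compactness helps interpretability.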
So how did we accomplish this? Well, we started from the continuous-time neural network framework. In continuous-time networks, the behavior of each neuron is defined by a set of differential equations. These models are a kind of temporal neural network, a family that includes standard recurrent neural networks, neural ODEs, continuous-time RNNs (CT-RNNs), and now liquid networks. And it's really extraordinary, because by using differential equations and continuous-time networks, we can model complex problems very elegantly, like problems that involve physical dynamics. For instance, in this case we have the standard half-cheetah benchmark, and it can be modeled elegantly with these continuous-time networks. However, when you take an existing continuous-time solution and you model even a simple problem, like getting this half-cheetah to walk, you actually get performance that is not much better than a standard LSTM. With liquid networks, however, you can do better.

OK, so how do we achieve this better performance? We achieve it with two mathematical innovations. First, we change the equation that defines the activity of the neuron. We start with a linear state-space model, and then we introduce nonlinearities over the synaptic connections. When we plug these two equations into each other, we end up with this equation. What's interesting about this equation is that the time constant that should go in front of x(t) actually depends on x(t). And this allows us to have neural network solutions that are able to change their underlying equations based on the input that they see, even after training. We also make some other changes, like changing the wiring architecture of the network, and you can read about this in our papers.

So now, let's go back to the attention maps of a whole suite of networks: CNNs, CT-RNNs, LSTMs, and other solutions. Going back to the driving-in-lane problem, you'll see that all of the previous solutions are really looking at the context, not at the actual task. And in fact, we have a mathematical basis for this result. We can actually prove that our liquid network solutions are causal. In other words, they connect cause and effect in ways that are consistent with the mathematical definitions of causality.

Now, I promised you a fast solution. But these networks are defined by differential equations, so you might ask whether they really need numerical ODE solvers, because that would be a huge computational hit. Well, it turns out we have a closed-form solution for the hairy equation that goes inside the neuron, and the approximation has a good bound. It's good enough. You can see in this chart, in red, the ODE solution, and in blue, the solution with our approximation, and you see that they are really quite close to each other. So these liquid networks can learn causal relationships because they form causal models. Unlike other models defined by differential equations, such as neural ODEs and CT-RNNs, these networks, in essence, recognize when their outputs are being changed by certain interventions, and then they learn how to correlate cause and effect.
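[Editor's aside: for reference, here is a sketch of the construction described above, written in the notation used in the liquid time-constant (LTC) papers; the symbols on the speaker's slide may differ, so treat this as a reconstruction rather than a quote of the slide.]

```latex
% Leaky-integrator (CT-RNN) neuron: a linear state-space model driven by a synaptic input S(t).
\[ \frac{d\mathbf{x}(t)}{dt} = -\frac{\mathbf{x}(t)}{\tau} + \mathbf{S}(t) \]

% Nonlinearity placed on the synaptic connections
% (f is a bounded nonlinearity of the state x, the input I, and parameters \theta; A is a bias vector).
\[ \mathbf{S}(t) = f\bigl(\mathbf{x}(t), \mathbf{I}(t), \theta\bigr)\,\bigl(A - \mathbf{x}(t)\bigr) \]

% Plugging S(t) into the state equation gives the liquid time-constant form.
\[ \frac{d\mathbf{x}(t)}{dt}
   = -\Bigl[\frac{1}{\tau} + f\bigl(\mathbf{x}(t), \mathbf{I}(t), \theta\bigr)\Bigr]\,\mathbf{x}(t)
   + f\bigl(\mathbf{x}(t), \mathbf{I}(t), \theta\bigr)\,A \]

% The effective time constant in front of x(t) now depends on the state and the input.
\[ \tau_{\mathrm{sys}} = \frac{\tau}{1 + \tau\, f\bigl(\mathbf{x}(t), \mathbf{I}(t), \theta\bigr)} \]
```

The last line is the point made in the talk: the term multiplying x(t) is no longer a fixed 1/τ but depends on the state and the current input, so the network keeps adapting its dynamics after training. The "closed-form solution" mentioned above is an analytical approximation of this equation's trajectory (what the authors' follow-up work calls CfC models), which is what removes the need for a numerical ODE solver at run time.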
All right, so let me give you a final example to convince you that these networks are really valuable. Here, we have a different problem: we are training a drone to fly in the woods. Notice that it's summertime. We give our drones example videos like the one you see here, and these are not annotated in any way. And we train a variety of models, for instance a standard deep neural network. Now, when we ask the standard network, trained in that environment, to find the object and go to it, you see that the model has a lot of trouble. The attention is very noisy. Also notice that the background is different, because now it's fall, so the context of the task has changed. Because deep networks are so dependent on context, they don't do so well. But look at our liquid network solution: it is so focused on the task, and the drone has no problem finding the object. We can go further, all the way to winter, with the same model trained in the summer, and we still get a good solution. And finally, we can even change the context of the task entirely: we can put it in an urban environment, and we can go from a static object to a dynamic object. The same model trained in the summer in the woods does well in this example. So this is, again, because we have a provably causal solution.

So liquid networks are a new model for machine learning. They are compact, interpretable, and causal. And they have shown great promise in generalization under heavy distribution shifts. Thank you. [APPLAUSE]
Info
Channel: Forbes
Views: 108,533
Keywords: Forbes, Forbes Media, Forbes Magazine, Forbes Digital, Business, Finance, Entrepreneurship, Technology, Investing, Personal Finance
Id: 0FNkrjVIcuk
Length: 10min 39sec (639 seconds)
Published: Sun Jul 09 2023