(bell dings) - Hello, and welcome to
another Coding Train video about neuroevolution. So this video, what I'm going to do is, I'm going to take my previous, finished, sort of finished, version
of neuroevolution, the process of evolving
the optimal weights of a neural network to solve some kind of, perform some kind of
machine learning task. In this case, I'm making a guess based on the environment of this very crude Flappy Bird game. I'm not making this, the neural network is making a guess whether it should jump or not. So this is a very simple,
actually, scenario to solve. And so it's a nice
demonstration of the idea. I've been meaning to return to
this topic for quite a while so I can try to look at
some more complex scenarios. So neuroevolution can be
applied to a lot of problems that reinforcement learning
algorithms are also applied to, but it's a pretty different technique. There's like five, or six, or seven videos all about neuroevolution that I would recommend you check out, or you can start right here. Because what I'm going
to do in this video is, I am going to take the
existing version that I made, which used a toy neural
network JavaScript library that I implemented in
yet another video series to, kind of, learn more
about neural networks and spin up a very, very basic one in JavaScript without other dependencies. But I'm going to replace
that now with TensorFlow.js. What you're seeing here is the last example from Part Five, where I saved a trained model and I'm loading it into
a version of the game and watching it play out. What I did previous to that, is what you're seeing right here, which is, this is launching
500 random neural networks to try to make guesses. I can use this slider to
sort of speed up the system, and over time those neural networks are going to perform the processes of crossover and mutation. It's not actually doing crossover, which
would be the act of combining two agents' genetic
material together into one. I'm just taking one and copying it, but then I am mutating it to optimally search for the
configuration of weights that will work the best. And just to remind you, when
I'm talking about the weights, this is what I'm talking about. So we have the Flappy Bird game, this is the agent that
needs to make a decision. In a way it's a classification problem. Should I go up or should
I not go up, right? These are the only two possibilities, a very simple classification problem. Two categories. I have done, I'm the human being, have basically done
the feature extraction. I could use the environment,
the input to the neural network as this image, like all
the pixels of the game, that would be a wonderful thing to try and I would love to do that, especially once I have
convolutional layers with TensorFlow.js, which is not something I have in my toy neural network library. You might not know what
a convolutional layer is. Don't worry, if I use
it, I will explain it. But, I have done that feature extraction. So I have decided that the features that I want to use as inputs
into my neural network are, I think it was like,
the bird y position, the bird, y velocity, the top pipe location,
the bottom pipe location. And then, I'll call this x, the distance to the nearest pipe. So these are what I've decided might be all the important
values to use from this game, to feed into the neural network. So that means the neural network has one, two, three, four, five inputs. These all get normalized into a value with a range
between zero and one, and fed into the neural network. Then for the outputs, this is a
classification problem. So there are just two outputs, and they would each output a number. If I got something like
0.8 here and 0.2 here, that means there is an 80%, basically, an 80%
probability confidence score that I should jump, so I will jump. Now I could actually pick random numbers and that kind of thing, but I'm just going to
take the highest one, the arg max, so to speak, and go. So, neural networks are able to learn a sophisticated amount of
information through hidden layers. So the inputs don't pass
directly to the outputs, but the inputs pass
through a hidden layer. And I don't recall what I picked as the configuration of the hidden layer, but I think it was something like eight. So if there were eight, I
think that's what it is. We'll look. One, two, three, four,
five, six, seven, eight, these are what are known as dense layers. Meaning, every single node
is connected to every node. I will not sit here and draw all of them, but I will start that process. And then all of these, sorry,
are connected to the outputs, like this. So a neural network's core data, the configuration of
that data is in a matrix. 'Cause every one of these
connections has a weight. So if there are five here and eight here, that is 40 weights. If there are eight here and
two here, that's 16 weights. Now the truth of the matter
is, there's also a bias. So there's a bias for each one of these and a bias for each one of these. So the configuration
is all of the weights, all of the biases for each of these nodes. But these are the details
that I cover a lot more in the series about
building a neural network from scratch in JavaScript. So if you want to learn
about that, you can go back. But all you need to know for here, is that I have a way of
creating this neural network. I have a way of feeding
these values into it and looking at the stuff that comes out. So let's go look at that code. So if I go look at the
code, we can see that here, there's this idea of a bird object. And the bird object makes a neural network with five inputs, eight
hidden nodes, and two outputs. I don't know what I've said
here, but this is known as a feedforward neural network. Multilayer Perceptron, it's two layers. The inputs, it looks like a layer, but these are just numbers coming in. There is a hidden layer
and the output layer. That's important terminology. So that's happening here,
and you can see in the bird, the bird sort of like, takes
all of those properties, its y position, the closest pipe, the closest pipe's x
position, its velocity, makes that into an array, feeds that into a predict function, and then I just sort of like, if the first output is greater
than the second output, jump. So this is that process. And all of this happens in this neural network library. So there is neuralnetwork.js
and matrix.js. This is like if TensorFlow.js
was made by one person who really didn't know what
they were doing (laughs). That's what's in here. So what I want to do, is I am
going to delete this folder. Right, I'm deleting it, it's gone. Gone, move to trash, boom! So now when I go over
here and I hit refresh, of course, it doesn't work. Because we have no neuralnetwork.js. Can I get this, I don't have a watch on, in however long I have right now, in the course of this video, working again without my library, but with TensorFlow.js instead. That's the thought experiment here. So here is my html file, and you can see that it
has the P5 libraries, it's loading my neural network library, and then these are all the files for the Flappy Bird game itself, as well as the file that
includes some information about how the genetic algorithm works. So what I want to do, I
need to take these out, and then I want to import TensorFlow.js. So I have the TensorFlow.js website up and this is the script tag
for importing TensorFlow.js. So I'm going to add that here,
and actually I am pretty sure that the most current version
is 1.0.4, so let's add that. And now I'm going to go
back here and hit refresh. Now, of course, ah,
NeuralNetwork is not defined, so we have fewer error messages
here now, which is good. I just want to make sure tf is loaded, so, for example, I can call tf memory, and there is no memory being used, but I can see that tf js is loaded, 'cause I can call tf
functions in the console now, which I'm going to need to
do to figure all this out. All right, so now the thing
I think I want to do is that, actually, I don't need to do this, but I think it would be
nice for me to create, I'm going to create a file called nn.js, and I'm going to make a little
wrapper for a neural network. And actually one of the reasons
why I want to do this is, as you know, I've made a lot of videos using a library called ML5, which is a wrapper around TensorFlow.js, which allows you to work with some of the machine
learning algorithms without having to manipulate the lower level details of TensorFlow.js, and so this is, kind of, a little bit, of maybe a preview of that. Because this ultimately, this idea of neuroevolution is something that I would like to work on
putting into the ML5 library. So if I make a neural network class, and I write a constructor, we know here, that in the bird object what it's doing is, it's
making a neural network with five inputs, eight
hidden nodes, and two outputs. So there's no reason why I can't actually just keep that same structure, and I'm going to have three arguments, I'm just going to call them a, b, and c, because at some point I might need to call the constructor in different ways, and I'm going to then say, this.input_nodes equals a. this.hidden_nodes equals b, and this.output_nodes equals c. Then I'm going to put, I already, by the way, I did this in a coding class that I'm teaching, last week, so I kind of have quite a bit
of the sense of the plan here. So I'm also going to call a function called this.createModel 'cause I might need to do
that in different places and then I'm going to
make a separate function called createModel. So this now is the function, where I want to create my neural network, using TensorFlow.js. And I'm going to use the layers API. The truth of the matter is, this particular architecture, right, this is perhaps one of the
simplest, the most basic, vanilla, so to speak, neural
network architectures. It's got two layers, one
hidden layer, very few nodes, there's no convolutions, no fancy stuff, just the very basic. So I could probably do this with just simple TensorFlow
operations themselves, but I'm going to use
something called tf.layers which is actually based on Keras, which is a Python library for TensorFlow, and Python, so many things, I've talked about these in videos. But basically tf.layers is an API that allows me to create a model from a slightly higher level perspective. So one of the key functions in tf.layers, and I'm going to create
a variable called model, is tf.sequential. So if I say tf.sequential, that should create for
me a sequential model. And we can just take a look
at that here on the console, and see, there we go. It made something that works, you can see all these
parameters that it's training. Now interestingly enough,
what I'm doing in this video is not really what
TensorFlow.js is designed for. Almost everything in TensorFlow.js has to do with optimization functions and algorithms for learning and tweaking weights,
and back propagation. I'm not doing any of that. This neuroevolution
thing is quirky and weird and all I actually need is
a bunch of neural networks and the ability to feed
the data through them and then I can just delete
that and mix them up. So actually most of this
stuff will not get used. But what I want to create
then, is my first layer. So the first layer, I'm
going to call hidden. So I'm going to say, const hidden equals tf, I think it's dense. So I think I better look up the
documentation at this point, I certainly don't have
the stuff memorized, so I'm going to go to the
API, tf.layer's dense. This is what I'm looking to do. And I can see a nice
little example of it here. This. So I'm going to create a tf.layers dense. Dot dense. Then I need a little object which is going to contain all of the configuration
information for that dense layer. And what are some things I can configure? I can give it the number of
units, that's my hidden nodes, which is this.hidden_nodes.
many things are coming in, like what is the input
shape, so this is the layer, I need to tell it about the input shape, I forget what that's called,
somewhere in here I'll find, inputDim, which would be the
input dimensions, I guess, which would be this.input_nodes, I'm forgetting my this dots,
as always, I always forget. So there we go, what else do I need? Ah, I think an activation
function I definitely need. So activation. Now this merits quite a bit of discussion. An activation function is
how the data gets processed as it's fed forward. So all of these inputs, the numbers out of these
values come in here, they get multiplied by
those weight values, summed together and passed
through an activation function. The activation function's job is basically to squash the values, to
enforce non-linearity, to allow for the stuff
to go to the next layer. And there are all sorts of reasons why you might pick one or the other, there's more recent research
and change over the years, and I've talked about
this in other videos. I'm going to pick sigmoid as a kind of default activation function that squashes all values
between zero and one. Now I'll include some links about sigmoid in the video's description,
if you're curious. So I forget exactly what I do here, but I think I can just
write sigmoid as a string. And somewhere in the documentation there's a list of all
the activation functions. Do I need anything else? I think that's good. So then I just want to say
this.model.addlayer, maybe? So tf.Sequential is what I'm doing here, which would be here, and oh, input shape is
probably what I want, not input dimensions, we'll see. I'm noticing that there's input shape, actually I think that's what I want. That's certainly what I used previously. So input shape, and it would be just that number of nodes, but inside an array. A shape is a way of defining
the dimensionality of data. So if it's just a list of
stuff, like a list of numbers, it's an array with five things in it. So that's how I define a shape. So then I would say, model.add so this.model.add the hidden layer. Now I want to make an output layer which is going to be tf.layers also dense, and I want to, what do I want? The units is this.output_nodes, and then, I actually do not have to define the input shape for this layer, because it can be inferred
from the previous one. So I couldn't infer the number
of inputs from the hidden, because hidden is first. But the last layer here, these outputs, I know that I just added this layer, if it's dense, this is
the shape of the input. So all I need to do is that, and I also need an activation function, I'm going to make it softmax. So softmax is an activation
function which, like sigmoid, squashes all the values to
a range between zero and one, but it does something even more. It basically enforces
that all of those values also add up to one. It basically turns them into probability or confidence scores and all
of them add up to 100% or 1.0. And there's more to it than that, there's a bit more
mathematics involved, but that's the basic gist of it. And that's what I want,
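As a rough illustration of what softmax does to two raw output values, here is a plain JavaScript sketch, separate from TensorFlow.js. It uses the standard definition (exponentiate each value, then divide by the sum). The function names `softmax` and `shouldJump` are made up for this example.

```javascript
// Softmax: exponentiate each value, then divide by the sum of the exponentials,
// so every result lies between 0 and 1 and the results add up to 1.
function softmax(values) {
  // Subtracting the maximum keeps Math.exp from overflowing
  // and does not change the result.
  const max = Math.max(...values);
  const exps = values.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// The decision rule from the game: jump when the first output is the larger one.
function shouldJump(outputs) {
  return outputs[0] > outputs[1];
}

const scores = softmax([2, 1]);
console.log(scores, shouldJump(scores));
```

Taking the larger of the two scores is the "arg max" choice mentioned earlier. Sampling an action in proportion to the scores would be a different design.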
I want the probability of whether I should jump or not jump. So now, I should be able to say, this.model.add output and then if I go back
to the documentation, I don't need to do this fit stuff, but do I need to compile it? So interestingly enough,
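Here is a minimal sketch of the two layers described so far, written as a standalone function. It assumes TensorFlow.js is already loaded and available as the global `tf` (for example through the script tag in the HTML file). The function name and parameter names are just for illustration.

```javascript
// Assumes TensorFlow.js is loaded and exposed as the global `tf`.
function createModel(inputNodes, hiddenNodes, outputNodes) {
  const model = tf.sequential();

  // Hidden layer: the first layer has to be told the shape of its input.
  const hidden = tf.layers.dense({
    units: hiddenNodes,
    inputShape: [inputNodes],
    activation: 'sigmoid',
  });
  model.add(hidden);

  // Output layer: its input shape is inferred from the layer before it.
  const output = tf.layers.dense({
    units: outputNodes,
    activation: 'softmax',
  });
  model.add(output);

  return model;
}
```

In the browser, `createModel(5, 8, 2)` would build the network for this game. The sketch only builds the layers and does not call `compile` or `fit`.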
usually we need to compile, we need to give a loss function. Traditional machine learning you're going to have a loss function, an optimization function, you're going to train it with data, I'm not doing any of
that in neuroevolution. Remember, if you watched my
other videos on neuroevolution, I'm doing this weird thing, I just need a random neural
network, that's all I need. So I don't know if I
need that compile step, I'm guessing I kind of do, so I'm going to say this.compile, but I'm just going to
give it an empty object. 'Cause I don't care
about those other things. Let's try this.model.compile So now that I have that,
I should be able to, with my code, I'm kind of
getting a little bit further. Like look at that. Cannot read property
'minimize' of undefined. Because I didn't give it any object. Let's get rid of the compile. Let's see if I get it to work
without the step model.compile Maybe compile only has to
do with the learning process that I'm not using. Great, this.brain.predict
is not a function. So now I have the next thing
that I need to implement. So where is that? That's in bird.js and what is it saying? So I'm able to make the neural network, and then I am calling predict and I'm giving it an array. Perfect. So what I need to do now in my class here is I'm going to write a
function called predict, which will receive an array. Now what I would essentially be doing is now calling this.model.predict
with that array. Only here's the complexity
of using TensorFlow.js. TensorFlow doesn't work with
traditional JavaScript arrays. It works with something called tensors, which are multidimensional arrays, they are multidimensional
things, collections of numbers, that are living in the GPU's memory. And that's for optimization purposes. The operations can
happen faster on the GPU which is optimized for
a lot of matrix math. I don't really need that
optimization, to be clear, because I'm working with
the tiniest thing here, but I still need to convert
this stuff to tensor. So I'm going to need to say,
the xs, I'm going to call this, the inputs are often referred
to as xs, equals tf.tensor, and what do I have, what
is the shape of my array? It's one-dimensional,
right, it's just an array. But the thing is that what
TensorFlow.js expects, so this is my data, it comes
in as an array of numbers. One, two, three, four, five, right? All of this stuff, that's
only four, whatever, you get the idea, there's
another one there. The thing is TensorFlow.js thinks like, you would never be so crazy
as to give me just one thing. Usually you're going to
give me a thousand rows or data samples. So it's actually looking for
a shape that is like this, that would have a bunch of these. So I just need to take my inputs and put them in a 2D tensor. Because it's an array of arrays. So if I do that, tensor2d xs. I think that's all I need to do. Then I need to call predict with the xs. Now that's happening synchronously, I need to get the data off of, oh, and this would be ys, sorry. The ys would be the result
of feeding it forward. So that's exactly this process, predict is the function
to feed this stuff forward and now I have ys. I'm going to do something that
is a little bit ill-advised, I want to be able to look at those values. So right now the ys are a tensor and that's living on the GPU. In order for me to pull
it off the GPU to use it, I need to call a function called data, which is getting the actual data and that's going to
happen asynchronously. So I need to deal with
a promise or a callback, but because I'm working with
such tiny bits of data here, it's going to be much
simpler for me right now to get this working,
to just say, basically, the outputs equals ys.dataSync So the dataSync function
gives me those outputs. Let's console.log them, and
let's say return outputs also, just so we see what's going on here. So I'm now going to run this, NeuralNetwork.predict at
Bird.think is not defined. What did I miss? (bell dings) Sorry, there's a major error in here that you probably all
noticed, but I didn't. I'm taking that array and
converting it to tensor. Arr is the array, and maybe it would make sense
to me to call these inputs. I'm taking the inputs as a raw array, converting them into a tensor, passing them through the model, pulling out the outputs and returning it. Okay, let's see how that goes. Tensor2d requires shape to be provided when values are flat. Oh, guess what, this has to
be inside an array, right? Because the whole point
is it wants the 2d thing, so it wants that array
inside of that array, okay? So that should fix that. Great, oh look, I'm getting stuff. I'm getting things, oh,
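Put together, the predict step might look like the sketch below. It assumes the global `tf` from TensorFlow.js and a model built with the layers API, and it is written as a standalone function rather than a class method to keep it short. Memory management (disposing the tensors) is ignored here.

```javascript
// Assumes TensorFlow.js is loaded and exposed as the global `tf`.
// `model` is a tf.Sequential; `inputs` is a plain array such as [0.5, 0.1, 0.3, 0.7, 0.9].
function predict(model, inputs) {
  // Wrapping the array in another array makes it a batch of one row,
  // so the tensor has shape [1, inputs.length].
  const xs = tf.tensor2d([inputs]);

  // Feed the inputs forward through the network.
  const ys = model.predict(xs);

  // Pull the numbers out of the tensor synchronously (fine for tiny data).
  const outputs = ys.dataSync();
  return outputs;
}
```

The returned value is a typed array with one entry per output node, so the bird can compare `outputs[0]` and `outputs[1]` directly.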
look how slow it is. It's so slow. So it's kind of ironic in a sense. Like I'm using a much more
powerful JavaScript engine for deep learning, that
does all the operations on the GPU with WebGL to optimize it and yet it runs super slow. This is really because I'm a crazy person and kind of doing this in a weird way that TensorFlow.js is
not really designed for. I have all these different models, and I'm copying the data with dataSync. So there's a couple of things I could do. Number one is, let's just for now, and I'll have to come back to this or think how to make this run faster or be more thoughtful or maybe
do this stuff asynchronously. But what I'm going to do right now, is I am going to just reduce
the number of birds to half, and I'm also going to do
something a little bit silly, which is I'm going to
say tf.setBackend('cpu'). Because actually the amount of data that I'm working with
is so, so, so, so small, I don't really need to use the GPU, I don't need to copy stuff back and forth, let's just use the CPU,
it's perfectly fast enough. Remember how well it worked when I had my own silly
little JavaScript library? So let's use the CPU and I
think you'll find that now it's still running kind of slow. I have a feeling this console.log is definitely making
things kind of unhappy. So let's take out that console.log and let's refresh and there we go. So now it's performing pretty well. Is it as fast as it was before? I don't know. Can I get it back to 500? But it's better. Ah, but now we have a problem. Copy. So copy is not a function. Because I didn't write a function
to copy a neural network. So what do I mean by
copy a neural network? So normally this isn't something
you would necessarily do, but it is fundamental to neuroevolution. And by copying it, we can say
like, this was a good one, let's make a copy of it, to keep it. And I actually want to copy
it, not keep a pointer to it. Because I could also want to mutate it, and I want to mutate it a bunch of times. So I have to figure out, what is the way to copy
a model in TensorFlow.js. And ultimately if you want to try, the proper way to do this,
in terms of neuroevolution, would not be to just make a copy to go to the next generation, but to take two of them, and make a new one
that's kind of a mixture. When I say mixture, I mean
mixture of all these weights. But I'm just going to copy it. So what I need to do, is I need to go back to this class again, and I need to say copy. So if I say copy, I could just say, return new NeuralNetwork, and I could just put like, input_nodes, hidden_nodes, output_nodes, but it wouldn't have the right weights. So let's think about this. Let's do, I'm going to say modelCopy equals this.createModel. So I'm creating a model, which is the same, ah, and I need to create model
with, oh, it has that stuff. So I can call createModel,
it's going to do the right input_nodes, hidden_nodes,
output_nodes, okay. So it's not a copy yet, it's
another random neural network. But really, the architecture's the same. So in a sense I've
copied the architecture, I've made a new neural network
with the same architecture, now I need to get those weights. So luckily for me, there is a ntf.js, there are two functions called
getWeights and setWeights. So let's take a look at those functions. So for example, if I
go to the console here, and I would say something like, can I say, let n equal new NeuralNetwork, what's it, 8.5.2, right? No, 5.2.8, 5.8.2. So I'm making one of those. And so n.model is that model itself. And you can see, it's got
all that data in there. So what I want to do is,
say n.model.getweights. No? Oh, I said mode. I want to say n.model,
can we just click here, model.getWeights. These are all the weight
matrices, why are there four? Shouldn't there just be two? Well, there's four because there are the weights and the biases, one, two, the weights and the biases, three, four. So that is what is here. So all I need to do is take
these weights and copy them. So let's see. I could say, const weights
equals this.model.getWeights and then I could say modelCopy.setWeights Then I could say return new NeuralNetwork with modelCopy. Now there's a bit of a problem here that I think I'm going to have to do, but this is pretty decent start. Right, make a model of
the same architecture, get the weights from the other one, and put those weights in the
new model, and then return. I think there's going
to be a problem here, but this is kind of the idea. So now, if I were to ... Ah, what I need to do here is, now I'm making a neural network without giving it the input nodes, I need to basically say, if a is an instance of
tf.sequential, is that right? Let's see, so then if I
go back to the console, n.model instanceof tf.sequential false, tf.Sequential with a capital, true! Okay, I need to check if
it's a tf.Sequential object, then this.model equals a. So, in other words, I have two ways of creating a neural network. If I pass the constructor
an existing model, then I just assign it. Otherwise, I create a new one. However, I've really always
got to keep track of what, it's a little, this is
some redundancy here, 'cause the input nodes,
hidden nodes, and output nodes are in the model itself. But let's just add another argument here. And then I can say, also, this is awkward, I could refactor this later. So now I'm also going to always require that I also give it the shape basically, and so here when I'm making
this new neural network with modelCopy, I'm also going to say this
dot, what is it, input_nodes, this.hidden_nodes, this.output_nodes. Okay, so this is me making that new neural network with modelCopy and I'm also just giving it the shape also so that it retains that information. So now, I should be able to go back, and we've got a copy function. Let's speed this up. Ah, can't read! Cannot read property
'setWeights' of undefined. That's not good. (bell dings) Okay, I've sort of made
a mistake in the way that I've designed this program. What I want to say is this.model equals this.createModel and I don't actually want to set it here, I'm going to just make in
the createModel function, which is where I'm going
to make this a variable, const model, and then I'm
going to say model.add, model.add, and then I'm going to say return model. So this function, this could
be living somewhere else, a static function or something, but it's making the model and it is in setup,
sorry, in the constructor. It is creating it and returning it and placing it in the instance variable, or here it's creating it and returning it and putting it in these copy variables. So that should work better. So let's see. Great, so now I don't
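A sketch of the class at this point, with the two ways of constructing it and a createModel method that returns the model instead of assigning it. It assumes the global `tf` from TensorFlow.js. The argument order (model first, then the three node counts) follows the description above, and the fourth argument name `d` is made up here.

```javascript
// Assumes TensorFlow.js is loaded and exposed as the global `tf`.
class NeuralNetwork {
  // Two call styles:
  //   new NeuralNetwork(inputs, hidden, outputs)
  //   new NeuralNetwork(existingModel, inputs, hidden, outputs)
  constructor(a, b, c, d) {
    if (a instanceof tf.Sequential) {
      // An existing model was passed in: keep it and record its shape.
      this.model = a;
      this.input_nodes = b;
      this.hidden_nodes = c;
      this.output_nodes = d;
    } else {
      // Only node counts were passed in: build a fresh random network.
      this.input_nodes = a;
      this.hidden_nodes = b;
      this.output_nodes = c;
      this.model = this.createModel();
    }
  }

  // Builds and returns a new model; the caller decides where to store it.
  createModel() {
    const model = tf.sequential();
    model.add(tf.layers.dense({
      units: this.hidden_nodes,
      inputShape: [this.input_nodes],
      activation: 'sigmoid',
    }));
    model.add(tf.layers.dense({
      units: this.output_nodes,
      activation: 'softmax',
    }));
    return model;
  }
}
```

Because createModel returns the model, the same method can fill `this.model` in the constructor or a local variable inside a copy function.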
have a mutate function. No problem, no problem, this
is the way to deal with that. Bird.js line 33, Bird.js line 33, let's just not bother at mutate. No worries, don't mutate. Let's speed this up. (upbeat electronic music) All right, let it run for a bit, we can sort of see, did it get better? Kind of, but there's only one, they're all making the
exact same decisions. So two things are wrong here. Number one is, without mutation
I can't get any variety. But there's something else
that I'm pretty sure is wrong, not a 100% sure, but I'm
pretty sure is wrong, so I want to protect myself against this, is I haven't really made a copy. So I think in the case of my
copy function, which is here, this is the weights as tensors. And I am assigning them to the copy. But they are all being
assigned the same tensors. Even though they are multiple models, they're all using the same weight table. So if I were to mutate them, they're all going to mutate
in exactly the same way. So I need to really make
sure I'm copying them. And there is a function
in tf.js called clone, which is to make a new
tensor that is a copy of an existing one. So I think what I should do is, I should say const
weightCopies is a new array, and then for let i equals zero, i is less than weights
index i, i plus plus No, sorry, weights.length, so I'm going to iterate
over all the weights, and I'm going to say
weightCopies index i equals weights index i dot clone. So this is kind of the idea,
like deep copying, I guess, I'm not sure if I'm 100%
accurate about that, but the idea of actually
copying them into a new tensor, so it's not the previous one, and then I should be able to set the weights to weightCopies. So this should guarantee that I really got it copied correctly. So let's do that. I'll run it again, speed
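The copy step can be sketched as a standalone function like this. It relies on the getWeights, setWeights, and clone calls mentioned above. The `createModel` parameter is a hypothetical factory that returns a fresh model with the same architecture.

```javascript
// `model` is the tf.Sequential to copy.
// `createModel` is a hypothetical factory returning a new model
// with the same architecture (and random weights).
function copyModel(model, createModel) {
  const modelCopy = createModel();
  const weights = model.getWeights();

  // Clone every weight tensor so the copy does not share tensors
  // with the original model.
  const weightCopies = [];
  for (let i = 0; i < weights.length; i++) {
    weightCopies[i] = weights[i].clone();
  }

  modelCopy.setWeights(weightCopies);
  return modelCopy;
}
```

For the 5-8-2 network, the loop runs four times: weights and biases for the hidden layer, then weights and biases for the output layer.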
it up a little bit. So it's still converging, kind of, into it ultimately
being just one of them, the one that happened to do the best. But it's not doing that because I haven't properly copied, it's doing that because
there's no mutation. And you can see actually, that it seems to be taking a
little longer to get there. So I think maybe that
copying is better now. Okay, so I do need that mutate function. So what do I need? Where was mutate? So I need a function called mutate, and it's getting argument 0.1, which is presumably the mutation rate. So the process of mutation
in a genetic algorithm is to say, look at the
genetic information, which is basically all of
these weights and biases, and one by one examine them, and at some probability
make some adjustment, make it random, change it, mutate it. So that's what I need to do. So it's more involved than just making a copy of the weights. I need to look at every
single weight individually and then mutate it 10% of the time. I'm not sure how to do
this, but let's see, okay. So if I go into neural network and I go into, I made a copy function, now let's look at, sorry, let's look at mutate. So I want to do the same thing. The first thing that I want to do, is I want to get the weights. What I call this, this.model.getWeights. Then I want to maybe have mutatedWeights, is a new, just an array. So I get the current weight tables and then, weight tensors, sorry, and then I'm going to go through
and make mutated weights. So let me once again loop over this array, and I'm going to say, tensor
equals weights index i, 'cause it's a tensor, and
then what I want to do, I want to get the values, so I'm going to use dataSync again. I want the shape, 'cause I'm going to need
that when I convert it back, I think I can just do this. So I need the tensor, which is the data, I'm going to need to retain the shape, because I need to put
it back into a tensor, but I want to mutate them. I don't think I can mutate the tensor without looking at the values. Who knows? So now I'm going to go
through and I'm going to say, let j equals zero, j is
less than values.length, j plus plus. So the actual weight which
I call w is values index j. And what I want to do is say values index j equals a new random weight. So I could do something like
this, right, just give it, and I don't even actually
need the current weight, I could just make a random one, right? I'm mutating, oh, but
only 10% of the time. So this gets a rate, so I could say, if random one is less than that rate, then make a new weight. I do think this is a case where there might be some benefit to rather than picking
a totally new weight, and I don't know what
initialization was used to make the initial random weights, that's probably somewhere
in TensorFlow.js, so there might be some
operation I can call here, but I'm just going to tweak it. So this is why I want to grab that weight, and I'm going to say, weight plus, and I'm going to use P5's
random Gaussian function which basically gives me a random number that's kind of near, with a mean of zero and a standard deviation of one, so it's just going to adjust
that weight a tiny bit. So it's like, kind of, what you might see with gradient
descent, dial up, dial down, but I'm just making a total guess, just dial it, some direction. So it's going to equal that, this loop is looking at
individual set of weights, it might be between inputs and hidden or hidden and outputs or
might be one of the biases, either of those weights. And I have it as a tensor. And then I have its values,
now I mutated those values. Aha, I need to make a new tensor, newTensor equals tf.tensor,
and give it those values again, right, the mutated values, with the shape. So this is what's nice. I don't have to worry about
what the original shape was, dataSync is just going
to flatten everything and give me all the numbers,
pretty sure, I think. And then, what I can do, is
then mutate those, it's flat, and I can put it back into
a tensor with the same shape and it should be back
just like it was before. I can say mutatedWeights index
i, is now that new tensor. So I grabbed the tensor, I got
its shape, I got the values, I mutated them, I made a new tensor, and it's going to go in my
new array, of mutated weights. And then when that's done, I can say, this.model.setWeights mutatedWeights. I think this is mutation. There is one mistake here,
which is that tensor, sorry, which is that dataSync is actually not
making a copy of the values. So I'm still going to be stuck with a lot of things pointing
to the same exact data, so if I mutate one and
it's copied for a bunch, all of those are going to
get mutated in the same way, and I want to have a bunch
of different mutations happening in parallel. I discovered this issue around
dataSync in this discussion, and thank you so much
to one of the creators of TensorFlow.js, Nikhil Thorat, who answered my question about this, so dataSync is mutating, I'm mutating the underlying
values in the tensor. Apparently there is an arraySync function that I could use instead of dataSync, that does actually copy, but a Coding Train viewer actually suggested that
I just call .slice, which is a JavaScript function
for copying the array. So this I really really need in there. Because I really have to make sure that I'm not mutating the
same underlying information. So I should add .slice here. And then I think I have this done, we're about to find out? The only way to find out, there
are other ways to find out, I'm sure, is to let this run for a while and sort of see if it gets anywhere. So I'll let that do
that for a little while. (upbeat electronic music) So I would say that this works. Because, there we go. I have an agent that learned basically, again this is a very simple
problem for it to solve, but this technique that I've now done, there's no reason why
it couldn't be applied to a more complex scenario. So I now have neuroevolution
with TensorFlow.js, but I'm not finished. I could do this in the next part, but there's a huge problem here. I haven't thought at all
about memory management. And if I do tf.memory in the console, ah, I have 128,870 tensors
and I'm using all these bytes, oops sorry, if I start from the beginning
I have that many tensors, and look how it's going up. So I'm leaking memory like crazy. I'm not cleaning up any of
the tensors that I'm using. And so the way to do that, I don't know why I'm coming over here, but it's very important. In addition to the getWeights
and setWeights functions, there are two functions that I really need to
be conscientious about when working with TensorFlow.js. One is tf.tidy, and this is a function that you can basically give it a callback, and any code that you put up in there, it will do the cleanup for you. So it's kind of like TensorFlow.js's garbage collection feature. It's like just put the
code in here and don't worry, we'll take care of disposing
memory that's not used. So this is probably one I
want to use most of the time, but any tensor I could also
individually call dispose on, when I'm not using it anymore. So let me go and find everywhere
in my neural network code that I'm using tensors, and make sure I'm cleaning up memory that's not being used anymore. So I'm in my code, scrolling,
scrolling, scrolling, this should be fine, I don't think I need to dispose models, maybe I do, but here, okay,
oh, look at this mutation. So I'm creating new
tensors, I'm doing stuff, this is definitely a place
where I want to say tf.tidy, and the nice thing is it's
literally as simple as this, I just put everything inside of tf.tidy. So maybe there's a slightly more optimal, better way of doing this,
but this should work. In predict, this one's a little easier, I could say like, I don't
need the xs anymore, so dispose of those. And now once I have the outputs,
I can dispose of the ys. So I can put manual dispose in here, like that's me disposing, but I probably will also just use tf.tidy, same exact thing. But there's a little something
else I need to do here. Because this code is returning something, I need to return the result of tf.tidy. So that should be fine. I feel like the model stuff is fine, this should be all the
cleanup that I need. But let's see. Oh, copy, oh, yeah. I guess let's do tf.tidy
in here also, tf.tidy, let's do this. I also need to say return,
since I'm returning a new model, I mean I could put tf.tidy in
this, let's see if I need to. Let's look at the memory now. So that's 1,000 tensors,
that makes sense, because I have 250 birds and there's four tensors,
four weight matrices, that's a thousand. So 3000, oh, I missed something. So every generation, I'm
keeping some of the old tensors that I don't need anymore. So let's try putting, I
suppose, tf.tidy here, and also return, and
let's see what happens. Ah wait, something undefined. What did I mess up? (bell dings) That didn't seem to work. But I have a better idea, I don't know why that didn't work, there's probably a very good reason, and I will hear from somebody about it, but I think I know what I need to do. Where am I copying? Here, but I have tf.tidy here. I was thinking that once I
make this new neural network I would call dispose
on the existing model. (bell dings) Okay, after examining this for a bit, with a little bit of help from the chat, it looks like the issue
that I have is that, it's a little bit weird,
what I'm doing here, I have a bunch of different arrays, I have birds and savedBirds. So savedBirds is kind of
like the previous generation, and I make all these new
birds for the next generation, put them in the birds array. But the models, the
savedBirds models still exist. So I need to dispose those. So the way I could do that is find the place where I'm
done making the next generation, and iterate over the savedBirds. The way I could do that is, I could also, in the bird object right now, I could add a function called dispose, little silly to do this, but why not, which just says this.brain.dispose and then in the neural network class, I now have a function called dispose, which I would say this.model.dispose. So this is sort of a manual
disposal of all of the memory that that particular model is using. And then somewhere in the sketch, when I call next generation, that makes the next
generation and then I go on, that's in my genetic
algorithm, which is here. So this is making the next generation, and then clearing the savedBirds. So before I clear the savedBirds, I think it's, I could
just use this total value, what I can do now is, I can say savedBirds index i dot dispose. So that should call the dispose function which calls the dispose function which calls the dispose function. So now, I'm looking at 1,000 tensors, and then let me speed this up, so it gets to the next generation faster. Wow, it just got like a good one. Next generation and now, still 1,000 tensors. (whistles) So this is really complete. I mean there's so much
more that I could do, but this is complete in terms of the idea of
neuroevolution with TensorFlow.js and this particular Flappy Bird game. All right, so what's coming next, what could you do, what could I do next? So one thing is, in my
previous coding challenge I saved the model to reload later, and TensorFlow.js has
functions for doing that. Save and load layers, models, I forgot what those functions are called, but they're in there. So I could come back and add that to this. The other thing would
be to apply this idea to different scenarios, different games, to different environments. One of the things that I would like to do is work with steering agents, whether they're finding
food or avoiding predators, that would be an interesting thing to try. So hopefully I'll come
back and do this same idea, but with a different environment that has maybe more complexity to it, with a less obvious result. Because I have a feeling that I could have gotten
a bird to solve this game with just some if statements. Like I could probably guess like, you should always jump,
if you're too high, you shouldn't jump if you're higher than where the pipe opening is, and if you're lower, you should jump. So I could write an if statement instead of using neural network. Thank you for watching. If you make your own version of this, if you have ideas for
how to make this better, please let me know in the comments and also over at thecodingtrain.com, you can submit a link
to your version of this running in the browser, and I'll take a look at those and hopefully share them in a
future video or a livestream. Thanks for watching and goodbye. (whistles) (upbeat electronic music)
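For reference, the mutation step described in this video can be sketched in plain JavaScript roughly as follows. This is a minimal sketch under stated assumptions, not the code from the video: it works on a flat typed array like the one dataSync returns, the helper names mutateValues and gaussian are hypothetical, and gaussian is a Box-Muller stand-in for p5's randomGaussian so the snippet runs without p5 or TensorFlow.js. In the actual sketch the values would come from a weight tensor and go back into a new tensor with the saved shape.

```javascript
// Hypothetical stand-in for p5's randomGaussian(): Box-Muller transform,
// giving a random number with mean 0 and standard deviation 1.
function gaussian(rng = Math.random) {
  const u = 1 - rng(); // shift into (0, 1] so Math.log never sees 0
  const v = rng();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Returns a mutated COPY of a flat weight array (hypothetical helper).
// Each weight is nudged by a Gaussian amount with probability `rate`.
function mutateValues(values, rate, rng = Math.random) {
  // Copy first with .slice(), so the original data is never touched.
  const mutated = values.slice();
  for (let j = 0; j < mutated.length; j++) {
    if (rng() < rate) {
      const w = mutated[j];
      mutated[j] = w + gaussian(rng); // tweak the weight, don't replace it
    }
  }
  return mutated;
}

// Example: a parent's flat weights and a child mutated 10% of the time.
const parentValues = new Float32Array([0.5, -0.25, 0.1, 0.9]);
const childValues = mutateValues(parentValues, 0.1);
console.log(Array.from(parentValues), Array.from(childValues));
```

The .slice() call is the important design choice here: without it, the loop would write into the same underlying data the parent is using, which is the shared-data problem described in the video.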