Keras with TensorFlow Course - Python Deep Learning and Neural Networks for Beginners Tutorial

Captions
Hey, I'm Mandy from deeplizard. In this course, we're going to learn how to use Keras, a neural network API written in Python and integrated with TensorFlow. Throughout the course, each lesson will focus on a specific deep learning concept and show the full implementation in code using the Keras API. We'll be starting with the absolute basics, learning how to organize and preprocess data, and then we'll move on to building and training our own artificial neural networks. Some of these networks will be built from scratch, and others will be pre-trained, state-of-the-art models that we'll fine-tune to use on our own custom data sets.

Now let's discuss the prerequisites needed to follow along with this course. From a knowledge standpoint, we'll give brief introductions of each deep learning concept that we're going to work with before we go through the code implementation. But if you're an absolute beginner to deep learning, then we first recommend you go through the Deep Learning Fundamentals course on deeplizard.com. Or, if you're super eager to jump into the code, you can take this course and the Deep Learning Fundamentals course simultaneously. The Deep Learning Fundamentals course will give you the knowledge you need to get acquainted with the major deep learning concepts, which you can then come back and implement in code using the Keras API in this course. In regard to coding prerequisites, just some basic programming skills and some Python experience are all that's needed. On deeplizard.com, you can also find the Deep Learning Learning Path, so you can see where this Keras course falls amidst all the deeplizard deep learning content.

Now let's discuss the course resources. Aside from the videos here on YouTube, you also have video and text resources available on deeplizard.com. Each episode has its own corresponding blog and a quiz available for you to take to test your knowledge, and you can contribute your own quiz questions as well; you can see how to do that on the corresponding blog for each episode. Additionally, all of the code resources used in this course are regularly tested and maintained, including updates and bug fixes when needed. Download access to the code files used in this course is available to members of the deeplizard hivemind, and you can find out more about that on deeplizard.com as well.

Alright, so now that we know what this course is about, what resources we have available, and the prerequisites needed to get started, let's talk a little more about Keras itself. Keras was developed with a focus on enabling fast user experimentation, which allows us to go from idea to implementation in very few steps. Aside from this benefit, users often wonder why they should choose Keras as the neural network API to learn, or which neural network API they should learn in general. Our general advice is not to commit yourself to learning only one and sticking with it forever; we recommend learning multiple neural network APIs. The idea is that once you have a fundamental understanding of the underlying concepts, the minor syntactical and implementation differences between neural network APIs shouldn't be hard to pick up once you have at least one under your belt, and that's especially true for job prospects.
Knowing more than one neural network API will show your experience and allow you to compare and contrast the differences between APIs and share your opinions on why certain APIs may be better suited to certain problems than others. Being able to demonstrate this will make you a much more valuable candidate.

Now, we previously touched on the fact that Keras is integrated with TensorFlow, so let's discuss that more now. Historically, Keras was a high-level neural network API that you could configure to run against one of three separate lower-level APIs: TensorFlow, Theano, and CNTK. Later, though, Keras became fully integrated with the TensorFlow API and is no longer a separate library that you choose to run against one of those three backend engines. So it's important to understand that Keras is now completely integrated with the TensorFlow API, but in this course we're going to focus solely on the high-level Keras API without necessarily making much use of the lower-level TensorFlow API.

Now, before we can start working with Keras, we first have to get it downloaded and installed onto our machines. Because Keras is fully integrated with TensorFlow, we can do that by just installing TensorFlow; Keras comes completely packaged with the TensorFlow installation. So the installation procedure is as simple as running pip install tensorflow from your command line. You might just want to check the system requirements on TensorFlow's website to make sure that your specific system meets the requirements needed for TensorFlow to install.

Alright, so we have one last talking point before we can get into the actual meat of the coding, and that is GPU support. The first important thing to note is that a GPU is not required to follow this course. If you're running your machine on a CPU only, that is totally fine for what we'll be doing in the course. If, however, you do want to run your code on a GPU, you can do so pretty easily once you get through the setup process for getting your GPU to work with TensorFlow. We have a full guide on how to set up a GPU to work with TensorFlow on deeplizard.com, so if you're interested in doing that, head over there to go through those steps. But actually, I recommend just going through the course with a CPU if you're not already set up with a GPU; like I said, all the code will run completely fine using only a CPU. Then, after you go through the course successfully, take the steps needed to work with the GPU if you have one, run all the code you have in place from earlier on the GPU the second time around, and see the kind of efficiency and speedups you get.
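As a quick sanity check after installing, something like this (a minimal sketch, not part of the course code itself) will confirm the TensorFlow version and show whether a GPU is visible:

```python
# Minimal sketch: verify the TensorFlow installation (Keras ships inside the
# TensorFlow package) and check whether any GPU is visible. An empty list
# just means you're running on CPU, which is fine for this course.
import tensorflow as tf

print(tf.__version__)
print(tf.config.list_physical_devices('GPU'))
```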
Alright, so that's it for the Keras introduction; now we're finally ready to jump into the code. Be sure to check out the blog and other resources available for this episode on deeplizard.com, as well as the deeplizard hivemind, where you can gain access to exclusive perks and rewards. Thanks for contributing to collective intelligence. Now, let's move on to the next episode.

Hey, I'm Mandy from deeplizard. In this episode, we'll learn how to prepare and process numerical data that we'll later use to train our very first artificial neural network.

To train any neural network in a supervised learning task, we first need a data set of samples along with the corresponding labels for those samples. When referring to the word samples, we're simply talking about the underlying data set, where each data point in the set is referred to as a sample. If we were to train a model to do sentiment analysis on headlines from a media source, for example, then the labels that correspond to each headline sample would be positive or negative. Or say we were training an artificial neural network to identify images as cats or dogs; then each image would have a corresponding label of cat or dog. Note that in deep learning, you may also hear samples referred to as inputs or input data, and labels referred to as targets or target data.

When preparing a data set, we first need to understand the task for which the data will be used. In our example, we'll be using our data set to train an artificial neural network. Once we understand this, we can understand which format the data needs to be in for us to pass it to the network. The first type of neural network that we'll be working with is called a Sequential model, from the Keras API. We'll discuss more details about the Sequential model in a future episode, but for now we just need to understand what data format the Sequential model expects, so that we can prepare our data set accordingly.

The Sequential model receives data during training whenever we call the fit function on it. Again, we'll go into more detail about this function in the future, but for now let's check out what data format the fit function expects. If we look at the fit documentation on TensorFlow's website, we can see that the first two parameters the fit function expects are x and y. x is our input data, our samples in other words, and this function expects x to be a NumPy array, a TensorFlow tensor, a dict mapping, a tf.data dataset, or a Keras generator. If you're not familiar with all of these data types, that's okay, because for our first example we're going to organize our data as a NumPy array, the first option in the documentation. That covers our input samples. The y parameter expected by the fit function is the data set that contains the corresponding labels for our samples, the target data. The requirement for y is that it is formatted as one of the formats we just discussed for x, and that y is in the same format as x. So we can't have our samples contained in a NumPy array, for example, and then have y, the target data or labels for those samples, in a TensorFlow tensor. The formats of x and y need to match, and we're going to be putting both of them into NumPy arrays.

Alright, so now we know the data format that the model expects, but there's another reason we may want to transform or process our data, and that is to put it in a format that will make it easier or more efficient for the model to learn from. We can do that with data normalization or standardization techniques. Data processing in deep learning will vary greatly depending on the type of data we're working with. To start out, we're going to work with a very simple numerical data set to train our model, and later we'll get exposure to working with different types of data as well.
Alright, so now we're ready to prepare and process our first data set. We're in our Jupyter Notebook, and the first step is to import all the packages we'll be making use of, including NumPy, random, and some modules from scikit-learn. Next, we create two lists, one called train_samples and one called train_labels, and these lists will hold the corresponding samples and labels for our data set.

Now, about this data set: we're going to be working with very simple numerical data, and for this task we're actually going to create the data ourselves. Later, we'll work with more practical and realistic examples where we won't be creating the data but instead downloading it from some external source, but for now we're going to create the data on which we'll train our first artificial neural network. As motivation for this dummy data, we have a background story to give us an idea of what it's all about. Let's suppose that an experimental drug was tested on individuals ranging from age 13 to 100 in a clinical trial. The trial had 2100 participants total; half of the participants were under the age of 65, and half were 65 years or older. The conclusion from the trial was that around 95% of the patients in the older population, those 65 or older, experienced side effects, and around 95% of patients under 65 years old experienced no side effects.

Okay, so that's a very simplistic data set, and that's the background story. Now, in this cell, we go through the process of actually creating that data set. The first for loop generates both the approximately 5% of younger individuals who did experience side effects and the 5% of older individuals who did not experience side effects. Within this first for loop, we first generate a random integer between 13 and 64, representing a younger individual who is under 65 years of age, and we append that number to the train_samples list; then we append a one to the train_labels list. A one represents the fact that a patient did experience side effects, and a zero represents a patient who did not experience side effects. Then, on the next line, we generate a random integer between 65 and 100 to represent the older population. Remember, this is in our first for loop, so it's only running 50 times; this is the outlier group, the 5% of older individuals who did not experience side effects. We take that sample, append it to the train_samples list, and append a zero to the corresponding train_labels list, since these were the older patients who did not experience side effects.

Then, in the next for loop, we have pretty much the same code, except this is the bulk of the group. This for loop runs for the 95% of younger individuals who did not experience side effects, as well as the 95% of older individuals who did experience side effects. So we generate a random integer between 13 and 64 and append that number, representing the age of a member of the younger population, to the train_samples list, and append a label of zero to the train_labels list, since these individuals did not experience side effects. Similarly, we do the same thing for the older individuals from 65 to 100, except since the majority of these individuals did experience side effects, we append a one to the train_labels list.

Just to summarize: we have the train_samples list containing a bunch of integers ranging from 13 to 100, representing ages, and the train_labels list containing the labels that correspond to each of these individuals, indicating whether or not they experienced side effects. So, a samples list containing ages, and a labels list containing zeros and ones representing side effects or no side effects.
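Here's a minimal sketch of the generation loops just described. The 50 iterations for the outlier loop come from the walkthrough above; the 1000 iterations in the second loop are what the 2100-participant breakdown implies.

```python
# A sketch of the data-generation loops for the imagined clinical trial.
from random import randint

train_samples = []
train_labels = []

# The ~5% "outlier" groups: younger patients who did experience side
# effects and older patients who did not.
for i in range(50):
    train_samples.append(randint(13, 64))   # younger individual
    train_labels.append(1)                  # 1 = experienced side effects

    train_samples.append(randint(65, 100))  # older individual
    train_labels.append(0)                  # 0 = no side effects

# The ~95% majority groups: younger patients with no side effects and
# older patients who did experience side effects.
for i in range(1000):
    train_samples.append(randint(13, 64))
    train_labels.append(0)

    train_samples.append(randint(65, 100))
    train_labels.append(1)
```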
And just to get a visualization of the samples, here we print out everything in the train_samples list, and we see that these are all integers ranging from 13 to 100, as we'd expect. Correspondingly, if we print out everything in the train_labels list, we can see that it contains a bunch of zeros and ones.

Alright, so the next step is to take these lists and start processing them. We have our data generated; now we need to process it into the format that we saw the fit function expects, and we discussed the fact that we're going to be passing this data to the fit function as NumPy arrays. So our next step is to do that transformation here, where we take the train_labels list and make it a NumPy array, and similarly do the same thing with the train_samples list. Then we use the shuffle function to shuffle the train_labels and train_samples with respect to each other, so that we can get rid of any order imposed by the data generation process.

Okay, so now the data is in the NumPy array format that the fit function expects. But as mentioned earlier, there's another reason we might want to do further processing on the data, and that's to normalize or standardize it so that training the neural network might become quicker or more efficient. That's what we're doing in this cell. We use a MinMaxScaler object with a feature range from zero to one, which we then use on the next line to rescale our data from its current scale of 13 to 100 down to a scale of zero to one. The reshaping we're doing here is just a formality, because the fit_transform function doesn't accept 1D data by default; since our data is one-dimensional, we have to reshape it in this way to be able to pass it to fit_transform. Now, if we print out the elements in our new scaled_train_samples variable, which is what we're calling our scaled samples, we can see that the individual elements are no longer integers ranging from 13 to 100; instead, we have values ranging anywhere between zero and one.
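A sketch of those processing steps, assuming the train_samples and train_labels lists built above:

```python
# Convert the lists to NumPy arrays, shuffle samples and labels together,
# then rescale the ages from the 13-100 range down to the 0-1 range.
import numpy as np
from sklearn.utils import shuffle
from sklearn.preprocessing import MinMaxScaler

train_labels = np.array(train_labels)
train_samples = np.array(train_samples)
train_labels, train_samples = shuffle(train_labels, train_samples)

scaler = MinMaxScaler(feature_range=(0, 1))
# fit_transform expects 2D data, so reshape the 1D array of ages first.
scaled_train_samples = scaler.fit_transform(train_samples.reshape(-1, 1))
```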
So at this point, we have generated some raw data, processed it into the NumPy array format that our model will expect, and rescaled the data to be on a scale between zero and one. In an upcoming episode, we'll use this data to train our first artificial neural network. Be sure to check out the blog and other resources available for this episode on deeplizard.com, as well as the deeplizard hivemind, where you can gain access to exclusive perks and rewards. Thanks for contributing to collective intelligence. Now, let's move on to the next episode.

Hey, I'm Mandy from deeplizard. In this episode, we'll demonstrate how to create an artificial neural network using a Sequential model from the Keras API integrated within TensorFlow. In the last episode, we generated data from an imagined clinical trial, and now we'll create an artificial neural network that we can train on this data.

Alright, so first things first, we need to import all of the TensorFlow modules that we'll be making use of to build our first model. That includes everything you see here except for the last two, Adam and the categorical cross entropy loss; those two will be used when we train the model, not when we build it, but we're going ahead and bringing in all the imports now. Next, if you're running this code on a GPU, you can run this cell, which will make sure that TensorFlow correctly identifies your GPU and will enable memory growth. There are a few lines on the blog where you can check out what exactly that means and why you might want to do it; if you are running a GPU, go ahead and run this cell.

Next, this is the actual model that we're building: a Sequential model, which is the simplest type of model you can build using Keras and TensorFlow. A Sequential model can be described as a linear stack of layers, and if you look at how we're creating the model here, that's exactly what it looks like. We initialize the model as an instance of the Sequential class, and we pass in a list of layers. Now, it's important to note that the first Dense layer we're looking at is actually the second layer overall; it's the first hidden layer. That's because we don't explicitly define the input layer with Keras: the input data itself is what creates the input layer. The way the model knows the shape of input data to expect is through the input_shape parameter that we pass to our first Dense layer. Through this, the model understands the shape of the input data it should expect, accepts input data of that shape, and passes it to the first hidden layer, which is this Dense layer in our case.

We're telling this Dense layer that we want it to have 16 units; these units are also known as nodes or neurons. The choice of 16 here is actually pretty arbitrary. This model overall is very simple, and given the simplicity of the data itself, it would actually be pretty hard to create a model, at least a simple one, that wouldn't do a good job at classifying this data. So we're specifying 16 units for this first hidden layer, we're specifying the input_shape so that the model knows the shape of input data to expect, and we're stating that we want the relu activation function to follow this Dense layer. The input_shape parameter is only specified for the first hidden layer. After that, we have one more hidden Dense layer; this time we arbitrarily set the number of units to 32, and again we follow the layer with the relu activation function. Lastly, we specify our final layer, which is our output layer. This is another Dense layer, this time with only two units, corresponding to the two possible output classes.
Either a patient did experience side effects, or the patient did not experience side effects. We follow this output layer with the softmax activation function, which gives us a probability for each output class; so between whether a patient experienced side effects or not, we'll have an output probability for each class, letting us know which class is more probable for any given patient. And just in case it's not clear, the Dense layer used here is what we call a densely connected, or fully connected, layer, probably the most well-known type of layer in artificial neural networks. In case you need a refresher on fully connected layers, activation functions, or anything else we've discussed up to this point, just know that all of that is covered in the Deep Learning Fundamentals course, so you can go there to refresh your memory on any of these topics.

Alright, so now we'll run this cell to create our model, and then we can use model.summary() to print out a visual summary of the architecture of the model we just created. Looking at the output, we see a visual representation of the architecture we defined in the cell above. So now we've just finished creating our very first neural network using this simple and intuitive Sequential model type. In the next episode, we'll see how we can use the data that we created last time to train this network. Be sure to check out the blog and other resources available for this episode on deeplizard.com, as well as the deeplizard hivemind, where you can gain access to exclusive perks and rewards. Thanks for contributing to collective intelligence. Now, let's move on to the next episode.

Hey, I'm Mandy from deeplizard. In this episode, we'll see how to train an artificial neural network using the Keras API integrated with TensorFlow. In previous episodes, we went through the steps to generate data and to build an artificial neural network, so now we'll bring these two together to actually train the network on the data that we created and processed.

Alright, so picking up where we were last time in our Jupyter Notebook, make sure that you still have all of your imports included and already run, so that we can continue where we left off. First, after building our model, we call the compile function on it, which prepares the model for training; it gets everything in order that's needed before we can actually train the model. We first specify to the compile function which optimizer we want to use, and we're choosing the very common Adam optimizer with a learning rate of 0.0001. Next, we specify the type of loss we want to use, which in our case is sparse categorical cross entropy. Lastly, we specify the metrics we want to see; this is for judging model performance, and we're specifying a list that just includes accuracy, which is a very common way to evaluate model performance. So if we run this cell, the model is compiled and ready for training.
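Here's a minimal sketch of the model definition and compile step just described, assuming the standard tensorflow.keras imports used on the blog:

```python
# Sequential model: a linear stack of Dense layers, plus the compile step.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

model = Sequential([
    Dense(units=16, input_shape=(1,), activation='relu'),  # first hidden layer; input_shape matches our 1D samples
    Dense(units=32, activation='relu'),                     # second hidden layer
    Dense(units=2, activation='softmax'),                   # output layer: one unit per class
])
model.summary()

model.compile(optimizer=Adam(learning_rate=0.0001),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```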
Training occurs whenever we call the fit function. Recall that earlier in the course we looked at the documentation for the fit function so that we knew how to process our input data. The first parameter we specify to fit is x, our input data, which is currently stored in the scaled_train_samples variable; then y, our target data, with our labels currently stored in the train_labels variable. Next, we specify the batch size we want to use for training, which is how many samples are included in one batch to be passed and processed by the network at one time; we're setting this to 10. Then the number of epochs we want to run, which we're setting to 30; that means the model is going to process, or train on, all of the data in the data set 30 times before completing the total training process.

Next, we specify the shuffle parameter, which we're setting to true. By default this is already set to true, but I'm bringing it to your attention to make you aware that the data is being shuffled by default when we pass it to the network. That's a good thing, because we want any order inside the data set to be erased before we pass the data to the model, so that the model isn't learning anything about the order of the data set. Since it's true by default, we don't necessarily have to specify it; I'm just letting you know. We'll actually see in the next episode why this matters with regard to validation data. The last parameter we specify is verbose, which is just an option that lets us see output when we run the fit function. We can set it to 0, 1, or 2; 2 is the most verbose level in terms of output messages, so we're setting that here to get the highest level of output. Now let's run this cell so that training can begin.

Alright, so training has just finished, and we ran for 30 epochs. If we look at the progress of the model, at the start of the first epoch the loss value is 0.68 and the accuracy is 50%, so no better than chance. But pretty quickly, looking at the accuracy, we can tell that it is steadily increasing, all the way until we get to our last epoch, where we're yielding 94% accuracy, and the loss has also steadily decreased from the 0.65 range down to 0.27. As you can see, this model trained very quickly, with each epoch taking under one second to run, and within 30 epochs we're already at 94% accuracy. Although this is a very simple model and we were training it on very simple data, we can see that without much effort at all, we were able to get pretty great results in a relatively short amount of time. In subsequent episodes, we'll demonstrate how to work with more complex models as well as more complex data, but for now, hopefully this example served the purpose of showing you how easy it is to get started with Keras. Be sure to check out the blog and other resources available for this episode on deeplizard.com, as well as the deeplizard hivemind, where you can gain access to exclusive perks and rewards. Thanks for contributing to collective intelligence. Now, let's move on to the next episode.

Hey, I'm Mandy from deeplizard. In this episode, we'll demonstrate how we can use TensorFlow's Keras API to create a validation set on the fly during training.
Before we demonstrate how to build a validation set using Keras, let's first talk about what exactly a validation set is. Whenever we train a model, our hope is that we see good results in the training output: low loss and high accuracy. But we don't ever train a model just for the sake of training it; we want to take that model and be able to use it in some way on data that it wasn't exposed to during the training process. Although this new data is data the model has never seen before, the hope is that the model will be able to generalize well on it and give accurate predictions. We can get an understanding of how well our model is generalizing by introducing a validation set during the training process.

To create a validation set, before training begins we can choose to take a subset of the training set and separate it into a separate set labeled as validation data. Then, during the training process, the model will train only on the training data and will validate on the separated validation data. So what do we mean by validating? Essentially, if we have a validation set, then during training the model will be learning the features of the training set, just as we've already seen; but in addition, in each epoch, after the model has gone through the actual training process, it will take what it's learned from the training data and validate by predicting on the data in the validation set, using only what it's learned from the training data. So during training, when we look at the output of the accuracy and loss, not only will we see the accuracy and loss computed for the training set, we'll also see them computed on the validation set.

It's important to understand, though, that the model is only learning from, or training on, the training data; it's not taking the validation set into account during training. The validation set is just for us to see how well the model is able to predict on data that it was not exposed to during training. In other words, it allows us to see how general our model is, how well it's able to generalize to data not included in the training data. Knowing this will allow us to see whether our model is running into the famous overfitting problem. Overfitting occurs when the model has learned the specific features of the training set really well but is unable to generalize to data it hasn't seen before. So if, while training, we see that the model is giving really good results for the training set but less-than-good results for the validation set, we can conclude that we have an overfitting problem and then take the steps necessary to combat that issue. If you'd like to see the overfitting problem covered in more detail, there's an episode for that in the Deep Learning Fundamentals course.

Alright, so now let's discuss how we can create and use a validation set with a Keras Sequential model. There are actually two ways to create and work with validation sets with a Sequential model. The first way is to have a completely separate validation set from the training set and to pass that validation set to the model in the fit function; there's a validation_data parameter, and we can just set it equal to the structure that holds our validation data.
There's a write-up in the corresponding blog for this episode that contains more details about the format that data needs to be in, but we're going to focus only on the second way of creating and using a validation set. This approach actually saves us a step, because we don't have to explicitly go through the creation of the validation set; instead, we can get Keras to create it for us.

Alright, so we're back in our Jupyter Notebook, right where we left off last time, on the model.fit function. Recall that this is what we used last time to train our model. I've already edited this cell to include a new parameter, validation_split, and validation_split does what it sounds like: it splits out a portion of the training set into a validation set. We set it to a number between zero and one, a fractional number that tells Keras how much of the training set to split out into the validation set; here I'm splitting out 10% of the training set. It's important to note that whenever we do this, the validation set is completely held out of the training set: the training samples we remove from the training set into the validation set are no longer contained within the training data. Using this approach, the validation set is created on the fly whenever we call the fit function.

Now, there's one other thing worth mentioning here. Remember last time I discussed the shuffle=True parameter, and I said that by default the training set is shuffled whenever we call fit. Shuffle=True is already set by default, and that's a good thing; we want the training set to be shuffled. But whenever we use validation_split in this way, the split occurs before the training set is shuffled. That means if we created our training set and, say, put all of the sick patients first and the non-sick patients second, and then we say we want to split off the last 10% of the training data as our validation data, it's going to take the last 10% of the training data, and therefore it could take only samples from the second group we put in the training set and not get any of the first group. I wanted to mention this because, although the training data is shuffled by the fit function, if you haven't already shuffled your training data before you pass it to fit and you also use the validation_split parameter, it's important to know that your validation set will be the last X percent of your training set. It may therefore not be shuffled, and it may yield some strange results, because you think everything has been shuffled when really only the training set is shuffled, after the validation set has been taken out. So keep that in mind. With the way we created our training set before this episode, we actually shuffled the training data before it was ever passed to the fit function. In the future, whenever you're working with data, it's a good idea to make sure your data is shuffled beforehand, especially if you're going to make use of the validation_split parameter to create a validation set.

Alright, so now we'll run this cell one more time, calling the fit function, but this time, not only will we see loss and accuracy metrics for the training set, we'll also see these metrics for the validation set.
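A sketch of the fit call with validation_split added, assuming the model and arrays from the previous episodes:

```python
# Train with 10% of the training data split out as a validation set.
# The training data was already shuffled before this call, so the split
# taken from the end of the arrays is a representative sample.
model.fit(x=scaled_train_samples,
          y=train_labels,
          validation_split=0.1,
          batch_size=10,
          epochs=30,
          shuffle=True,  # True by default; applied after the validation split is taken
          verbose=2)
```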
Alright, so the model has just finished running its 30 epochs, and now we see both the loss and accuracy on the left-hand side, and the validation loss and validation accuracy on the right-hand side. Let's just look at the accuracy between the two: they both start at around the same 50% mark and go up gradually at around the same rate. If we scroll all the way to our last epoch, we can see that the accuracy and validation accuracy are pretty similar, with only a 1% difference between the two, and the loss values are similar as well. So we can see in this example that our model is not overfitting; it's performing just as well on the validation set as it is on the training set, so our model is generalizing well. If, however, we saw the opposite, and our validation accuracy was seriously lagging behind our training accuracy, then we'd know we have an overfitting problem and would need to take steps to address that issue.

Alright, so we've now seen how to train the model, how to validate the model, and how to make use of both training and validation sets. In the next episode, we're going to see how to make use of a third data set, the test set, to use the model for inference. Be sure to check out the blog and other resources available for this episode on deeplizard.com, as well as the deeplizard hivemind, where you can gain access to exclusive perks and rewards. Thanks for contributing to collective intelligence. Now, let's move on to the next episode.

Hey, I'm Mandy from deeplizard. In this episode, we'll see how we can use a neural network for inference, to predict on data from a test set, using TensorFlow's Keras API. As we touched on previously, whenever we train a model, the hope is that we can then take that model and use it on new data that it has not seen during training, and that the model is able to generalize well and give us good results on this new data. As a simple example, suppose we trained a network to identify images of cats and dogs, and during the training process we had a training set that, say, we downloaded from a website, with thousands of images of cats and dogs. The hope is that later, if we wanted to, we could build a web app, for example, where people from all over the world could submit their dog and cat photos and have our model tell them, with high accuracy, whether their animal is a cat or a dog. I don't know why anyone would actually make that web app, but you get the point. Even though the images being sent in from people around the world are of their own cats and dogs, which weren't included in the training set the model was originally trained on, hopefully the model is able to generalize well enough, from what it's learned about dog and cat features, to predict that Mandy's dog is actually a dog and not a cat, for example. We call this process inference: the model takes what it learned during training and uses that knowledge to infer things about data it hasn't seen before. In practice, we might hold out a subset of our training data to put in a set called the test set.
Typically, after the model has been trained and validated, we take the model and use it for inference against the test set, as one additional step beyond validation, to make sure the model is generalizing well before we deploy it to production. At this point, the model we've been working with over the last few episodes has been trained and validated, and given the metrics we saw during the validation process, we have a good idea that the model is probably going to do a pretty good job at inference on the test set as well. To conclude that, though, we first need to create a test set, so we'll do that now, and after we create the test set, we'll use the model for inference on it.

Alright, so we're back in our Jupyter Notebook, and now we'll go through the process of creating the test set. Actually, if you just glance at this code, you can see the whole process: setting up the samples and labels lists, generating the data from the imagined clinical trial that we discussed in a previous episode, putting that generated data into NumPy array format, shuffling the data, and then scaling the data to be on a scale from zero to one rather than 13 to 100. It's exactly the same process, using almost exactly the same code, except we're working with test_labels and test_samples variables rather than train_labels and train_samples. So we're not going to go line by line through this code; if you need a refresher, go check out the earlier episode where we did the exact same thing for the training set. The important thing to take away, though, is that the test set should be prepared and processed in the same format as the training data. So we'll just go ahead and run these cells to create and process the test data.

And now we're going to use our model to predict on the test data. To obtain predictions, we call predict on the model that we created in the last couple of episodes; so we call model.predict, and we first pass in the parameter x, which we set equal to scaled_test_samples. That's what we created just above, where we scaled our test samples to be on a scale from zero to one, and it's the data we want our model to predict on. Then we specify the batch size, which we set to 10, the same batch size we used for our training data when we trained the model. The last parameter we specify is verbose, which we set to zero, because during prediction there isn't any output from this function that we care about seeing or that's going to be of any use to us at the moment, so we set it to zero to get no output.

Alright, so if we run this, our model predicts on all of the data in our test set, and if we want a visualization of what each of these predictions looks like for each sample, we can print them out here. Looking at these predictions, the way we can interpret them is that for each sample in our test set, we get one probability that maps to the patient not experiencing a side effect and one probability that maps to the patient experiencing a side effect.
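Here's a sketch of the test-set preparation and predict call just described. The exact group sizes aren't stated here, so the counts below are an assumption chosen to give the 420 test samples referenced in the next episode.

```python
# Sketch of test-set creation and inference (group sizes are an assumption:
# 2*10 outliers + 2*200 majority = 420 test samples).
from random import randint
import numpy as np
from sklearn.utils import shuffle
from sklearn.preprocessing import MinMaxScaler

test_samples = []
test_labels = []

for i in range(10):
    test_samples.append(randint(13, 64))
    test_labels.append(1)
    test_samples.append(randint(65, 100))
    test_labels.append(0)

for i in range(200):
    test_samples.append(randint(13, 64))
    test_labels.append(0)
    test_samples.append(randint(65, 100))
    test_labels.append(1)

test_labels = np.array(test_labels)
test_samples = np.array(test_samples)
test_labels, test_samples = shuffle(test_labels, test_samples)

# Same 0-1 scaling approach used for the training data.
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_test_samples = scaler.fit_transform(test_samples.reshape(-1, 1))

# Predict on the test set; the model was trained in the earlier episodes.
predictions = model.predict(x=scaled_test_samples, batch_size=10, verbose=0)
for p in predictions:
    print(p)  # [probability of no side effects, probability of side effects]
```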
So for the first sample in our test set, this prediction says that the model is assigning a 92% probability to this patient not experiencing a side effect, and around an 8% probability to the patient experiencing a side effect. Recall that we said no side effect experienced was labeled as a zero, and a side effect experienced was labeled as a one. That's how we know the first probability maps to not having a side effect, because it's at the zeroth index, and the second probability maps to having a side effect, because it's at the first index. If we're interested in seeing only the most probable prediction for each sample in the test set, we can run this cell, which takes the predictions and gets the index of the prediction with the highest probability. If we print that out, these are a little easier to interpret than the previous output: the prediction for the first sample is a zero, and for the second sample it's a one. Just to confirm, if we go back up, we can see that the first sample indeed has the higher probability at the label of zero, meaning no side effects, and the second sample has a higher probability at one, meaning that the patient did experience a side effect.

From these prediction results, we're able to see the underlying predictions, but we're not able to make much sense of them in terms of how well the model did at these predictions, because we didn't supply the labels to the model during inference the way we do during training. This is the nature of inference; a lot of times, inference occurs once the model has been deployed to production, so we don't necessarily have correct labels for the data the model is inferring on. If we do have corresponding labels for our test set, though, which in our case we do, because we're the ones who generated the test data, then we can visualize the prediction results by plotting them to a confusion matrix. That will give us an overall idea of how accurate our model was at inference on the test data, and we'll see exactly how that's done in the next episode. Be sure to check out the blog and other resources available for this episode on deeplizard.com, as well as the deeplizard hivemind, where you can gain access to exclusive perks and rewards. Thanks for contributing to collective intelligence. Now, let's move on to the next episode.

Hey, I'm Mandy from deeplizard. In this episode, we'll demonstrate how to use a confusion matrix to visualize prediction results from a neural network during inference. In the last episode, we showed how we could use our trained model for inference on data contained in a test set. Although we have the labels for this test set, we don't pass them to the model during inference, so we don't get any type of accuracy reading for how well the model does on the test set. Using a confusion matrix, we can visually observe how well a model predicts on test data. Let's jump right into the code to see exactly how this is done. We'll be using scikit-learn to create our confusion matrix, so the first thing we need to do is import the necessary packages. Next, we create our confusion matrix by calling the confusion_matrix function from scikit-learn, and we pass in our test labels as the true labels.
We pass in our predictions as the predictions that the confusion_matrix function expects, and recall that the rounded_predictions variable, as well as the test labels, were created in the last episode. rounded_predictions, recall, is what we got when we used the argmax function to select only the most probable prediction for each sample, so our predictions are now in that format, and our labels are the zeros and ones that correspond to whether or not a patient had side effects.

Next, we have this plot_confusion_matrix function, which is copied directly from scikit-learn's website; there's a link to the site on the corresponding blog where you can copy this exact function. It's just a function that scikit-learn has provided to easily plot the confusion matrix in our notebook, which is the actual visual output we want to see, so we run this cell to define the function. Then we create a list with the labels that we want to use on our confusion matrix: 'no side effects' and 'had side effects', the labels corresponding to our test data. Then we call the plot_confusion_matrix function that we just brought in and defined above, and to it we pass our confusion matrix; the classes for the confusion matrix, which we specify as cm_plot_labels, defined just above; and lastly a title to display above the confusion matrix. If we run this, we get the confusion matrix plot.

Alright, so we have our predicted labels on the x-axis and our true labels on the y-axis. The way we can read this is, for example, that our model predicted that a patient had no side effects 10 times when the patient actually had a side effect; those are incorrect predictions. On the flip side, the model predicted that a patient had no side effects 196 times when the patient indeed had no side effects; those are correct predictions. Generally, reading a confusion matrix, the squares going across the diagonal from the top left to the bottom right, shown here in blue, are the correct predictions. So we can see that, in total, the model made 200 plus 196, so 396, correct predictions; all the numbers added up equal 420, and 396 out of 420 predictions were correct. That gives us about a 94% accuracy rate on our test set, which is equivalent to what we were seeing for our validation accuracy during training. So as you can see, a confusion matrix is a great tool for visualizing how well our model is doing at its predictions, and for drilling in a little further to see which classes it might need some work on.
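Here's a minimal sketch of that code, assuming the predictions and test_labels from the last episode; the plot_confusion_matrix helper copied from scikit-learn's website isn't reproduced here, so this sketch just builds the matrix and shows where the plotting call would go.

```python
# Build the confusion matrix from the test labels and the most probable
# predicted class for each test sample.
import numpy as np
from sklearn.metrics import confusion_matrix

rounded_predictions = np.argmax(predictions, axis=-1)  # 0 = no side effects, 1 = had side effects

cm = confusion_matrix(y_true=test_labels, y_pred=rounded_predictions)
print(cm)

cm_plot_labels = ['no side effects', 'had side effects']
# plot_confusion_matrix(cm=cm, classes=cm_plot_labels, title='Confusion Matrix')  # helper copied from scikit-learn's site
```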
Be sure to check out the blog and other resources available for this episode on deeplizard.com, as well as the deeplizard hivemind, where you can gain access to exclusive perks and rewards. Thanks for contributing to collective intelligence. Now, let's move on to the next episode.

Hey, I'm Mandy from deeplizard. In this episode, we'll demonstrate the multiple ways that we can save and load a Keras Sequential model. We have a few different options when it comes to saving and loading a Keras Sequential model, and they all work a little differently from one another, so we're going to go through all of the options. We've been working with a single model over the last few episodes, and we're going to continue working with that one. I've printed out a summary of that model to refresh your memory about which one we've been working with, but just make sure that in your Jupyter Notebook you already have that model created, because we're now going to show how we can save it.

Alright, so the first way to save a model is just by calling the save function on it. To save, we pass in the path where we want to save our model, along with the model name, or file name, we want to save it under, with the .h5 extension; this .h5 file is where the model will be stored. The code here is just a condition I'm checking to see whether the model has already been saved to disk: if it hasn't, then save it, because I don't want to keep saving the model over and over again on my machine if it's already been saved. So that's what this condition is about, but the model.save function is the first way we can save the model.

When we save this way, it saves the architecture of the model, allowing us to recreate it with the same number of learnable parameters, layers, nodes, and so on. It also saves the weights of the model, so if the model has already been trained, the weights it has learned and optimized are in place within the saved model on disk. It also saves the training configuration, things like the loss and the optimizer that we set when we compiled the model, and the state of the optimizer is saved as well. That means if we're training the model and then stop and save the model to disk, we can later load that model again and pick up training where we left off, because the state of the optimizer will be in that saved state. So this is the most comprehensive option when it comes to saving the model, because it saves everything: the architecture, the learnable parameters, and the state the model was in when it left off training.

If we want to later load a model that we previously saved to disk, we first need to import the load_model function from tensorflow.keras.models, then create a variable, in this case called new_model, and set it equal to load_model, pointing to where our saved model is on disk. If we run that and then look at a summary of our new model, we can see that it is indeed an exact replica, in terms of architecture, of the original model we had previously saved to disk. We can also look at the weights of the new model; we didn't look at the weights ahead of time to compare them directly, but this shows that you can inspect the weights and see that they are the same as the previous model's weights, if you had taken a look at those beforehand. We can also look at the optimizer, just to show that, although we never explicitly set an optimizer for our new model, because we're loading it from the saved model it does indeed use the Adam optimizer that we set a while back when we compiled our model for training. Alright, so that's it for the first saving and loading option, and again, that is the most comprehensive option, saving and loading everything about a particular model.
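A sketch of this save-and-load option; the 'models/medical_trial_model.h5' path below is just a placeholder file name.

```python
# Save everything about the model: architecture, weights, training
# configuration, and optimizer state. The path is a placeholder.
import os
from tensorflow.keras.models import load_model

if os.path.isfile('models/medical_trial_model.h5') is False:
    model.save('models/medical_trial_model.h5')

# Later, rebuild the exact same model from disk and inspect it.
new_model = load_model('models/medical_trial_model.h5')
new_model.summary()
new_model.get_weights()
new_model.optimizer
```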
The next option we'll look at is a function called to_json. We call model.to_json() if we only need to save the architecture of the model: if we don't want to save its weights or the training configuration, we can save just the model architecture as a JSON string. In this example, we create a variable called json_string and set it equal to model.to_json(); remember, model is the original model we've been working with up to this point. Now, if we print out this JSON string, we can see we get a string of details about the model architecture: it's a Sequential model, and it has the layers organized, with the individual Dense layers and all the details about those specific layers from our original model. If at a later point we want to create a new model with our old model's architecture, and we've saved that architecture to a JSON string, we can import the model_from_json function from tensorflow.keras.models. Here we create a new variable called model_architecture and load in the JSON string using model_from_json. Now we have this new model, which I'm just calling model_architecture, and if we look at its summary, we can again see that it's identical to the summary of the original model. So we have a new model in place now, but only the architecture is in place: you would have to train it to update its weights, and we would need to compile it to get an optimizer, a loss, and everything like that defined. This only creates the model from an architecture standpoint.

Before moving on to our third option, I just wanted to briefly mention that we can go through this same exact process using a YAML string instead of a JSON string. The function to create a YAML string is just model.to_yaml() instead of to_json(), and the function to load a YAML string is model_from_yaml instead of model_from_json.

Alright, so our next option is to save just the weights of the model. If you only need to save the weights, and you don't need to save the architecture or any of the training configuration, like the optimizer or loss, then we can save solely the model's weights using the save_weights function. To do that, we call model.save_weights, and this looks exactly the same as when we called model.save: we're just passing a path on disk where we want to save the weights, along with a file name ending with the .h5 extension; I'm calling this one my_model_weights.h5. Again, we have a condition here where I'm checking whether this .h5 file has already been saved to disk; otherwise I'm not going to keep saving it over and over again. Now, the thing with this is that when we save only the weights, if we want to load them at a later time, we don't have a model already in place, because we didn't save the model itself, only the weights. So to be able to bring our weights into a new model, we then need to create a second model with the same architecture, and then we can load the weights. That's what we're doing in this cell: I'm defining a model called model2, and it is the exact same model, from an architecture standpoint, as the first model. If we run this, then at that point we have the option to load weights into this model, and the shape of these weights is going to have to match the shape of this model's architecture.
We couldn't have a model with five layers defined here, for example, and load these weights in, because there wouldn't be a direct mapping of where these particular weights should be loaded; that's why we use the same exact architecture here as our original model. To load our weights, we call load_weights and point to the place on disk where our weights are saved, and then we can call get_weights on our new model and see that it has been populated with the weights from our original model. Alright, so now you know all the ways we can save various aspects of a Keras Sequential model. Be sure to check out the blog and other resources available for this episode on deeplizard.com, as well as the deeplizard hivemind, where you can gain access to exclusive perks and rewards. Thanks for contributing to collective intelligence. Now, let's move on to the next episode.

Hey, I'm Mandy from deeplizard. In this episode, we'll go through all the necessary image preparation and processing steps needed to train our first convolutional neural network. Our goal over the next few episodes will be to build and train a convolutional neural network that can classify images as cats or dogs. The first thing we need to do is get and prepare our data set, on which we'll be training our model. We're going to work with the data set from the Kaggle Dogs vs. Cats competition, and you can find a link to download the data set in the corresponding blog for this episode on deeplizard.com. We're mostly going to organize our data programmatically, but there are a couple of manual steps that we'll go through first.

After you've downloaded the data set from Kaggle, you'll have this zip folder here, and if we look inside, these are the contents: a zipped train folder and a zipped test folder, along with the sample submission CSV. We're actually not going to be working with the test zip or the CSV file, so you can delete both of those; we're only going to be working with the train zip. We want to extract the top-level dogs-vs-cats zip first, and then extract the train zip. Because that takes a while, I went ahead and did it here, so now I have just this train directory, because I moved the test directory elsewhere and deleted the CSV file. So now I have this extracted train folder, and if I go in here, I have a nested train folder; as you can see, I'm already in train once, and now I'm in train again, and in here we have all of these images of cats and dogs.

Okay, so that is the way the data comes downloaded. The next step is to come in here, grab all of these images, cut them (Ctrl+X), and bring them up to the first train directory; we don't want this nested directory structure, so instead we place them directly within the first train directory. Alright, now all of our images have been moved into our base train directory, so the nested train directory that the images previously belonged to is now empty, and we can just go ahead and delete it. So now our directory structure is just the dogs-vs-cats top directory, within which we have a train directory, and then we have all of the images, both cats and dogs, within that train directory. The last step is to move the dogs-vs-cats directory that has all the data in it to the place on disk where you're going to be working.
For me, relative to my Jupyter Notebook, I'm working within a directory called data, and I've placed dogs-vs-cats there. That's it for the manual labor; everything else we do to organize and later process the data will be done programmatically through code. So now we're in our Jupyter Notebook, and first things first, we need to import all of the packages we'll be making use of. These aren't just for this specific episode on processing the image data; they're all of the packages we'll use over the next several episodes as we work with CNNs. After the imports, the next cell just makes sure that if we're using a GPU, TensorFlow is able to identify it correctly, and it enables memory growth on the GPU as well. If you're not using a GPU, no worries; as mentioned earlier, you're completely fine to follow this course with a CPU only. Now we pick back up with organizing our data on disk. Assuming we've gone through the first steps from the beginning of this episode, we're now going to organize the data into train, valid, and test directories, which correspond to our training, validation, and test sets. This script first changes directory into the dogs-vs-cats directory, then checks whether the directory structure we're about to make already exists. If it doesn't, it proceeds with the rest of the script. The first thing it does is make the following directories: we already have a train directory, so it makes nested dog and cat directories within train, then a valid directory containing dog and cat directories, and a test directory that also contains dog and cat directories. Now, this data set contains 25,000 images of dogs and cats, which is overkill for the tasks we'll be using it for in the upcoming episodes. We're only going to use a small subset. You're free to work with all of the data if you'd like, but it would take a lot longer to train our networks and work with the data in general. We'll work with a subset of 1,000 images in the training set, 200 in the validation set, and 100 in the test set, with each set split evenly between cats and dogs. That's exactly what the next block of code does: it goes into the images in the dogs-vs-cats directory and randomly moves 500 cat images into train/cat and 500 dog images into train/dog, then does the same for the validation set and the test set, for both cats and dogs, with the quantities matching the amounts stated above. We're able to tell which images are cats and which are dogs based on the file names: the cat images have the word cat in the file name and the dog images have the word dog, and that's how the script selects them. A rough sketch of that script follows below.
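Here is a rough sketch of that organization script, under the assumption that all of the images sit directly inside the train directory after the manual steps and that the Kaggle file names look like cat.0.jpg and dog.0.jpg; the exact paths and glob patterns are my own guesses, not the verbatim course file.

```python
import os, random, shutil, glob

os.chdir('data/dogs-vs-cats')
if os.path.isdir('train/dog') is False:
    # create the class folders for each split
    os.makedirs('train/dog')
    os.makedirs('train/cat')
    os.makedirs('valid/dog')
    os.makedirs('valid/cat')
    os.makedirs('test/dog')
    os.makedirs('test/cat')

    # randomly move a subset of images into each split, using the
    # cat/dog substrings in the file names to identify the class
    for c in random.sample(glob.glob('train/cat.*'), 500):
        shutil.move(c, 'train/cat')
    for c in random.sample(glob.glob('train/dog.*'), 500):
        shutil.move(c, 'train/dog')
    for c in random.sample(glob.glob('train/cat.*'), 100):
        shutil.move(c, 'valid/cat')
    for c in random.sample(glob.glob('train/dog.*'), 100):
        shutil.move(c, 'valid/dog')
    for c in random.sample(glob.glob('train/cat.*'), 50):
        shutil.move(c, 'test/cat')
    for c in random.sample(glob.glob('train/dog.*'), 50):
        shutil.move(c, 'test/dog')

os.chdir('../../')   # back to the notebook's working directory
```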
Alright, so after the script runs, we can pull up the file explorer and check that the directory structure is what we expect. We have the dogs-vs-cats directory within our data directory, and inside it are test, train, and valid directories. Inside test, the cat folder has cat images and the dog folder has dog images, and if we back out and go into train and then valid, we see the same kind of layout. You can also select one of the folders and look at its properties to see how many files it contains, to confirm it matches the amount the script was supposed to move and that we didn't make any kind of error. If we go back to the root dogs-vs-cats directory, we can see all of the leftover cat and dog images; these are the remaining 23,000 or so that were left over after we moved our subset into the train, valid, and test directories. You're free to use these however you want, delete them, or move them somewhere else; everything we'll be working with is in those three directories. So at this point, we have obtained the data and organized it, and now it's time to process it. Scrolling down, we first create variables for our train, valid, and test paths, which just point to the locations on disk where the different data sets reside. Recall from earlier in the course that whenever we train a model, we need to put the data into a format the model expects, and when we train a Keras Sequential model, the model receives the data when we call the fit function. So we're going to put our images into the format of a Keras generator, which is what the next cell does. We create train, valid, and test batches and set them equal to ImageDataGenerator().flow_from_directory(), which returns a DirectoryIterator. Basically, it creates batches of data from the directories where our data sets reside, and those batches can then be passed to the Sequential model via the fit function. Let's focus on train_batches for now. To ImageDataGenerator, we specify a preprocessing_function and set it equal to tf.keras.applications.vgg16.preprocess_input. For now, just know that this function applies some preprocessing to the images before they get passed to the network, processing them in the same way that images passed to the very popular VGG16 model are processed. We'll talk more about this in a future episode, so don't let it confuse you or stress over the technical details right now. Besides that, the call to flow_from_directory is where we pass in the actual data and specify how we want it processed.
We set directory equal to train_path, which is where we defined the location on disk of our training set. Then we set target_size to (224, 224), which is the height and width the cat and dog images will be resized to. So if you're working with a data set of images of varying sizes, or you just want to scale images up or down, this is how you specify that; every image will be resized to this height and width before being passed to the network. Next we specify classes, which are the potential labels for the data set, cat or dog, and we set batch_size to 10. We do the exact same thing for the validation set and the test set; everything is the same except for where each set lives on disk, which is specified by the directory parameter. The only other difference is that for test_batches we specify shuffle=False. That's because when we later use the test batches for inference, to have the model predict on cat and dog images after training and validation are complete, we'll want to look at the prediction results in a confusion matrix, like we did in a previous video for a separate data set. To do that, we need access to the unshuffled labels for the test set, so we set shuffle=False for this set only; for the training and validation sets, we do want the data to be shuffled. When we run this, we get the output 'Found 1000 images belonging to 2 classes' for the train batches, 'Found 200 images belonging to 2 classes' for the valid batches, and 'Found 100 images belonging to 2 classes' for the test batches. That's the output you want to see, letting you know it found the images on disk belonging to the cat and dog classes you specified. If you instead get 'Found 0 images', then perhaps you're pointing to the wrong place on disk; you just need to make sure it can find all the images you set up previously. The next cell simply verifies that this is indeed the case.
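Here is a sketch of how that generator setup looks in code; `train_path`, `valid_path`, and `test_path` are the path variables defined just above.

```python
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_batches = ImageDataGenerator(
    preprocessing_function=tf.keras.applications.vgg16.preprocess_input) \
    .flow_from_directory(directory=train_path, target_size=(224, 224),
                         classes=['cat', 'dog'], batch_size=10)

valid_batches = ImageDataGenerator(
    preprocessing_function=tf.keras.applications.vgg16.preprocess_input) \
    .flow_from_directory(directory=valid_path, target_size=(224, 224),
                         classes=['cat', 'dog'], batch_size=10)

# shuffle=False only for the test set, so the labels stay in a known order
# for the confusion matrix later
test_batches = ImageDataGenerator(
    preprocessing_function=tf.keras.applications.vgg16.preprocess_input) \
    .flow_from_directory(directory=test_path, target_size=(224, 224),
                         classes=['cat', 'dog'], batch_size=10, shuffle=False)
```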
Next, we grab a single batch of images and the corresponding labels from the train batches. Remember, our batch size is 10, so this should be 10 images along with their 10 corresponding labels. Then we introduce a function called plotImages that we'll use to plot the images we just obtained. This function comes directly from TensorFlow's website; check the link in the corresponding blog for this episode on deeplizard.com to see exactly where I pulled it from. After defining that function, we use it to plot the images from our batch and print the corresponding labels. Scrolling down, this is what a batch of training data looks like, which might be a little different from what you expected, since the color data looks distorted. That's due to the preprocessing function we called, which processes the images in the same way images get preprocessed for the famous VGG16 model. Like I said, we'll discuss exactly what that preprocessing does, and why we're using it, in a later video; for now, just know that it's skewing the RGB data in some way. We can still generally make out what the images are, a cat here, a dog there, but the color data is skewed. This is just what the data looks like before we pass it to the model. And here are the corresponding labels, which are one-hot encoded vectors representing either cat or dog: [1, 0] represents a cat, and [0, 1] represents a dog. So I was actually wrong a moment ago in thinking one of these was a dog; it's a cat, because it maps to the [1, 0] encoding. If you don't know what I mean by one-hot encoding, check out the corresponding video in the Deep Learning Fundamentals course on deeplizard.com. We can see that [0, 1] is the vector used to represent a dog, so this one is a dog, and the next two are dogs as well. One quick note about everything we've discussed up to this point: sometimes we don't have the corresponding labels for our test set. In the examples so far in this course, we've always had them, but in practice you often may not. In fact, if we had used the downloaded test directory that came from Kaggle, we'd see that those test images are not labeled as cat or dog. In our case we do have the test labels, since we pulled our test images from the original Kaggle training set, which did have them. But if you don't have access to test labels and are wondering how to process your test data, check the blog for this episode on deeplizard.com; there's a section there that demonstrates what to do differently from what we showed in this video.
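As an aside, one common way to set up a generator for an unlabeled test set is sketched below. This is the general idea only, not necessarily the exact approach shown on the blog: the unlabeled images are assumed to live in a single, arbitrarily named subdirectory such as test_path/unknown/, since flow_from_directory reads from subdirectories of the directory you point it at.

```python
# class_mode=None makes the generator yield only batches of images, no labels
unlabeled_test_batches = ImageDataGenerator(
    preprocessing_function=tf.keras.applications.vgg16.preprocess_input) \
    .flow_from_directory(directory=test_path, target_size=(224, 224),
                         class_mode=None, batch_size=10, shuffle=False)
```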
So now we have obtained our image data, organized it on disk, and processed it accordingly for our convolutional neural network. In the next episode, we'll get set up to start building and training our first CNN. Be sure to check out the blog and other resources available for this episode on deeplizard.com, as well as the deep lizard hive mind, where you can gain access to exclusive perks and rewards. Thanks for contributing to collective intelligence. Now let's move on to the next episode. Hey, I'm Mandy from deeplizard. This episode will demonstrate how to build a convolutional neural network and then train it on images of cats and dogs using TensorFlow's integrated Keras API. We'll be continuing to work with the cat and dog image data we created in the last episode, so make sure you still have all of that in place, as well as the imports we brought in last time, since we'll be making use of them over the next several CNN episodes. To create our first CNN, we'll use the Keras Sequential model, which we introduced in an earlier episode when we were working with plain numerical data. The first layer we pass to this model is a Conv2D layer, a standard convolutional layer that accepts image data. For this layer we're arbitrarily setting filters to 32, so this first convolutional layer will have 32 filters, with a kernel_size of (3, 3). The choice of 32 is pretty arbitrary, but a 3x3 kernel is a very common choice for image data. This first Conv2D layer is followed by the popular relu activation function, and we specify padding='same', which means the images get zero-padded around the outside so their dimensions aren't reduced by the convolution operations. Lastly, for the first layer only, we specify the input shape of the data. Recall we touched on this parameter previously: you can think of it as implicitly creating an input layer for the model. This Conv2D layer is actually the first hidden layer; the input layer is made up of the input data itself, so we just need to tell the model the shape of that data, which in our case is 224 by 224. Recall we set the target_size parameter to (224, 224) when we created our train, valid, and test directory iterators, meaning we want our images to be of that height and width. The 3 is for the color channels; since these images are in RGB format, we have three channels, so the input shape is (224, 224, 3). We follow the first convolutional layer with a max pooling layer, with pool_size set to (2, 2) and strides of 2. If you're familiar with max pooling, you know this will cut the image dimensions in half. If you need to know more about max pooling, zero padding, activation functions, or anything like that, be sure to check out those episodes in the Deep Learning Fundamentals course on deeplizard.com. After this max pooling layer, we add another convolutional layer that looks pretty much exactly like the first, except we don't include the input_shape parameter, since that's only specified on the first hidden layer, and we set filters to 64 instead of 32. Again, 64 is an arbitrary choice, but the general pattern of increasing the number of filters in later layers of the network is common practice. We follow this second convolutional layer with another max pooling layer identical to the first, then flatten everything into a one-dimensional tensor before passing it to our Dense output layer, which has just two nodes corresponding to cat and dog. The output layer is followed by the softmax activation function, which, as you know, gives us probabilities for each output from the model.
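Putting that together, the model looks roughly like this:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense

model = Sequential([
    # first hidden layer: 32 filters, 3x3 kernel, zero padding, implicit input layer
    Conv2D(filters=32, kernel_size=(3, 3), activation='relu',
           padding='same', input_shape=(224, 224, 3)),
    MaxPool2D(pool_size=(2, 2), strides=2),   # halves the spatial dimensions
    Conv2D(filters=64, kernel_size=(3, 3), activation='relu', padding='same'),
    MaxPool2D(pool_size=(2, 2), strides=2),
    Flatten(),
    Dense(units=2, activation='softmax'),     # cat vs. dog output probabilities
])
```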
If we run this, we can then check out the summary of the model, and it shows exactly what we built, along with additional details about the learnable parameters and the output shape at each layer. These are also things covered in the Deep Learning Fundamentals course if you want more information about learnable parameters and such. Now that the model is built, we prepare it for training by calling model.compile, which we've already used in a previous episode when training on numerical data. It looks pretty much the same as before: we set the optimizer to the Adam optimizer with a learning rate of 0.0001, we use categorical cross entropy as the loss, and we use accuracy as our metric for judging model performance. A quick note before moving on: we're using categorical cross entropy, but since we have just two outputs, cat or dog, we could instead use binary cross entropy. If we did that, we'd need a single output node instead of two, and rather than following the output layer with softmax, we'd follow it with sigmoid. Both approaches, categorical cross entropy with the setup we have here, or binary cross entropy with the setup I just described, work equally well; they're equivalent and yield the same results. Categorical cross entropy with softmax on the output layer is the common approach whenever you have more than two classes, so I like to stick with the same setup even when there are only two outputs, just because it's general and it's what we'd use with more classes anyway. After compiling the model, we train it using model.fit, which we should be very familiar with by now. To the fit function, we first specify our training data, stored in train_batches, and then our validation data, stored in valid_batches. Recall this is a different way of supplying validation data than validation_split, which we discussed in an earlier episode; we're not using validation_split here because we created a separate validation set ourselves before fitting the model, so we pass that set to the validation_data parameter. Then we set epochs to 10, so we'll only train for 10 epochs this time, and verbose to 2 so we see the most detailed output during training. One other thing to mention: you'll see that we specify x, just as we have in the past, but not y, which is usually our target data. That's because when the data is stored in a generator, as it is here, the generator itself contains the corresponding labels, so we don't need to specify them separately when we call fit.
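In code, the compile and fit calls described above look like this:

```python
from tensorflow.keras.optimizers import Adam

model.compile(optimizer=Adam(learning_rate=0.0001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# the generators contain the labels, so no separate y argument is needed
model.fit(x=train_batches, validation_data=valid_batches,
          epochs=10, verbose=2)
```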
So let's go ahead and run this now. The model has just finished training, so let's check out the results. Before we do, if you see a particular warning here, from the research I've done it appears to be a bug within TensorFlow that's supposed to be fixed in the next release, so you can safely ignore it; it has no impact on our training. If we scroll down and look at the results, we can see that by the 10th epoch, the accuracy on the training set has reached 100%, which is great, but the validation accuracy is sitting at 69%, which is not so great. We definitely have some overfitting going on here. If this were a model we really cared about and wanted to deploy to production, then at this point we'd need to stop and combat the overfitting problem before going further. Instead, what we'll see in an upcoming episode is how to use a pre-trained model to perform really well on this data, which will expose us to the concept of fine-tuning. Before that, though, in the next episode we'll see how this model holds up at inference, predicting on images in our test set. Be sure to check out the blog and other resources available for this episode on deeplizard.com, as well as the deep lizard hive mind, where you can gain access to exclusive perks and rewards. Thanks for contributing to collective intelligence. Now let's move on to the next episode. Hey, I'm Mandy from deeplizard. This episode will demonstrate how to use a convolutional neural network for inference to predict on image data using TensorFlow's integrated Keras API. Last time, we built and trained our first CNN on cat and dog image data, and we saw that the training results were great, with the model achieving 100% accuracy on the training set. However, it lagged quite a bit behind at only about 70% accuracy on the validation set, which tells us the model wasn't generalizing as well as we'd hoped. Nonetheless, we're now going to use the model for inference to predict on the cat and dog images in our test set. Given the less-than-decent validation performance, our expectation is that the model won't do so well on the test set either; it will probably perform at around that same 70% rate. But this will still give us exposure to how we can use a CNN for inference with the Keras Sequential API. So we're back in the Jupyter Notebook, and we need to make sure all the code from the last couple of episodes is in place, since we'll be using both the model we built last time and the test data we prepared. The first thing we do is get a batch of test data from test_batches, plot the images in that batch, and print their corresponding labels, using the plotImages function introduced in the last couple of episodes. Scrolling down, we have our test batch. Recall the discussion last time about why the image data looks the way it does, with the color skewed, but even with the distorted color we can see that these are actually all cats, and their corresponding labels are all the one-hot encoded vector [1, 0], which we know is the label for cat.
If you're wondering why the first 10 images in our first batch are all cats, that's because, recall, when we created the test set we specified that we did not want it shuffled. That was so we could do the following: if we come to the next cell and run test_batches.classes, we get an array containing the corresponding labels for every image in the test set. Given that we have access to the unshuffled labels for the test set, that's why we don't shuffle it: we want a direct one-to-one mapping from the unshuffled labels to the test samples. If the test data were shuffled every time we generated a batch, we wouldn't have the correct mapping between labels and samples. We care about having the correct mapping because later, after we get predictions from the model, we're going to plot them in a confusion matrix, and for that we need the labels that actually correspond to the samples in the test set. Next, we obtain our predictions by calling model.predict, just as we have in earlier episodes for other data sets. For x we pass test_batches, our whole test set, and we set verbose to 0 so we get no output while running the predictions. Then we print out the rounded predictions from the model. The way to read this is that each of these arrays is the prediction for a single sample. Looking at the first one, the prediction for the first sample in the test set, the position of the 1 within the array is the index of the output class that had the highest probability from the model. In this case the zeroth index had the highest probability, so the model predicted a label of 0 for this first sample. If we look at the first label up above for our first test sample, it is indeed 0, so we can eyeball that the model predicted accurately, and it continues to do so for the first several samples. Whenever we instead see that the model predicted the first index as the highest probability, that means it predicted an output label of 1, which corresponds to dog. It's hard to draw an overall conclusion about prediction accuracy for the test set just by eyeballing results like this, but we know we have the tool of a confusion matrix to make visualizing these results much easier, as we've seen in previous episodes. So that's what we'll do now: we create a confusion matrix using the confusion_matrix function from scikit-learn, which we've already been introduced to. We pass in the true labels using test_batches.classes, which we just touched on a few minutes ago, and for the predicted labels we use argmax to pass in the index of the most probable prediction for each sample in our predictions list; we've already covered in previous episodes why we do that.
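As a quick sketch, those inference and confusion-matrix steps look like this in code:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

predictions = model.predict(x=test_batches, verbose=0)
print(np.round(predictions))   # each row is a probability distribution over [cat, dog]

# argmax gives the predicted class index for each test sample,
# compared against the unshuffled true labels from the generator
cm = confusion_matrix(y_true=test_batches.classes,
                      y_pred=np.argmax(predictions, axis=-1))
```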
After running that, we bring in the plot_confusion_matrix function, which, as we've discussed, comes directly from scikit-learn's website; a link is in the corresponding blog for this episode on deeplizard.com. This just lets us plot the confusion matrix in a moment. Looking at test_batches.class_indices, we see that cat comes first and dog comes second, so we know in which order to put our plot labels for the confusion matrix. Next we call plot_confusion_matrix and pass in the confusion matrix itself, the labels, and a title for the plot. From what we've learned about interpreting a confusion matrix, we can look at the diagonal running from top left to bottom right to see what the model predicted correctly, and the results are not that great; the model is definitely overfitting at this point. As I said, if this were a model we were really concerned about, we'd want to combat that overfitting problem. But for now, we're going to move on to a pre-trained, state-of-the-art model called VGG16 in the next episode, so we can see how well that model does at classifying images of cats and dogs. Be sure to check out the blog and other resources available for this episode on deeplizard.com, as well as the deep lizard hive mind, where you can gain access to exclusive perks and rewards. Thanks for contributing to collective intelligence. Now let's move on to the next episode. Hey, I'm Mandy from deeplizard. This episode will demonstrate how we can fine-tune a pre-trained model to classify images using TensorFlow's Keras API. The pre-trained model we'll be working with is called VGG16, and it's the model that won the 2014 ImageNet competition. In the ImageNet competition, multiple teams compete to build a model that best classifies images from the ImageNet library, which is made up of thousands of images belonging to 1,000 different classes. Using Keras, we'll import this VGG16 model and then fine-tune it to classify not the 1,000 categories it was originally trained on, but only two categories: cat and dog. Note, however, that cats and dogs were included in the original ImageNet library VGG16 was trained on, and because of that, we won't have to do much tuning to change the model from classifying 1,000 classes to just these two, so the overall fine-tuning here will be very minimal. In later episodes, we'll do more involved fine-tuning and use transfer learning to transfer what a model learned on its original data set to a completely new, custom data set. To understand fine-tuning and transfer learning at a fundamental level, check out the corresponding episode in the Deep Learning Fundamentals course on deeplizard.com. Before we actually start building the model, let's quickly talk about the VGG16 preprocessing. Back in our Jupyter Notebook, this plot is from last time, when we plotted a batch from our test set, and a few episodes ago we discussed the fact that the color data was skewed as a result of the VGG16 preprocessing function we were calling. Now we'll discuss what exactly that preprocessing does.
As we can see, we can still make out the images, these being all cats, but the color data itself appears distorted. If we look at the paper in which the VGG16 authors describe the model, under the architecture section they state that the only preprocessing they do is subtracting the mean RGB value, computed on the training set, from each pixel. What does that mean? It means they computed the mean red pixel value across all of the training data, and then, for every pixel in every image in the training set, subtracted that mean from the pixel's red value. They did the same thing for the green and blue pixel values: they found the mean green pixel value over the entire training set and subtracted it from every green pixel, and likewise for blue. That's the preprocessing they used. Given that that's how VGG16 was originally trained, any new data passed to the model needs to be processed in the same exact way as the original training set. Keras has functions built in for popular models like VGG16 with the matching preprocessing already in place, and that's why we called that function when we processed our cat and dog images earlier: so our images would be processed in the same way the original training set was processed when VGG16 was trained. That's what the color distortion is all about. Now that we understand the preprocessing, let's jump back into the code and get back to building the fine-tuned model. The first thing we need to do is download the model. When you call this for the first time, you'll need an internet connection, because it downloads the model from the internet; after that, subsequent calls will just load the downloaded copy from your machine. Once the model is downloaded, we run its summary, and we can see that this model is much more complex than anything we've worked with up to this point. In total there are almost 140 million parameters, and on disk it's over 500 megabytes, so it's quite large. Recall I said that VGG16 originally predicted 1,000 different ImageNet classes, and here we can see that the output layer of the VGG16 model has 1,000 outputs. Our objective is simply to change that last output layer to predict only two output classes, cat and dog, along with a couple of other details regarding the fine-tuning process that we'll get to in a moment. We're going to skip the next two cells; they're just there for me to check that the trainable and non-trainable parameter counts are what I expect after importing the model, and they're not relevant to the code here.
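Downloading and inspecting the model looks like this; the weights are fetched from the internet the first time and cached locally afterwards.

```python
import tensorflow as tf

vgg16_model = tf.keras.applications.vgg16.VGG16()
vgg16_model.summary()   # roughly 138 million parameters, 1000-class output layer
```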
Scrolling down, we're now going to build a new model. Before running this code, let's look at the type of the model we just downloaded: it's of type Model, which means it's a model from the Keras Functional API. We've been working with Sequential models so far, and we'll discuss the Functional API in depth in a later episode; it's a bit more sophisticated than the Sequential model. Since we're not ready to bring in the functional model yet, we're going to convert the original VGG16 model into a Sequential model. We do that by creating a new variable called model, setting it equal to an instance of Sequential, and then looping through every layer in VGG16 except the last output layer, adding each one into our new Sequential model. Looking at a summary of the new model, if you take the time to compare it with the previous summary, you'll notice they're exactly the same except that the last layer has not been included. The last layer here is the fully connected fc2 layer with an output shape of 4096, and if we scroll back up, we can see that the predictions layer from VGG16 is not included, because our for loop only went up to the second-to-last layer. Now let's scroll back down. Next, we iterate over all of the layers in our new Sequential model and set each one to not be trainable, by setting layer.trainable to False. This freezes the trainable parameters, the weights and biases, of all the layers in the model so they won't be updated when we train on cats and dogs. VGG16 already learned the features of cats and dogs during its original training, so we don't want it to go through that training again. With the weights frozen, we now add our own output layer to the model. Remember, we removed the previous output layer that had 1,000 output classes, and now we add our own output layer with only two output classes, for cat and dog. Since we've set all the previous layers to not be trainable, this last output layer will be the only trainable layer in the entire model, and as I said, that's because VGG16 has already learned the relevant features; we only need to train this output layer to classify two classes. Looking at a new summary of the model, everything is the same except we now have this new Dense layer as the output layer, with two outputs instead of the 1,000 from the original VGG16 model. We can also see that the model now has only about 8,000 trainable parameters, all of which are in the output layer, since that's our only trainable layer; before, all of the layers were trainable.
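Those fine-tuning steps look like this in code:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
for layer in vgg16_model.layers[:-1]:   # copy every layer except the 1000-class output layer
    model.add(layer)

for layer in model.layers:              # freeze the copied weights and biases
    layer.trainable = False

model.add(Dense(units=2, activation='softmax'))   # new cat/dog output layer, the only trainable one
model.summary()
```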
If we go back and look at the original VGG16 model, it has 138 million total parameters, all of which are trainable and none of which are non-trainable. So if we hadn't frozen those layers, they would all have been retrained during training on our cat and dog images. Scrolling back down to our modified model, we still have a large number of parameters, about 134 million in total, but only around 8,000 of them are trainable; the rest are non-trainable. In the next episode, we'll see how to train this modified model on our images of cats and dogs. Be sure to check out the blog and other resources available for this episode on deeplizard.com, as well as the deep lizard hive mind, where you can gain access to exclusive perks and rewards. Thanks for contributing to collective intelligence. Now let's move on to the next episode. Hey, I'm Mandy from deeplizard. In this episode, we'll demonstrate how to train the fine-tuned VGG16 model that we built last time on our own data set of cats and dogs. We're jumping straight into the code, but be sure you already have the code in place from last time, since we'll be building on it. First we compile the model to get it ready for training. This model is the Sequential model we built last time, our fine-tuned VGG16 model containing all the same layers with frozen weights, except for the last layer, which we modified to have only two possible outputs. We compile it using the Adam optimizer, as we have previously, with a learning rate of 0.0001; for the loss we again use categorical cross entropy, just like before, and we use accuracy as our only metric to judge model performance. There's nothing new with this call to compile; it's pretty much exactly the same as what we've seen for our previous models. Then we train the model using model.fit, passing in our training set, stored in train_batches, and our validation set, stored in valid_batches. We're only going to run this model for five epochs, and we set the verbosity level to 2 so we see the most comprehensive output during training. So let's see what happens. Training has just finished, and we have some pretty outstanding results. After just five epochs on our training and validation data, we have 99% accuracy on the training set and a validation accuracy right on par at 98%. Even the very first epoch gives us a training accuracy of 85% and a validation accuracy of 93%. This isn't totally surprising, because remember, earlier in the course we discussed how VGG16 had already been trained on images of cats and dogs from the ImageNet library, so it had already learned those features. The slight training we're doing on the output layer is just teaching the model to output only the cat or dog class, so it's really not surprising that it does such a good job right off the bat in its first epoch, and even considerably better by its fifth epoch at 99% training accuracy.
Now recall the previous CNN that we built from scratch ourselves, the really simple convolutional neural network. That model actually did really well on the training data too, reaching 100% accuracy after a small number of epochs, but where it lagged was validation accuracy, which sat around 70%. Here we're at 98%. So the main recognizable difference between our very simple CNN and this fine-tuned VGG16 model is how well this model generalizes to the cat and dog data in the validation set, whereas the model we built from scratch did not generalize as well to data that wasn't included in the training set. In the next episode, we'll use this VGG16 model for inference to predict on the cat and dog images in our test set, and given the accuracy we're seeing on the validation set, we should expect some really good results on the test set as well. Be sure to check out the blog and other resources available for this episode on deeplizard.com, as well as the deep lizard hive mind, where you can gain access to exclusive perks and rewards. Thanks for contributing to collective intelligence. Now let's move on to the next episode. Hey, I'm Mandy from deeplizard. In this episode, we'll use our fine-tuned VGG16 model for inference to predict on images in our test set. We're jumping right back into the Jupyter Notebook, again making sure all the code from the previous episodes is in place and has been run, since we'll be building on it. First, we use the model to get predictions for the test set. To do that, we call model.predict, which we've been exposed to in the past; the model here is our fine-tuned VGG16 model. For the prediction data we pass in our test set, stored in test_batches, and we set the verbosity level to 0 since we don't want any output while predicting. Recall that previously we talked about how we did not shuffle the test set for our cat and dog image data, and that's because we want to be able to access test_batches.classes in its unshuffled order, so that the labels have a one-to-one mapping with the unshuffled test samples when we pass them to our confusion matrix. This is the same story as when we used the CNN we built from scratch a few episodes back; we went through the same process, and we're using the same data set. So now we plot the confusion matrix in the exact same manner as before; this is actually the exact same line we used a few episodes ago when we plotted the confusion matrix on this same test set for the CNN we built from scratch. And just as a reminder from last time, recall that we looked at the class_indices of the test batches so we could get the correct order of the cat and dog classes to use as labels for the confusion matrix; we do the same thing here. Then we call the scikit-learn plot_confusion_matrix function, which you should already have defined in your notebook from a previous episode.
To plot_confusion_matrix we pass in our confusion matrix, the labels we just defined, and a general title for the plot. Now let's check out the plot to see how well the model did on these predictions. This is the third or fourth time we've used a confusion matrix in this course, so you should be pretty accustomed to reading this data by now. Recall, the quick and easy way is to look along the diagonal from top left to bottom right to get an overview of how well the model did. The model correctly predicted dog 49 times for images that were truly dogs, and correctly predicted cat 47 times for images that were truly cats. One time it predicted cat when the image was actually a dog, and three times it predicted dog when the images were actually cats. So overall, the model got four samples wrong, which gives us 96 correct predictions out of 100 total, for a test set accuracy of 96%. That's not surprising given the high validation accuracy we saw in the last episode. Overall, this fine-tuned VGG16 model does really well at generalizing to data it hadn't seen during training, a lot better than the model we built from scratch. Recall that we previously discussed how the fine-tuning approach we took here was pretty minimal, since cat and dog data was already included in the original training set of the original VGG16 model. In upcoming episodes, we'll do more involved fine-tuning than what we saw here, as we fine-tune another well-known pre-trained model, but this time on a completely new data set that wasn't included in the data it was originally trained on. So stay tuned for that. Be sure to check out the blog and other resources available for this episode on deeplizard.com, as well as the deep lizard hive mind, where you can gain access to exclusive perks and rewards. Thanks for contributing to collective intelligence. Now let's move on to the next episode. Hey, I'm Mandy from deeplizard. In this episode, we'll introduce MobileNets, a class of lightweight deep convolutional neural networks that are much smaller in size and faster than many of the mainstream, well-known models. MobileNets are small, low-power, low-latency models that can be used for classification, detection, and the other things CNNs are typically good for, and because of their small size, they're considered great for mobile devices, hence the name MobileNets. To give a quick comparison in regard to size: the full VGG16 network that we've worked with over the past few episodes is about 553 megabytes on disk, so pretty large generally speaking, while one of the currently largest MobileNets is only about 17 megabytes. That's a pretty huge difference, especially when you think about deploying a model to run in a mobile app, for example. This vast size difference is due to the number of parameters, the weights and biases, contained in the model. VGG16, as we saw previously, has about 138 million total parameters, so a lot.
The 17-megabyte MobileNet we just mentioned, currently the largest MobileNet, has only about 4.2 million parameters, which is much, much smaller on a relative scale than VGG16's 138 million. Aside from size on disk, we also need to consider memory: the more parameters a model has, the more space in memory it takes up as well. So while MobileNets are faster and smaller than big, hefty competitors like VGG16, there is a catch, or a trade-off, and that trade-off is accuracy. MobileNets are not as accurate as some of the big players like VGG16, but don't let that discourage you: the trade-off is actually pretty small, with only a relatively minor reduction in accuracy. In the corresponding blog for this episode, I have a link to a paper that goes more in depth on this relatively small accuracy difference, if you'd like to check that out further. Let's now see how we can work with MobileNets in code with Keras. We're in our Jupyter Notebook, and the first thing we need to do is import all of the packages we'll be making use of, which are not only for this video but for the next several videos covering MobileNet. As mentioned earlier in the course, a GPU is not required, but if you're running one, you'll want to run the next cell; it's the same cell we've seen a couple of times already, which makes sure TensorFlow can identify your GPU and sets memory growth to true if one is present. If you don't have a GPU, don't worry, but if you do, run this cell. Similar to how we downloaded the VGG16 model when we were working with it in previous episodes, we take the same approach to download MobileNet here. We call tf.keras.applications.mobilenet.MobileNet(), and the first time we call it, it downloads MobileNet from the internet, so you need an internet connection; subsequent calls just load the saved model from disk into memory. We do that now and assign the result to a variable called mobile. Now, MobileNet was originally trained on the ImageNet library, just like VGG16. In a few minutes, we'll pass it some images I've saved on disk; they're not images from the ImageNet library, just images of some general things, and we'll get an idea of how MobileNet performs on these random images. But first, in order to pass these images to MobileNet, we have to do a little bit of processing. I've created a function called prepare_image that takes a file name, and inside the function we have an image path pointing to the location on disk where I've saved the image files we'll use to get predictions from MobileNet. With that image path defined, we load the image by taking the path and appending the file name we pass in.
So say I pass in the file name 1.PNG: we take the image path, append 1.PNG, pass that to the load_img function, and give it a target_size of (224, 224). This load_img function is from the Keras API, so what we're doing here is taking the image file and resizing it to 224 by 224, because that's the image size MobileNet expects. We then convert the image to an array, and then we expand the dimensions of that array, because that puts the image in the shape MobileNet expects. Finally, we pass this processed image to our last function, tf.keras.applications.mobilenet.preprocess_input. This is similar to what we saw a few episodes back with VGG16, which had its own preprocess_input; MobileNet has its own preprocess_input function that processes images the way MobileNet expects, which is not the same as VGG16. It just scales all the RGB pixel values from the zero-to-255 range to a scale from minus one to one. So overall, this entire function is just resizing the image, putting it into array format with expanded dimensions, applying the MobileNet preprocessing, and returning the processed image. That's a bit of a mouthful, but that's what we have to do to images before passing them to MobileNet. After defining that function, we display the first image, 1.PNG, from the MobileNet-samples directory I set up with a few random images, and we plot it in the Jupyter Notebook. And what do you know, it's a lizard. That's our first image. Next, we pass this image to the prepare_image function we just defined, to preprocess it accordingly, then we pass the preprocessed image returned by the function to our MobileNet model by calling predict, just like we've done in previous videos when using models for inference. After we get the prediction for this image, we give it to the imagenet_utils.decode_predictions function. This is a function from Keras that returns the top five predictions out of the 1,000 possible ImageNet classes, telling us the top five classes MobileNet is predicting for this image.
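Pulling those steps together, a sketch of the download, preprocessing, and prediction code looks like this; the sample-image directory name is an assumption based on what's described in the video.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications import imagenet_utils

mobile = tf.keras.applications.mobilenet.MobileNet()   # downloaded on first call

def prepare_image(file):
    img_path = 'data/MobileNet-samples/'   # assumed location of the sample images
    img = image.load_img(img_path + file, target_size=(224, 224))
    img_array = image.img_to_array(img)
    img_array_expanded = np.expand_dims(img_array, axis=0)
    # MobileNet's preprocess_input scales pixel values from [0, 255] to [-1, 1]
    return tf.keras.applications.mobilenet.preprocess_input(img_array_expanded)

preprocessed_image = prepare_image('1.PNG')
predictions = mobile.predict(preprocessed_image)
results = imagenet_utils.decode_predictions(predictions)   # top-5 ImageNet classes
print(results)
```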
So we run this and print the results, and these are the top five ImageNet classes MobileNet predicts for this image, in order. It assigns a 58% probability to this image being an American chameleon, 28% to green lizard, 13% to an agama, and then some small percentages, under 1%, to two other types of lizards. It turns out, in case you weren't aware, that this is indeed an American chameleon. I've always called these things green anoles, but I looked it up, and they're also known as American chameleons, so MobileNet got it right: it assigned 58%, the highest probability, to the correct class. Next was green lizard, which I'd say is still a really good second place, though I don't know whether green lizard is meant to be a more general category, and then an agama, which is also a similar-looking lizard if you didn't know. Between these top three predicted classes we have almost 100% of the probability, so I'd say MobileNet did a pretty good job on this prediction. Let's move on to the second image. This one is a cup of, well, I originally thought espresso, and then someone called it a cappuccino, so let's just say I'm not sure; I'm not a coffee connoisseur, although I do like both espresso and cappuccinos, and this one looks like it has some cream in it. We go through the same process as with the lizard image: we pass this new image to our prepare_image function so it undergoes all the preprocessing, then we pass the preprocessed image to the predict function of our MobileNet model, and finally we get the top five results from the predictions with regard to the ImageNet classes. According to MobileNet, this is an espresso, not a cappuccino: it predicts a 99% probability of espresso as the most probable class for this image, and I'd say that's pretty reasonable. Let me know in the comments what you think, espresso or cappuccino; I don't know whether ImageNet even had a cappuccino class, and if it didn't, then I'd say this is pretty spot on. The other four predictions are all under 1%, but they're reasonable: the second is cup, the third eggnog, the fourth coffee mug, and the fifth wooden spoon, which gets a little weird, but there is wood and a circular shape going on here. They're all pretty negligible, though, so I'd say MobileNet did a great job giving a 99% probability to espresso for this image. We have one more sample image, so let's bring that in: a strawberry, or multiple strawberries if you count the background. Same thing: we preprocess the strawberry image, get a prediction from MobileNet, and then get the top five most probable predictions among the 1,000 ImageNet classes. We see that MobileNet classifies this image as a strawberry with 99.999% probability, so it's correct, and the remaining predictions are well under 1%, but interestingly they're all fruits. Another really good prediction from MobileNet. So even with the small reduction in accuracy we talked about at the beginning of this episode, you can probably tell from just these three random samples that the reduction may not even be noticeable when you're doing quick tests like the ones we just ran through. In upcoming episodes, we'll actually be fine-tuning this MobileNet model to work on a custom data set, one that was not included in the original ImageNet library, a brand new data set, so we'll do more fine-tuning than we've done in the past. Stay tuned for that.
Be sure to check out the blog and other resources available for this episode on deeplizard.com, as well as the deeplizard hive mind, where you can gain access to exclusive perks and rewards. Thanks for contributing to collective intelligence. Now let's move on to the next episode. Hey, I'm Mandy from deeplizard. In this episode, we'll be preparing and processing a custom data set that we'll use to fine-tune MobileNet using TensorFlow's Keras API. Previously, we saw how well VGG16 was able to classify and generalize on images of cats and dogs, but we noted that VGG16 was already trained on cats and dogs, so it didn't take much fine-tuning at all to get it to perform well on our cat and dog data set. Now, with MobileNet, we'll be going through some fine-tuning steps as well, but this time we'll be working with a data set that is completely different from the original ImageNet library on which MobileNet was originally trained. This new data set is a set of images of sign language digits. There are 10 classes total, ranging from zero to nine, where each class contains images of the sign for that particular digit. This data set is available on Kaggle as grayscale images, but it's also available on GitHub as RGB images, and for our task we'll be using the RGB images. Check out the corresponding blog for this episode on deeplizard.com to get the link for where you can download the data set yourself. After downloading the data set, the next step is to organize the data on disk, and this will be a very similar procedure to what we saw for the cat and dog data set earlier in the course. Once we have the download, this is what it looks like: a zipped folder called Sign-Language-Digits-Dataset-master. The first thing we want to do is extract all the contents of this directory. When we do that, we can navigate inside until we get to the Dataset directory, and inside we have all of the classes, where each of these directories holds the corresponding images for that class. What we want to do is grab zero through nine, cut these directories with Ctrl+X, navigate back to the root directory, and place all of the directories zero through nine in that root. Then we get rid of everything else by deleting it. Now, one last thing: I'm going to drop the "master" suffix from the directory name, since I don't particularly like it, so we're left with Sign-Language-Digits-Dataset, and directly within it we have our nested directories zero through nine, each of which has our training data inside. The last step is to move the Sign-Language-Digits-Dataset directory to wherever you're going to be working. For me, that's relative to my Jupyter Notebook, which lives inside of this deep-learning-with-keras directory; I have a data directory there, and the Sign-Language-Digits-Dataset is located inside it. Everything else that we'll do to organize and process the data will be done programmatically in code. So, we're in our Jupyter Notebook. Make sure you still have the imports we brought in last time in place, because we'll be making use of those. Now, this is just the class breakdown to let you know how many images are in each class: across the classes zero through nine, there are anywhere from 204 to 208 samples in each class.
And here I have an explanation of how your data should be structured up to this point. The rest of the organization we'll do programmatically with this script, which organizes the data into train, valid, and test directories. Recall that right now we just have all the data located in the corresponding class directories zero through nine, but the data is not yet broken up into separate train, validation, and test sets. To do that, we first change directory into our Sign-Language-Digits-Dataset directory, and then we check to make sure that the directory structure we're about to set up is not already in place on disk. If it's not, then we make train, valid, and test directories right within Sign-Language-Digits-Dataset. Next, we iterate over all of the class directories within the data set directory; recall, those are the directories labeled zero through nine. That's what we're doing in this for loop with range(0, 10): going from directory zero to nine and moving each of these directories into our train directory. After that, we make two new directories, one inside valid and one inside test, named for whatever class we're at in the loop. So if we're on run number zero, we make a directory called 0 within valid and a directory called 0 within test, and if we're on run number one, we create a directory called 1 within valid and 1 within test. We do this whole process of moving each class into train and then creating each class directory, empty, within valid and test. At this point, let's suppose we're on the first run of the for loop, so we're at number zero. On this line, we sample 30 images from our train/0 directory, because up above we moved the class directory 0 into train, and now we sample 30 random images from the 0 directory within train. We call these valid_samples, because these are the samples we're going to move into our validation set, and we do that next: for each of the 30 samples we randomly collected from the training set on that line, we move them from the training set into the validation set's class 0 directory. Then we do essentially the same thing for the test samples: we randomly select five samples from the train/0 directory and move those five samples from train/0 into test/0. We just ran through that loop using class zero as an example, but the same thing happens for each class zero through nine. Just in case you have any trouble visualizing what the script does, if we go into Sign-Language-Digits-Dataset, let's check out how the data is now organized. Recall we previously had classes zero through nine listed directly within this root folder. Now we have train, valid, and test directories. Within train, we moved the original zero-through-nine directories, and once that was done, we sampled 30 images from each of those classes and moved them into the valid directory's class folders, and then similarly did the same thing for the test directory. If we look in here, we can see that the test directory has five samples for zero, and five samples for one.
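Here's a minimal sketch of what that organization script looks like, assuming the notebook's working directory contains data/Sign-Language-Digits-Dataset and that os, shutil, and random are available:

```python
import os
import random
import shutil

os.chdir('data/Sign-Language-Digits-Dataset')   # assumed path relative to the notebook
if os.path.isdir('train/0/') is False:          # only reorganize if it hasn't been done already
    os.mkdir('train')
    os.mkdir('valid')
    os.mkdir('test')

    for i in range(0, 10):
        # Move each original class directory 0-9 into train/
        shutil.move(f'{i}', 'train')
        # Create empty class directories within valid/ and test/
        os.mkdir(f'valid/{i}')
        os.mkdir(f'test/{i}')

        # Randomly move 30 images per class from train/ into the validation set
        valid_samples = random.sample(os.listdir(f'train/{i}'), 30)
        for j in valid_samples:
            shutil.move(f'train/{i}/{j}', f'valid/{i}')

        # Randomly move 5 images per class from train/ into the test set
        test_samples = random.sample(os.listdir(f'train/{i}'), 5)
        for k in test_samples:
            shutil.move(f'train/{i}/{k}', f'test/{i}')

os.chdir('../..')  # return to the original working directory
```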
If we go check out the valid directories and look at zero, it should have 30 images, and we see that here: 30. So every valid class directory has 30 samples, while the training class directories don't have perfectly uniform counts, because remember, we saw that the number of samples per class ranged anywhere from 204 to 208 or so. So the number of images within each class directory for the training set will differ slightly, by maybe one or two images, but the number of images in the classes for the validation and test directories will be uniform, since we did that programmatically with our script. Checking this data set out on disk, we can see it's exactly the same format we used to structure our cat and dog image data set earlier in the course; now we're just dealing with 10 classes instead of two, and when we downloaded the data, it was in a slightly different organizational structure than the cat and dog data we previously downloaded. Alright, so we've obtained the data and organized the data. Now the last step is to preprocess the data. We start by defining where our train, valid, and test directories live on disk, so we supply those paths here. Then we set up our directory iterators, which we should be familiar with at this point, given it's the same format we used to process our cat and dog images when we built our CNN from scratch and when we used the fine-tuned VGG16 model. Let's focus on the first variable, train_batches. First, we call ImageDataGenerator.flow_from_directory, which we can't quite see here, but we'll scroll over in a minute. To ImageDataGenerator we pass a preprocessing function, which in this case is the MobileNet preprocessing function. Recall we saw that in the last episode, where we discussed how it preprocesses images by scaling the image data from a zero-to-255 scale to instead be on a scale from minus one to one. Then, on the ImageDataGenerator, we call flow_from_directory, which is flowing off the screen here, and we set the directory equal to train_path, which we defined just above to say where our training data resides on disk. We set the target size to 224 by 224, which, recall, resizes any training data to have a height of 224 and a width of 224, since that's the image size MobileNet expects, and we set our batch size to 10. We've got the same deal for valid_batches and test_batches; everything is exactly the same except for the paths, which differ to show where the validation and test sets live on disk, and, as we're familiar with by now, we specify shuffle=False only for our test set so that we can later appropriately plot our prediction results to a confusion matrix. We run this cell, and we get output saying we have 1,712 images belonging to 10 classes, which corresponds to our training set, 300 images belonging to 10 classes for our validation set, and 50 images belonging to 10 classes for our test set. I have this cell with several assertions here, which just assert that the output we see is what we expect. If you're not getting this, it's perhaps because you're pointing to the wrong location on disk; a lot of times, if you're pointing to the wrong location, you'll get "found 0 images belonging to 10 classes."
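Here's a minimal sketch of those directory iterators, assuming train_path, valid_path, and test_path point at the train, valid, and test folders created above:

```python
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_path = 'data/Sign-Language-Digits-Dataset/train'   # assumed paths on disk
valid_path = 'data/Sign-Language-Digits-Dataset/valid'
test_path  = 'data/Sign-Language-Digits-Dataset/test'

# Each iterator applies MobileNet's preprocessing (scales pixels to [-1, 1])
# and yields batches of 224x224 images from the corresponding directory.
train_batches = ImageDataGenerator(
    preprocessing_function=tf.keras.applications.mobilenet.preprocess_input).flow_from_directory(
    directory=train_path, target_size=(224, 224), batch_size=10)
valid_batches = ImageDataGenerator(
    preprocessing_function=tf.keras.applications.mobilenet.preprocess_input).flow_from_directory(
    directory=valid_path, target_size=(224, 224), batch_size=10)
test_batches = ImageDataGenerator(
    preprocessing_function=tf.keras.applications.mobilenet.preprocess_input).flow_from_directory(
    directory=test_path, target_size=(224, 224), batch_size=10, shuffle=False)  # keep order for the confusion matrix
```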
So you just need to check the path to where your data set resides if you get that. Alright, now that our data set has been processed, we're ready to move on to building and fine-tuning our MobileNet model for this data set. Be sure to check out the blog and other resources available for this episode on deeplizard.com, as well as the deeplizard hive mind, where you can gain access to exclusive perks and rewards. Thanks for contributing to collective intelligence. Now let's move on to the next episode. Hey, I'm Mandy from deeplizard. In this episode, we'll go through the process of fine-tuning MobileNet for a custom data set. Alright, we're jumping right back into our Jupyter Notebook from last time, so make sure your code from then is in place, since we'll be building directly on it. The first thing we do is import MobileNet, just as we did in the first MobileNet episode, by calling tf.keras.applications.mobilenet.MobileNet(). Remember, if this is your first time running this line, you'll need an internet connection to download the model. Now let's take a look at the model we downloaded. Calling model.summary() gives us this output showing all of the lovely layers included in MobileNet. This is just to get a general idea of the model, because we'll be fine-tuning it. The fine-tuning process starts with us grabbing all of the layers up to the sixth-to-last layer. If we scroll up, look at our output, and count up from the bottom, one, two, three, four, five, six, we're going to keep everything from this layer up, and everything below it is not going to be included. All of these layers are what we'll keep and transfer into our new fine-tuned model, and we won't include the last five layers. This is just a choice I came to after a little experimenting and testing; the number of layers you choose to include versus not include when fine-tuning a model comes down to experimentation and personal choice. So, for us, we're keeping everything from this layer and above in our new fine-tuned model. Let's scroll down. We do that by grabbing the output of mobile.layers[-6], the sixth-to-last layer, and storing it in x. Then we create a variable called output, and we set it equal to a Dense layer with 10 units. This is going to be our output layer, which is why it's called output, and it has 10 units because our classes range from zero through nine. As usual, it's followed by a softmax activation function to give us a probability distribution across those 10 outputs. Now, this looks a little strange: we're calling the Dense layer and then putting this x variable next to it, so what's that about? Well, the MobileNet model is actually a functional model, from the Keras Functional API, not the Sequential API. We touched on this a little earlier when we fine-tuned VGG16; we saw that VGG16 was also indeed a functional model, but when we fine-tuned it, we iterated over each of the layers and added them to a Sequential model, because we weren't ready to introduce functional models at that point. Here, we're going to continue working with the functional model type, and that's why we're taking all of the layers up to the sixth-to-last in this way.
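A minimal sketch of that step, under the assumption that the earlier imports are still in place:

```python
import tensorflow as tf
from tensorflow.keras.layers import Dense

# Download (or load from cache) the original MobileNet model and inspect it.
mobile = tf.keras.applications.mobilenet.MobileNet()
mobile.summary()

# Keep everything up to the sixth-to-last layer...
x = mobile.layers[-6].output
# ...and attach a new 10-unit softmax output layer for the sign language digit classes.
output = Dense(units=10, activation='softmax')(x)
```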
When we create this output layer and then call it on the previous layers stored in x, that's how the Functional API works: we're essentially telling this output layer to take all of the previous layers we have stored in x, up to the sixth-to-last layer of MobileNet. Then we can create the model from these two pieces, x and output, by calling Model, which is indeed a functional model when specified this way, with inputs=mobile.input, which takes the input from the original MobileNet model, and outputs=output. At this point, output is all of the MobileNet model up to the sixth-to-last layer plus the new dense output layer. Alright, let's run these two cells to create our new model. Our new model has now been created, so the next thing we do is go through and freeze some layers. Through some experimentation of my own, I've found that freezing all except the last 23 layers appears to yield decent results. 23 is not a magic number here; play with this yourself and let me know if you get better results. Basically, we're going through all the layers in the model, which are all trainable by default, and saying that we want only the last 23 layers to be trainable: all the layers except the last 23 are made not trainable. Just so you understand, relatively speaking, there are 88 total layers in the original MobileNet model, and we're saying that we only want to train the last 23 layers of the new model we built just above. Recall that this is much more than we trained earlier with our fine-tuned VGG16 model, where we only trained the output layer. Let's go ahead and run that now. Now let's look at a summary of our new fine-tuned model. At a glance, it looks basically the same as the original summary, but we can see that our model now ends with this global average pooling 2D layer, which, recall, was previously the sixth-to-last layer, the one I said we'd include along with everything above it. So all the layers below the global average pooling layer that we previously saw in the original MobileNet summary are now gone, and instead of an output layer with 1,000 classes, we now have an output layer with 10 classes, corresponding to the 10 potential output classes in our new sign language digits data set. If we compare the total parameters, and how they're split between trainable and non-trainable parameters, in this model versus the original MobileNet model, we'll see a difference there as well. Alright, now that this model has been built, we're ready to train it. The code here is nothing new: we compile the model in exactly the same fashion, using the Adam optimizer with a 0.0001 learning rate, categorical cross-entropy loss, and accuracy as our metric. We've probably seen this two million times up to this point in the course, so that's exactly the same. Additionally, we have exactly the same fit function to train the model: we pass in train_batches as our data set, valid_batches as our validation data, and we run this for 10 epochs. Actually, we're going to go ahead and run it for 30.
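Here's a minimal sketch of those steps, continuing from the x and output defined above; the 23-layer cutoff and the hyperparameters are just the values discussed here:

```python
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# Build the new functional model from MobileNet's input up to the new output layer.
model = Model(inputs=mobile.input, outputs=output)

# Freeze everything except the last 23 layers.
for layer in model.layers[:-23]:
    layer.trainable = False

model.summary()

# Compile and train exactly as in earlier episodes.
model.compile(optimizer=Adam(learning_rate=0.0001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x=train_batches,
          validation_data=valid_batches,
          epochs=30,
          verbose=2)
```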
I had 10 here just to save time earlier while testing, but we're going to run this for 30 epochs, and we're going to set verbose=2 to get the most verbose output. Now, let's see what happens. Alright, our model just finished training over 30 epochs, so let's check out the results. If you see this output and you're wondering why the first epoch took 90 seconds and then we got it down to five seconds just a few epochs later, it's because I realized I was running on battery rather than with my laptop plugged in; once we plugged the laptop in, it sped up and started running much quicker. Let's scroll down and look at how we ended up: we're at 100% accuracy on our training set and 92% accuracy on our validation set. That's pretty freakin' great, considering this is a completely new data set, with no images that were included in the original ImageNet library. So these are pretty good results. There is a little bit of overfitting here, since our validation accuracy is lower than our training accuracy, and if we wanted to fix that, we could take some steps to combat the overfitting. If we look at the earlier epochs to see what kind of story is being told here: on our first epoch, our training accuracy actually starts out at 74% among 10 classes, which is not bad for a starting point, and we quickly get to 100% training accuracy within just four epochs, which is great. But you can see that at that point we're only at 81% accuracy on our validation set, so we have a decent amount of overfitting going on early on, and then, as we progress through the training process, the overfitting becomes less and less of a problem. You can see, if we just look at the last eight epochs that ran here, that we haven't even stalled out yet: the validation loss hasn't stalled out in terms of decreasing, and the validation accuracy hasn't stalled out in terms of increasing, so perhaps just running more epochs on this data would take care of the remaining overfitting. Otherwise, you can do some tuning yourself: change some hyperparameters around, use a different structure of fine-tuning on the model, freeze more or fewer than the last 23 layers during the fine-tuning process, or just experiment. If you come up with something that yields better results than this, put it in the comments and let us know. We have one last thing we want to do with our fine-tuned MobileNet model, and that's use it on our test set. You know the drill with this procedure at this point; we've done it several times. We're going to get predictions from the model on our test set, and then plot those predictions to a confusion matrix. First, we get our true labels by calling test_batches.classes. Then we get predictions from the model by calling model.predict and passing in our test set, stored in test_batches, with verbose=0, because we don't want to see any output from the predictions. Then we create our confusion matrix using scikit-learn's confusion_matrix, which we imported earlier, setting our true labels equal to the test labels we defined just above, and setting our predicted labels to the argmax of our predictions along the last axis.
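A minimal sketch of those steps, assuming confusion_matrix was imported from scikit-learn earlier in the notebook:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# True labels come straight from the (unshuffled) test iterator.
test_labels = test_batches.classes

# Predicted class probabilities for the test set.
predictions = model.predict(x=test_batches, verbose=0)

# Compare true classes against the highest-probability predicted class.
cm = confusion_matrix(y_true=test_labels, y_pred=np.argmax(predictions, axis=-1))
```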
Now we check the class_indices of the test batches, just to make sure they are what we think they are, and they are, of course, classes labeled zero through nine, so we define the labels for our confusion matrix accordingly. Then we call plot_confusion_matrix, which we brought in earlier in the notebook and have used 17,000 times up to this point in the course, passing in our confusion matrix as what to plot, passing in the labels we want on the confusion matrix, and giving it the very general title of "Confusion Matrix," because, hey, that's what it is. Let's plot this. Oh no, plot_confusion_matrix is not defined? Well, it definitely is defined somewhere in this notebook; I must have skipped over the cell. Here we go, here's where plot_confusion_matrix is defined, so let's bring that in. Now it's defined, and we run back here. Looking from the top left to the bottom right, we can see the model appears to have done pretty well. We have 10 classes total, with five samples per class, and we have all fours and fives across this diagonal, meaning that most of the time the model predicted correctly. For example, for a nine, five times out of five the model predicted the image was a nine when it actually was a nine. For an eight, however, only four out of five times did the model predict correctly; it looks like one of the times the model predicted a one when it should have been an eight. In total, we've got one, two, three, four, five incorrect predictions out of 50, which gives us a 90% accuracy rate on our test set, not surprising given the accuracy we saw above on our validation set. So hopefully this series on MobileNet has given you further insight into how we can fine-tune models for a custom data set and use transfer learning to apply the knowledge a model gained from its original training set to a completely new task. Be sure to check out the blog and other resources available for this episode on deeplizard.com, as well as the deeplizard hive mind, where you can gain access to exclusive perks and rewards. Thanks for contributing to collective intelligence. Now let's move on to the next episode. Hey, I'm Mandy from deeplizard. In this episode, we're going to learn how we can use data augmentation on images using TensorFlow's Keras API. Data augmentation occurs when we create new data by making modifications to existing data. We're going to explain the idea a little further before jumping into the code, but if you want a more thorough explanation, be sure to check out the corresponding episode in the Deep Learning Fundamentals course on deeplizard.com. For the example we'll demonstrate in just a moment, the data we'll be working with is image data, and for image data specifically, data augmentation includes things like flipping the image horizontally or vertically, rotating the image, changing its color, and so on. One of the major reasons we want to use data augmentation is simply to get access to more data. A lot of times, not having access to enough data is an issue we run into, and we can run into problems like overfitting if our training data set is too small.
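A short sketch of that plotting step, assuming the plot_confusion_matrix helper copied into the notebook earlier in the course is defined:

```python
# Confirm which index the iterator assigned to each class,
# e.g. {'0': 0, '1': 1, ..., '9': 9}.
print(test_batches.class_indices)

cm_plot_labels = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

# plot_confusion_matrix is the helper function defined earlier in the notebook.
plot_confusion_matrix(cm=cm, classes=cm_plot_labels, title='Confusion Matrix')
```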
So a major reason to use data augmentation is just to grow our training set, and adding augmented data to our training set can in turn reduce overfitting as well. Alright, now let's jump into the code to see how we can augment image data using Keras. The first thing we need to do, of course, is import all of the packages we'll be making use of for this data augmentation. Next, we have this plotImages function, which we introduced earlier in the course; it comes directly from TensorFlow's website, and it just lets us plot images in our Jupyter Notebook. Check out the corresponding blog for this episode on deeplizard.com to get the link so you can copy this function yourself. Next we have this variable called gen, which is an ImageDataGenerator. Recall, we've worked with image data generators earlier in the course, whenever we created the train, valid, and test batches that we used for training, but here we're using ImageDataGenerator in a different way. We're creating this generator and specifying all of these parameters: rotation range, width shift range, height shift range, shear range, zoom range, channel shift range, and horizontal flip. These are all options that let us augment our image data. You'll want to check the documentation on TensorFlow's website to get an idea of the units for these parameters, because they're not all the same. For example, rotation_range here is measured in degrees, whereas width_shift_range is measured as a fraction of the width of the image. So these are all ways we can augment image data: this one rotates the image by up to 10 degrees, this one shifts the width of the image by up to 10%, this one shifts the height by up to 10%, and then we have zooming in, shifting the color channels, flipping the image, all sorts of things. There are other options too, so be sure to check out the ImageDataGenerator documentation if you want to see them, but for now these are the ones we'll work with, and we store the generator in our gen variable. Next, we choose a random image from the dog directory that we set up earlier in the course for the dogs-versus-cats data set: we go into train, into dog, choose a random image from that directory, and set the image path accordingly, so that it points to wherever the chosen image lives on disk. Then we have an assertion here just to make sure it is indeed a valid file before we proceed with the remaining code. Now we plot this image to the screen, and I'm not sure what it's going to be, since it's a random image from disk. So that is a cute-looking, I don't know, beagle, or beagle–basset hound mix, I don't know, what do you guys think? This is the random dog that was selected from our dog train directory. Now we create a new variable called aug_iter, and on the image data generator we created earlier, called gen, we call this flow function and pass our image in to flow. This is going to generate a batch of augmented images from this single image. Next, we define this aug_images variable, which gives us 10 samples of the augmented images created by aug_iter.
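Here's a minimal sketch of that setup, assuming the dogs-vs-cats data lives under a hypothetical data/dogs-vs-cats/train/dog/ directory and that plotImages has already been copied in from TensorFlow's site; the specific parameter values are just the ones discussed above:

```python
import os
import random
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Generator that applies small random transformations to each image it yields.
gen = ImageDataGenerator(rotation_range=10, width_shift_range=0.1, height_shift_range=0.1,
                         shear_range=0.15, zoom_range=0.1, channel_shift_range=10.,
                         horizontal_flip=True)

# Pick one random dog image from the data set organized earlier in the course.
chosen_image = random.choice(os.listdir('data/dogs-vs-cats/train/dog'))  # assumed path
image_path = 'data/dogs-vs-cats/train/dog/' + chosen_image
assert os.path.isfile(image_path)

# Load the image, add a batch dimension, and show the original.
image = np.expand_dims(plt.imread(image_path), 0)
plt.imshow(image[0])

# flow() yields batches of randomly augmented versions of this single image.
aug_iter = gen.flow(image)
aug_images = [next(aug_iter)[0].astype(np.uint8) for i in range(10)]
```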
Lastly, we plot these images using the plotImages function we defined just above. Alright, let's zoom in a bit. First, let's take a look at our original dog; that's the original image. Now we can see that, given the transformations like rotation and width shift and everything else we specified when we defined our image data generator, each of these images has been augmented in one random way or another, and we can see roughly what's happening. For example, this particular image looks like it's been shifted up some, because we can see that the dog's head is being cut off a little. And this image, well, which way was the dog originally facing? Its head was facing to the right, so this image has been flipped; the dog is now facing to the left. This image appears to be shifted down some, and some of these, like this one, look like they've been rotated. So we can get an idea, just by looking at the images, of the types of data augmentation that have been applied to them, and we can see how this could be very helpful for growing our data set in general. For example, say we have a bunch of images of dogs, but for whatever reason they're all facing to the left, and we want to deploy our model as some general model that classifies different dogs; the images this model receives later might have dogs facing to the right as well. Maybe our model totally implodes on itself whenever it receives a dog facing to the right. Through data augmentation, given the fact that it's perfectly normal for dogs to face left or right, we could augment our dog images so that the dogs also face to the right as well as the left, giving us a more dynamic data set. Now, there's a note in the corresponding blog for this episode on deeplizard.com with brief instructions for how to save these images to disk if you want to keep the augmented images and add them back to your training set, so check that out if you're interested in actually growing your training set rather than just plotting the images here in your Jupyter Notebook. By the way, we're currently in Vietnam filming this episode. If you didn't know, we also have a vlog channel where we document our travels and share a little bit more about ourselves, so check that out at deeplizard vlog on YouTube. Also, be sure to check out the corresponding blog for this episode, along with other resources available on deeplizard.com, and check out the deeplizard hive mind, where you can gain exclusive access to perks and rewards. Thanks for contributing to collective intelligence. I'll see you next time.
Info
Channel: freeCodeCamp.org
Views: 496,123
Keywords: karas, tensorflow, deep learning, machine learning
Id: qFJeN9V1ZsI
Length: 167min 54sec (10074 seconds)
Published: Thu Jun 18 2020