What is Transfer Learning? | With code in Keras

Video Statistics and Information

Captions
You have probably heard the term transfer learning a bunch of times already, but what does it mean, why do people use it, and how can you get started? Let's answer all of those questions in this video.

Transfer learning is taking all or part of a model that has already been developed for a certain task and using what it learned on a whole new task. The base model that is already trained is called a pre-trained model, and the practice of adjusting it to a new task is called fine-tuning. One example: you make a really big model to detect objects in images, like a microphone, a table, a computer, a backpack, etc., and then you fine-tune it to classify images of flowers into their respective species. This is exactly what we will build in the second part of this video with Keras.

Okay, but how does this all work? There are a couple of ways to achieve transfer learning. Number one: you can get a pre-trained model, get rid of only its output layer, and add your own output layer. Another option is to get rid of more layers of this model and add in more layers, and also an output layer of course. How many layers you get rid of and how many you add depends on how similar the two tasks are to each other. After removing layers and adding new ones, what you need to do is fine-tune this model, so basically train it with the specialized dataset that you have.

Let's see this with an example. Let's say we have a very simple CNN that we trained to recognize faces in images, and now we want to take this model and make a more specialized model that can recognize dog faces in images. As you might remember, the deeper you go in a deep neural network, the higher-level the features it is able to recognize. So the earlier layers in your model can recognize lines (straight lines, horizontal lines), a little bit further you can recognize a circle or different objects, and maybe
even further you can recognize an eye feature, a nose feature, an ear feature, etc. Towards the end of your model, the last layers will probably be able to recognize either part of a human face or all of a human face, and those are the layers that we don't actually want, because we want to recognize dog faces. So in this case we want to keep the layers up until that point, so we don't have to reinvent the wheel; we already have that information parsed for us. We also want to add a couple of new layers that will learn how to recognize dog faces given all the information that is passed to them. After we set up the architecture, we train our model with the dog images that we have, and hopefully we get a really good model that can recognize dog faces.

Okay, but why use transfer learning at all? Why don't we just make a model from scratch and train it with the dataset of dog images that we have? If you have enough images of dogs, of course, that would be the way to go. But the problem transfer learning mainly solves is a lack of data. Let's say we have 1 million labeled photos featuring human faces and only 5,000 featuring dog faces. If you only use the dog photos, you will not get very far with complex architectures, because they need a lot more data to train. But by using the similar task of recognizing human faces, you are giving your model a head start, effectively exploiting the knowledge gained from a similar task.

On top of the lack-of-data problem, transfer learning helps us save time. Once a generalized model is built, a lot of people can fine-tune it for their specialized tasks. And through this time saving and sharing of generalized models, transfer learning contributes to lowering the carbon footprint of training models.

Okay, but how can you get started with transfer learning? The first step is to find a pre-trained model. Luckily for us, there are many places on the internet where you can get pre-trained models. Some of these are even built into some
of the deep learning frameworks, like TensorFlow Hub, Keras Applications, and PyTorch pre-trained models. On top of this, there are also some websites and APIs you can use to get pre-trained models. One very popular example is Hugging Face, which mostly hosts transformer-based models. You can also download models from Model Zoo, which most of the time links to the model's GitHub page. One thing to note here: if you're downloading a model from GitHub, make sure you check its license so that you're not doing anything with the model that isn't allowed.

All right, let's build the example that we talked about. We're going to do transfer learning with the model called Xception. This model is pre-trained and is provided to us through the Keras Applications library. The weights Keras provides were trained on ImageNet; the original Xception paper also reports results on a much larger dataset of 350 million images covering 17,000 classes. Our goal will be to change this model to classify photos of flowers into their species. The dataset we're going to use for that is called tf_flowers. There is a really nice and comprehensive list of datasets in the TensorFlow Datasets library; if you want to learn more about your dataset, you can always go there and check out what the photos look like. So here are some examples of what the daisy class photos look like. You can learn more about the images if you click on them, for example their dimensions, their mode, etc. You can also learn more about the image quality, metadata, etc.

So at first, what we want to do is of course import the TensorFlow Datasets library and the TensorFlow library, and then I'm going to import the dataset. When you set with_info to True, it also gives you some helpful information about this dataset. I can show you what that looks like: it basically includes the number of examples in the dataset, the people who created it, and also, of course, the labels that are included in it. We also want to note these separately, so that's
what we're going to do in this next cell. One thing you might notice is that this dataset only provides one split, so one group of photos. It is not separated into training, testing, and validation sets, so we have to do that ourselves here.

Next, we have to make sure that our photos are ready to be fed to the Xception model. In this setup the model takes 224 by 224 pixel images, so we first have to resize them. Then there is a very nice and helpful preprocess_input function that is again provided by Keras. Once we set up this function, all we have to do is make sure that all of our data has been run through it, and that's exactly what we do here: we set the batch size, and then we run all of the different splits of this dataset through the preprocessing function.

For the actual transfer learning part of this tutorial, the first thing we want is the Xception model. By getting the Xception model, what we're getting is all of its layers and all of its already trained parameters. But one thing that I don't want is the top layer. This top includes the average pooling layer and the output layer that classifies images into the classes the model was trained on. Instead of the top layer of this model, I'm going to create my own average pooling layer and my own output layer. After I define my layers, I can combine them all together to create the model that I want to fine-tune.

Before we start the fine-tuning process, we want to freeze the base model's layers. Freezing basically means that during fine-tuning these parameters will not be touched; they will not be updated. The only parameters that need to be updated are the ones that we just created. Here I'm setting the trainable attribute of all of the layers inside the base model to False, so effectively they're frozen.

The next thing to do is to compile this model and start fine-tuning it. Just to go through the parameters a little bit: we are using gradient descent. The learning
rate is quite high because we want the last two layers that we just created to learn quickly, and I'm only training this model for five epochs because I just want the newly created top layers to catch up with the rest of the model.

Before you start running this, one thing you can do in Colab that will save you a lot of time is to change your runtime to GPU. To do that, click Runtime, then Change runtime type, and select GPU. It's that easy. After you click Save, you can train your model on the GPU.

All right, fine-tuning is done for this model. You might notice that after a couple of epochs I have not actually made that much progress in terms of validation loss and accuracy. You can of course try training a little bit longer, but it might also mean that your top layers now perform well compared to the rest of the model, and you might need to train the model as a whole a little bit. For that, all you need to do is unfreeze the base model's layers and train your model again. One thing you need to do before you continue is, of course, compile your model again, and ideally you would set a much lower learning rate so you don't mess with the parameters of the base model too much. So let's train this and see if we get a performance increase.

All right, fine-tuning is done again, and we have definitely seen some improvement in the validation accuracy. If you train it longer, you could probably get even better accuracy. But it is that easy to fine-tune a pre-trained model. All you have to do is select your dataset, prepare it so that it fits the pre-trained model you selected, import your data, import your pre-trained model without its top layers, create your own top layers, as many as you need, and then compile your model and train it. That's all you have to do.

If you have any
questions about how this works or about transfer learning in general, or if you want us to make a similar tutorial for other deep learning frameworks, leave a comment and let us know. But for now, thanks for watching, and I will see you in the next video.
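
The workflow walked through in the captions can be sketched in Keras roughly as follows. This is a minimal sketch, not the notebook shown in the video: the function names, the split percentages, the batch size, the loss, the momentum, and both learning rates are assumptions chosen for illustration. The only details taken from the captions are the Xception base without its top, the 224 by 224 resize, preprocess_input, the frozen first phase of five epochs, and the recompile with a lower learning rate before training the unfrozen model.

```python
import tensorflow as tf
from tensorflow import keras

IMG_SIZE = 224     # image size used in the captions
NUM_CLASSES = 5    # tf_flowers contains five flower classes


def preprocess(image, label):
    """Resize one image and apply Xception's own input scaling."""
    image = tf.image.resize(image, [IMG_SIZE, IMG_SIZE])
    image = keras.applications.xception.preprocess_input(image)
    return image, label


def load_flowers(batch_size=32):
    """Load tf_flowers and split its single 'train' split three ways.

    The 10/15/75 percentages and batch size are assumptions.
    """
    import tensorflow_datasets as tfds  # lazy import: triggers a download
    (test_raw, valid_raw, train_raw), info = tfds.load(
        "tf_flowers",
        split=["train[:10%]", "train[10%:25%]", "train[25%:]"],
        as_supervised=True, with_info=True)
    train_set = train_raw.shuffle(1000).map(preprocess).batch(batch_size)
    valid_set = valid_raw.map(preprocess).batch(batch_size)
    test_set = test_raw.map(preprocess).batch(batch_size)
    return train_set, valid_set, test_set, info


def build_model(weights="imagenet"):
    """Xception without its top, plus a new pooling and output layer."""
    base = keras.applications.Xception(
        weights=weights, include_top=False,
        input_shape=(IMG_SIZE, IMG_SIZE, 3))
    pooled = keras.layers.GlobalAveragePooling2D()(base.output)
    outputs = keras.layers.Dense(NUM_CLASSES, activation="softmax")(pooled)
    return base, keras.Model(inputs=base.input, outputs=outputs)


def set_base_trainable(base, trainable):
    """Freeze (False) or unfreeze (True) every layer of the base model."""
    for layer in base.layers:
        layer.trainable = trainable


def compile_model(model, learning_rate):
    """(Re)compile; required after changing which layers are trainable."""
    model.compile(
        loss="sparse_categorical_crossentropy",
        optimizer=keras.optimizers.SGD(learning_rate=learning_rate,
                                       momentum=0.9),
        metrics=["accuracy"])


def fine_tune(train_set, valid_set, head_epochs=5, full_epochs=5):
    """Two-phase fine-tuning: new top layers first, then the whole model."""
    base, model = build_model()

    # Phase 1: base frozen, relatively high learning rate (assumed value).
    set_base_trainable(base, False)
    compile_model(model, learning_rate=0.2)
    model.fit(train_set, validation_data=valid_set, epochs=head_epochs)

    # Phase 2: base unfrozen, much lower learning rate (assumed value).
    set_base_trainable(base, True)
    compile_model(model, learning_rate=0.01)
    model.fit(train_set, validation_data=valid_set, epochs=full_epochs)
    return model
```

A run would look like `train_set, valid_set, test_set, info = load_flowers()` followed by `model = fine_tune(train_set, valid_set)`. Splitting the freeze step into its own helper keeps the two phases symmetric and makes it harder to forget the recompile between them.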
Info
Channel: AssemblyAI
Views: 23,950
Id: DyPW-994t7w
Length: 10min 51sec (651 seconds)
Published: Thu Apr 07 2022