Fine-tuning a Neural Network explained

Captions
In this video, we'll discuss what fine-tuning is and how we can take advantage of it when building and training our own artificial neural networks. Fine-tuning is very closely linked with the term transfer learning. Transfer learning occurs when we use knowledge that was gained from solving one problem and apply it to a new but related problem. For example, knowledge gained from learning to recognize cars could potentially be applied to the problem of recognizing trucks.

Fine-tuning is a way of applying or utilizing transfer learning. Specifically, fine-tuning is a process that takes a model that's already been trained for a given task and then tunes or tweaks that model to make it perform a second, similar task. Assuming the original task is similar to the new task, using an artificial neural network that's already been designed and trained allows us to take advantage of what the model has already learned without having to develop it from scratch.

When building a model from scratch, we have to try many approaches in a type of trial and error, since we have to choose how many layers we're using, what types of layers we're using, what order to put the layers in, and how many nodes to include in each layer. We also have to decide how much regularization to use, what learning rate to use, etc. With this, it's easy to see that building and validating our own model can be a huge task in its own right, depending on what data we're training it on. That's what makes the fine-tuning approach so attractive: if we can find a trained model that's already done one task well, and that task is similar to ours in at least some remote way, then we can take advantage of everything that model has already learned and apply it to our specific task.

Of course, since the two tasks are different, there will be some information that the model has learned that may not apply to our new task, or there may be new information that the model needs to learn from the data regarding the new task that wasn't learned from the previous task. For example, a model trained on cars is never going to have seen a truck bed, so this feature is something new the model would have to learn about if we're going to train it to recognize trucks. But think about everything else that our new model for recognizing trucks could use from the original model that was trained only on cars. This already-trained model has learned to understand edges, shapes, and textures, and, more concretely, headlights, door handles, windshields, tires, etc. All of these learned features are definitely things we could benefit from in our new model for classifying trucks.

So this sounds fantastic, but how do we actually implement it technically? Going back to the example we just mentioned: if we have a model that's already been trained to recognize cars, and we want to fine-tune this model to recognize trucks, we can first import our original model that was used on the cars problem. Then, for simplicity's sake, let's say we remove just the last layer of this model. This last layer was previously the layer for classifying whether an image was a car or not. After removing it, we want to add a new layer back whose purpose is to classify whether an image is a truck or not.

In some problems, we may want to remove more than just the last single layer, and we may want to add more than just one layer. This will depend on how similar the task is for each of the models, since layers at the end of our network may have learned features that are very specific to the original task at hand, whereas layers at the start of the network may have learned more general features like edges, shapes, and textures.

After we've modified the structure of the existing network, we then want to freeze the layers in our new model that came from the original model. By freezing, I mean that we don't want the weights for these layers to update whenever we train the model on our new data for our new task. We want to keep all of these weights the same as they were after being trained on the original task; we only want the weights in our new or modified layers to update.

After we do this, all that's left is to train the model on our new data. Again, during this training process, the weights of all the layers we kept from our original model are going to stay the same, and only the weights in our new layers will be updating.

Now, if you want to see all of this in action, I have separate videos in my Keras playlist that will show you how to build a fine-tuned model, how to then train that model, and then how to use the model to predict on new data. Definitely check those out if you want to see the technical implementation of fine-tuning written in code. Hopefully you now have an idea about what fine-tuning is and why you may want to use it in your own models. I hope you found this video helpful. If you did, please like this video, subscribe, and comment, and thanks for watching!
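The steps described above can be sketched in Keras. This is a minimal illustration, not the exact code from the video's playlist: the stand-in "cars" model, its layer sizes, the input shape, and names like `truck_output` are all assumptions made for the example.

```python
from tensorflow import keras

# Stand-in for the pre-trained "cars" model. In practice you would load
# a saved model instead, e.g. keras.models.load_model("cars_model.h5")
# (a hypothetical filename used here only for illustration).
original_model = keras.Sequential([
    keras.layers.Input(shape=(64, 64, 3)),
    keras.layers.Conv2D(8, 3, activation="relu"),
    keras.layers.Flatten(),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid", name="car_output"),
])

# Step 1: copy every layer except the last one (the car/not-car classifier).
fine_tuned = keras.Sequential(original_model.layers[:-1])

# Step 2: freeze the copied layers so their weights don't update
# when we train on the new (truck) data.
for layer in fine_tuned.layers:
    layer.trainable = False

# Step 3: add a fresh output layer for the new task (truck/not-truck).
fine_tuned.add(keras.layers.Dense(1, activation="sigmoid",
                                  name="truck_output"))

# Step 4: compile and train on the new data as usual. Only the new
# layer's weights will change during training.
fine_tuned.compile(optimizer="adam", loss="binary_crossentropy",
                   metrics=["accuracy"])
# fine_tuned.fit(truck_images, truck_labels, epochs=5)
```

Note that removing more than one layer, as discussed above, is just a matter of slicing further back (e.g. `original_model.layers[:-2]`) and adding more new layers before compiling.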
Info
Channel: deeplizard
Views: 82,289
Keywords: Keras, deep learning, machine learning, artificial neural network, neural network, neural net, transfer learning, AI, artificial intelligence, Theano, Tensorflow, CNTK, tutorial, cuDNN, GPU, Python, supervised learning, unsupervised learning, Sequential model, image classification, convolutional neural network, CNN, categorical crossentropy, relu, activation function, predictions, stochastic gradient descent, educational, education, fine-tune, data augmentation
Id: 5T-iXNNiwIs
Length: 4min 48sec (288 seconds)
Published: Wed Nov 22 2017