[DL] Deep transfer learning - using VGG16 convolutional base

Video Statistics and Information

Captions
To demonstrate transfer learning, in this video I will show how to use the VGG16 convolutional base to build a model for the CIFAR-10 dataset. Let me briefly talk about CIFAR-10. CIFAR-10 is a collection of images commonly used to train machine learning models and computer vision algorithms; it is one of the most widely used datasets for machine learning. The dataset contains 60,000 32x32 color images in 10 different classes. You can think of it as a different kind of MNIST: while MNIST is only digit classification, CIFAR-10 is image classification, and while datasets like ImageNet are really big, CIFAR-10 is quite small. The 10 classes represent real-world things, airplanes, cars, and so on, and there are 6,000 images in each class.

Downloading this dataset into memory is easy: all I have to do is from tensorflow.keras.datasets import cifar10 and then call cifar10.load_data(), which gives me the dataset as four NumPy arrays: x_train, y_train, x_test, and y_test. If we inspect the shapes, x_train is 50,000 color images of 32x32 each, and y_train is 50,000 labels. Let's randomly plot one of the input images to see how it looks: training image 4 is a picture of a car. Now say instead I visualize the sixth input image: this is also a picture of a car, and so on.

Next, let's convert the labels with to_categorical so that they are ready for training; we have to convert them into one-hot encoded output labels. Our 50,000 training labels become 50,000 x 10, and our 10,000 test labels become 10,000 x 10.

So now we have our data ready for training. The first method is to develop our own convolutional neural network: we simply take our input images of shape 32x32x3, and then I can add a first
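The loading and label-encoding steps described above can be sketched as follows. The real data would come from cifar10.load_data() (which downloads the dataset on first use), so synthetic stand-in arrays with the same shapes are used here just to illustrate the one-hot encoding:

```python
import numpy as np
from tensorflow.keras.utils import to_categorical

# In the video the data comes straight from tf.keras:
#   from tensorflow.keras.datasets import cifar10
#   (x_train, y_train), (x_test, y_test) = cifar10.load_data()
# Synthetic stand-ins with the same shapes, to avoid the download:
x_train = np.zeros((50000, 32, 32, 3), dtype=np.uint8)
y_train = np.random.randint(0, 10, size=(50000, 1))
x_test = np.zeros((10000, 32, 32, 3), dtype=np.uint8)
y_test = np.random.randint(0, 10, size=(10000, 1))

# One-hot encode the integer class labels so they are ready for training
y_train_cat = to_categorical(y_train, num_classes=10)
y_test_cat = to_categorical(y_test, num_classes=10)

print(y_train_cat.shape)  # (50000, 10)
print(y_test_cat.shape)   # (10000, 10)
```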
convolutional layer with 16 filters, then a second convolutional layer with 4 filters, then flatten it and add dense layers. This is just a tiny model, but we can build any model we want in this way and then train it. So let's build this model: our model is ready, with only about 32,000 parameters. Now say I would like to train it. I set the number of epochs to only 4 and the batch size to something fairly large so that I get results more quickly, and we notice that the validation accuracy is 39%, 43%, 43%, and so on. I can also plot the history, that is, the learning curves, and see how the training went. That was the first method, which we all know already.

Now let's look at the second method: how to use the VGG16 convolutional base. To do this we first have to download the VGG16 convolutional base: from tensorflow.keras.applications import VGG16, and then calling VGG16 gives me the VGG16 convolutional base. The weights for this model were trained on ImageNet; Keras already comes packaged with them. I also set include_top=False, which means I am only requesting the convolutional base part of the pretrained model, and I am asking for the input shape to be set to 32x32x3 so that I can use the CIFAR-10 dataset, because by default VGG16 is not configured for 32x32x3 input. The next thing I would like to do is print the conv base's summary to see how it looks, and here it is: it accepts an input of 32x32x3.
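The tiny baseline CNN described above might look like the sketch below. The video only states 16 and 4 filters and roughly 32,000 parameters; the 3x3 kernel size, ReLU activations, and optimizer are assumptions that happen to land near that parameter count:

```python
from tensorflow.keras import layers, models

# Tiny baseline CNN: 16-filter conv -> 4-filter conv -> flatten -> 10-way softmax
model = models.Sequential([
    layers.Input(shape=(32, 32, 3)),               # CIFAR-10 color images
    layers.Conv2D(16, (3, 3), activation='relu'),  # first conv layer, 16 filters
    layers.Conv2D(4, (3, 3), activation='relu'),   # second conv layer, 4 filters
    layers.Flatten(),
    layers.Dense(10, activation='softmax'),        # 10 CIFAR-10 classes
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])

print(model.count_params())  # roughly 32,000 parameters
```

Training would then be model.fit(x_train, y_train_cat, epochs=4, batch_size=512, validation_data=(x_test, y_test_cat)), with the large batch size chosen only to get results quickly.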
This is the input it accepts, and what it produces as output is 1x1x512, a very deep representation of the input. There is no dense layer here; these are all convolutional layers and max-pooling layers.

The next thing we would like to do is take our inputs, x_train and x_test, pass them through the convolutional base, and obtain the representations of our inputs produced by the VGG16 base. Let's run this. What this gives us is two NumPy arrays, the x_test VGG output and the x_train VGG output, containing the representations of our input images learned by the convolutional base of VGG16. Since these are latent representations, the output for each image is a 1x1x512 volume. In the test dataset we have 10,000 images, each represented as a 1x1x512 volume, and the 50,000 training images are likewise each represented as a 1x1x512 volume. So for each image we now have these 512 numbers, this 1x1x512 NumPy array, representing what the input image means.

The next thing we would like to do is reshape, or effectively flatten, these representations so that they can be fed to a dense layer. All we are doing here is taking the x_test VGG output and reshaping it: these are 10,000 images, so we keep the number of images the same, but a 1x1x512 volume is effectively just 512 numbers, so I can either keep 1x1x512 or simply flatten it to 512; it's the same thing. We are simply flattening these representations so that each image becomes a vector that can be fed to a deep learning model. If you look at the shapes now, they become 10,000 x 512 and 50,000 x 512.
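The feature-extraction and flattening steps can be sketched as below. The video loads weights='imagenet'; weights=None is used here only so the sketch runs without downloading the pretrained weights, and the four-image batch is a stand-in for the real CIFAR-10 arrays:

```python
import numpy as np
from tensorflow.keras.applications import VGG16

# Convolutional base only (include_top=False), resized to CIFAR-10 input.
# The video uses weights='imagenet'; weights=None avoids the download here.
conv_base = VGG16(weights=None, include_top=False, input_shape=(32, 32, 3))

# Pass images through the base to get the 1x1x512 latent representations
x_batch = np.random.rand(4, 32, 32, 3).astype('float32')  # stand-in batch
features = conv_base.predict(x_batch, verbose=0)
print(features.shape)       # (4, 1, 1, 512)

# Flatten each 1x1x512 volume into a 512-dimensional vector per image
features_flat = features.reshape(features.shape[0], 512)
print(features_flat.shape)  # (4, 512)
```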
Now our input images have, in a way, become rows in a table, so we can simply build a sequential model with dense layers. Let's say I add 256 neurons in the first layer, then a dropout of 0.5, and finally a dense layer that predicts 10 values. The input dimension for this model will be 512 values, since the 1x1x512 volume is basically 512 numbers. Then I can call model.compile and run model.fit. This time let's build the model and train slightly longer; say, instead of 6 epochs, let's actually do 32 epochs.

What we have done here is take the representations captured by the VGG16 base and feed them as inputs to our new neural network, which is only the classifier, the dense-layer part of the network; remember that our entire network is actually much bigger, because it includes the VGG16 base as well. We see that the accuracy on the validation set is around 60%, on the training dataset the accuracy is 61.2%, and so on. We could further choose a slightly different convolutional base, or even do fine-tuning, that is, unfreeze some of the convolutional layers in the VGG16 network, to further raise the accuracy on the validation set. So we have the training complete; of course you can incorporate early stopping and many other techniques to obtain higher accuracy. And here is the model accuracy, the learning curve showing the training accuracy and the validation accuracy over the epochs.
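The classifier head described above might be sketched as follows. The 256 units, 0.5 dropout, and 10 outputs follow the description; the optimizer, loss, and the small stand-in feature arrays (in place of the real 50,000 x 512 VGG16 outputs) are assumptions:

```python
import numpy as np
from tensorflow.keras import layers, models

# Dense classifier trained on the flattened 512-dim VGG16 features
model = models.Sequential([
    layers.Input(shape=(512,)),              # one flattened 1x1x512 volume per image
    layers.Dense(256, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax'),  # 10 CIFAR-10 classes
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])

# Stand-in features and one-hot labels in place of the real VGG16 outputs
feats = np.random.rand(64, 512).astype('float32')
labels = np.eye(10)[np.random.randint(0, 10, 64)]
history = model.fit(feats, labels, epochs=2, batch_size=32, verbose=0)

preds = model.predict(feats, verbose=0)
print(preds.shape)  # (64, 10)
```

Because only this small head is trained, each epoch is cheap; the expensive VGG16 forward pass over the dataset happens once, up front.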
Info
Channel: Badri Adhikari
Views: 429
Rating: 5 out of 5
Keywords:
Id: 2xrzJibPl3c
Length: 9min 9sec (549 seconds)
Published: Mon Apr 26 2021