TensorFlow Tutorial 11 - Transfer Learning, Fine Tuning and TensorFlow Hub

Video Statistics and Information

Captions
In this video I'll show you how to use pre-trained models, including how to freeze layers and do fine-tuning, so let's get to it right after this beautiful intro.

The code in front of us should feel very familiar; we've used these imports in pretty much all of the videos, except for one: tensorflow_hub. I'll get into what that is a little later in the tutorial. If you don't have it, just Google "conda tensorflow hub" and you'll get to the install page (it's also in the description), where you can install it with the conda command; if you're using pip, it's just pip install tensorflow-hub.

What I want to do in this video is essentially three things. First, I'll show you how to reuse a model that you've previously trained yourself (check out my previous video on saving and loading models if you're unfamiliar with that). Second, I'll show you how to use a pre-trained Keras model; Keras has a bunch of pre-trained models that you can import very easily. Lastly, I'll show you how to load pre-trained models from TensorFlow Hub.

For our own pre-trained model, I'm going to copy in the MNIST dataset loading, which we've done in multiple videos, and then load a pre-trained model: model = keras.models.load_model("pretrained/"). This could be a model you've trained yourself, or one you've found on GitHub. Then you can do model.summary() and check which parts of the model you want. If you want the entire model, loading it is all there is to it and you can continue training. But normally, when we're doing transfer learning, we pick out a subset of the layers. Say we want everything except the last layer: the pre-trained model might have 1,000 classes for ImageNet, but for MNIST we have 10 classes, so we have to replace the last dense layer with our own while reusing the earlier layers from that model.

Here's how. We define base_inputs from the layer we want to start at, the first one, so model.layers[0].input. Then we define base_outputs; I call these "base" because we're going to use this pre-trained model as our base and put a new layer on top of it. You can pick the cut-off layer either by counting from the front (0, 1, 2, ...) or from the back (-1, -2, ...), so we do model.layers[-2].output, which is the output of the flatten layer; in other words, we're removing the last dense layer. There are other ways to do this as well, for example model.get_layer with the layer's name, but we'll stick with the index.
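Here's a minimal sketch of this first part, assuming the saved model takes flattened 28x28 MNIST inputs and lives in a folder called pretrained/ (the folder name is from the video; the exact preprocessing is my assumption):

    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers

    # MNIST, flattened and scaled to [0, 1] (assumed preprocessing)
    (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
    x_train = x_train.reshape(-1, 28 * 28).astype("float32") / 255.0
    x_test = x_test.reshape(-1, 28 * 28).astype("float32") / 255.0

    # Load the previously trained model (path taken from the video)
    model = keras.models.load_model("pretrained/")
    model.summary()

    # Reuse everything up to, but not including, the final dense layer
    base_inputs = model.layers[0].input
    base_outputs = model.layers[-2].output  # output of the flatten layer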
Next we build our own output on top. This could be a whole sequential model, but we'll just add a single layer: final_outputs = layers.Dense(10)(base_outputs), so the data runs through the base first and the output of base_outputs is fed into this final layer. Then new_model = keras.Model(inputs=base_inputs, outputs=final_outputs), and new_model.summary(). I'm calling it new_model since we've changed the original. Interestingly, we didn't actually change anything here: we replaced the last layer with an identical one, because the pre-trained model already ended in a 10-node dense layer. But you get the point; you could swap in whatever you want, with a different number of classes and so on. This is just a simple example to illustrate the mechanics. Printing new_model.summary() gives the exact same architecture, but the last layer is now a new one: if we changed the 10 to 15, the final layer would have 15 output nodes. Of course we want 10 in this case.

Then you proceed as normal. I'll copy in the compile and fit, since we've seen those in previous videos, so new_model.compile(...) and then new_model.fit(...). After just a single epoch it's over 97% accuracy, which suggests the pre-training had some effect, and after three epochs it's at almost 99%.

Now, say you don't want to train the entire pre-trained model, which is the case when you're fine-tuning. Then you have to freeze the layers of the pre-trained model, and the simplest way is model.trainable = False, which freezes all of the layers. You can also iterate through the layers with for layer in model.layers; since the one-liner already set every layer to not be trainable, we can do assert layer.trainable == False inside that loop. But if you only wanted to freeze specific layers, you could iterate through a slice, say model.layers[1:5], and set layer.trainable = False on each. So those are two ways of doing the same thing. (I noticed an error while editing the video: I wrote layers.trainable = False instead of layer.trainable = False. For some reason it still ran, but it was just a typo; layer.trainable = False is what we want, and the one-liner already takes care of it anyway.)
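Putting the head on top and training, as a sketch; the optimizer, loss, batch size, and from_logits setting are my assumptions about the compile step that's copied in off-screen:

    # Freeze the whole pre-trained base so only the new head trains
    model.trainable = False
    # ...or, to freeze only a subset, pick out specific layers:
    for layer in model.layers[1:5]:
        layer.trainable = False

    # New 10-class head replacing the original final dense layer
    final_outputs = layers.Dense(10)(base_outputs)
    new_model = keras.Model(inputs=base_inputs, outputs=final_outputs)
    new_model.summary()

    new_model.compile(
        optimizer=keras.optimizers.Adam(),
        loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    new_model.fit(x_train, y_train, batch_size=32, epochs=3, verbose=2)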
The benefit of freezing shows up when we rerun the training: previously one epoch took about 15 or 16 seconds, and with the base frozen it takes almost half that. So freezing layers for fine-tuning makes training run much faster, and the common use case for pre-trained models is exactly this: take a gigantic model, freeze the layers up to some point, and add a couple of dense layers at the end for your specific use case. We also got about the same, even slightly better, performance here, but that's because the layers we imported had already been trained on MNIST.

So that's the first scenario, where you load a model of your own. I'll remove that code and move on to the next one: using a pre-trained Keras model. The Keras library has a lot of models you can import very easily, and I'll show a use case that's very similar to what we just did, but through the Keras API for those models.

Let's create some random data just to have something to run the model on: x = tf.random.normal(shape=(5, 299, 299, 3)), so five examples of 299x299 images with three RGB channels; this shape fits the model we're about to import. Then y = tf.constant([0, 1, 2, 3, 4]), so five examples, each of a different class. Then model = keras.applications.InceptionV3(...): there are a bunch of models under keras.applications, and I'm picking Inception V3. You can pass several arguments (see the official documentation), but one of the most important is include_top=True or False: with include_top=False, the final fully connected layers are removed and you just obtain feature vectors that you can feed into your own sequential model or similar. Let's start with include_top=True and print model.summary() to see what it looks like. For Inception V3 there's only one fully connected layer, at the absolute end; I'm not assuming you're familiar with the Inception module, but essentially the network concatenates several convolutional branches at each block, does a global average pooling, and then finishes with a fully connected layer. include_top=False would simply remove that final fully connected layer. So, very similarly to what we did before, we do base_inputs = model.layers[0].input and base_outputs = model.layers[-2].output.
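A sketch of the keras.applications version; InceptionV3 loads ImageNet weights by default, which I'm assuming is what the video relies on:

    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers

    # Five random 299x299 RGB "images" with five distinct labels
    x = tf.random.normal(shape=(5, 299, 299, 3))
    y = tf.constant([0, 1, 2, 3, 4])

    model = keras.applications.InceptionV3(include_top=True)
    model.summary()

    # Cut off the final fully connected layer, add a 5-class head
    base_inputs = model.layers[0].input
    base_outputs = model.layers[-2].output
    final_outputs = layers.Dense(5)(base_outputs)
    new_model = keras.Model(inputs=base_inputs, outputs=final_outputs)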
This removes the last fully connected layer; of course, if there were several fully connected layers at the end, you could use -3, -4, and so on to remove exactly as many as you want. Then final_outputs = layers.Dense(5)(base_outputs), five nodes because we have five classes, and new_model = keras.Model(inputs=base_inputs, outputs=final_outputs), very similar to what we did previously. Then you compile as before; I'll copy that in to save time: Adam and sparse categorical cross-entropy, nothing new. Then new_model.fit(x, y, epochs=15, verbose=2); this should be very quick, since we just have five random data points. Let's see if it can overfit five data points using this gigantic Inception V3 network. ("base_inputs is not defined"; right, a typo in base_inputs, fixed.) Training for those 15 epochs went pretty quickly, and as we can see it got 100% accuracy with a very low loss. Of course, this was just a demonstration of how to import models through keras.applications.

What I want to show you now is how to use TensorFlow Hub. TensorFlow Hub, at tfhub.dev, is essentially a place where you can get a lot of different pre-trained models for different scenarios. Say we just want image models: there are a lot of models you can browse through and check. Let's take Inception V3 again, and specifically the feature-vector version, which is analogous to include_top=False in Keras: TensorFlow Hub separates Inception V3 into one model that includes the fully connected top and one that just returns the feature vector. You copy the model's URL and go back to the code.

Again some random data, tf.random.normal with shape (5, 299, 299, 3), exactly what we just did when loading from Keras, and we're loading the same model, so nothing new here, just how to do it with TensorFlow Hub. You set url = "..." (the pasted URL), then base_model = hub.KerasLayer(url, input_shape=(299, 299, 3)). Then model = keras.Sequential([...]) with the base model first, which does not include the fully connected layers, followed by whatever layers you want: layers.Dense(128, activation="relu"), layers.Dense(64, activation="relu"), and a final layers.Dense(5), since we have five classes. Then again model.compile and model.fit, which I'll just copy in.
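A sketch of the TensorFlow Hub version; the URL below is the Inception V3 feature-vector module as I'd expect it to appear on tfhub.dev (check the site for the current path and version), and the compile settings are the same assumptions as before:

    import tensorflow as tf
    import tensorflow_hub as hub
    from tensorflow import keras
    from tensorflow.keras import layers

    x = tf.random.normal(shape=(5, 299, 299, 3))
    y = tf.constant([0, 1, 2, 3, 4])

    # Feature-vector version of Inception V3 (like include_top=False)
    url = "https://tfhub.dev/google/imagenet/inception_v3/feature_vector/4"
    base_model = hub.KerasLayer(url, input_shape=(299, 299, 3))
    base_model.trainable = False  # freeze the hub module; train only the head

    model = keras.Sequential([
        base_model,
        layers.Dense(128, activation="relu"),
        layers.Dense(64, activation="relu"),
        layers.Dense(5),
    ])

    model.compile(
        optimizer=keras.optimizers.Adam(),
        loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    model.fit(x, y, epochs=15, verbose=2)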
One thing I missed showing for the Keras model: exactly as in the very first example, you can do base_model.trainable = False (included in the sketch above), so that you're freezing the base and fine-tuning only the new layers. I probably should have shown that for the other one too, because this is something you can do for all of these models. Just for the example, let's run this and see what it looks like. In this scenario it didn't overfit quite as hard, so we'd probably have to run it for longer, but as we can see it at least reached 100% accuracy.

And that's pretty much it; those are the examples I wanted to show you: different ways to do transfer learning, fine-tuning, freezing layers, and so on. Hopefully this video was useful for you. If you have any questions, leave them in the comments, and I hope to see you in the next video.
Info
Channel: Aladdin Persson
Views: 39,812
Keywords: tensorflow transfer learning, tensorflow 2.0 transfer learning, tensorflow fine tuning, tensorflow 2.0 fine tuning, tensorflow 2.0 tensorflow hub, tensorflow hub transfer learning, tensorflow transfer learning tutorial
Id: WJZoywOG1cs
Length: 17min 11sec (1031 seconds)
Published: Wed Aug 19 2020