How To Train Deep Learning Models In Google Colab- Must For Everyone

Captions
Hello all, my name is Krish Naik and welcome to my YouTube channel. Today in this video I'm going to show you how you can train your deep learning models in Google Colab. Why am I making this video? Because I see many people who just cannot afford GPUs for training, and Google Colab is a wonderful option: you can train on any kind of data, whatever the size. Even if you have a huge dataset you'll be able to train, though you have to be a little patient because it may take more time. Google Colab is a boon for students and for people who cannot afford GPUs or heavy laptops and workstations, so this is the way you should try to learn. In the initial days of my career in the data science industry I used Google Colab a lot for executing and training my models. So in this video I'm going to show you an example of how to train a deep learning model inside Google Colab; it will be pretty amazing, just follow the steps I describe. This is important because in the future I'll be coming up with videos related to BERT and so on, and I probably won't be doing those on my workstation, because not everybody has a GPU; I'll be training them in Google Colab and showing you all the examples. This was also one of the pieces of feedback I got from some of my subscribers. So let's go ahead and train this deep learning model.

In this example I've got a dataset called the tomato leaf disease classification dataset. In this dataset you have train and valid folders, and you won't believe how many categories there are: around ten categories of different types of plant disease with respect to tomato leaves. You can download it; the whole file is somewhere around 367 MB, and I have already downloaded this dataset. If I open the images you'll be able to see all of them, covering the different leaf diseases: this kind of disease is called pepper bell bacterial spot, and if you really want to see the healthy leaves, this is what a healthy leaf looks like in the tomato leaf classification. Apart from that you also have different categories like bacterial spot and late blight, and as you can see it is all nicely classified, with many images per class. So let's see how quickly this can be done. The first step is just to download this whole dataset. This is one way of getting data into Google Colab; in the upcoming videos I'll show you more ways, and you may be asking whether it is possible to directly pick up datasets from Kaggle — we'll see all of that in upcoming videos too, but this one is pretty simple. So download the whole dataset and save it on your local machine.
Once it is saved locally, go inside your Google Drive and upload this whole dataset. You can see that I have uploaded all of it. I hope you have worked with Google Drive; it is very simple: just click New, choose Folder upload, select that specific folder, and it gets saved at whatever path you choose. I have My Drive > programs, inside that I have the dataset, and I have uploaded the train and test folders; here you can see all the images inside. This is the first step, and since there are many images it may take some amount of time, so be patient; it also depends on your internet speed. I've also purchased some additional Google storage; the price was quite low, and I have around 100 GB which I use for personal purposes like training deep learning models in Google Colab. As you know I'm also coming up with a playlist on BERT, which is another reason I took 100 GB. You can also take 1 TB or 2 TB based on your requirement, but it is not compulsory to buy anything, since by default you get 15 GB. You can see that I have crossed the 15 GB mark; I'm at around 24.1 GB.

Now that the first step is done and the dataset is uploaded, the next thing is to open Google Colab. The reason I put this dataset inside Google Drive is that I'm going to access it from inside my Colab notebook. So go and open Google Colab; here you can see I have created one .ipynb file. If you have not created one, just click on New Notebook and a new notebook file will get created. Right now I'm just going to open this colab-tomato-disease .ipynb file, which is also stored in Google Drive itself; you can also take it from GitHub, or upload the .ipynb file using the upload option. I will definitely give you this file on GitHub, so you can either load it from GitHub or from the upload section; this is pretty simple. Now, the next thing we're going to do: I'll close this and imagine I have opened that file. This is the whole code that we are going to run, and I'll tell you the steps as we go; I have also explained this code in different image classification problem statements, including how you train your model, so don't worry, I'll show you all the steps. As soon as you open the notebook, click on the folder icon on the left. I have to connect to my Google Drive, because I want to access that dataset: I have to be able to read it from inside Google Colab itself. For that you will see one icon over here, which is Mount Drive. This is pretty important.
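For reference, the Mount Drive click has a programmatic equivalent; a minimal sketch, assuming the example folder layout used in the video (My Drive > programs > Dataset):

```python
# Mount Google Drive inside the Colab runtime (same effect as clicking
# the "Mount Drive" icon in the Files panel).
from google.colab import drive
drive.mount('/content/drive')

# The uploaded data then shows up under the drive folder, e.g.
#   /content/drive/MyDrive/programs/Dataset/train
#   /content/drive/MyDrive/programs/Dataset/test
# (example paths -- use whatever folder you actually uploaded to)
```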
Before clicking it, though, one more thing: you also have to change your runtime to GPU. Initially it will be None, so make sure you change the runtime type to GPU. If I go and check my runtime it is already GPU, so I'm not going to change it, but check your notebook settings to see whether it is GPU, TPU or None, and make sure you set it to GPU. Once you do that, the next option is to click on Mount Drive. It will ask for permission to connect to your Google Drive, which basically means you need to be logged into your Google account inside the browser; I'm using Google Chrome here. Once I connect to Google Drive it takes some time, and then you'll see a folder called drive appear, alongside the sample data folder that is already present for practice purposes. If I open the drive folder, inside My Drive you'll be able to see the path (there are some of my personal documents in there, so ignore those). Inside programs you'll see the dataset, and inside the dataset you have test and train. If you really want to find out the path of the dataset, just right-click and choose Copy path; you'll get the path of the dataset, and then you can append /train and /test to read the images, since those folders hold the training and test images. So I'm just going to keep that path handy.

Now let me close this. The first command you should execute is !nvidia-smi; this will help you understand which GPU has been allocated. Here it says Tesla T4. Let me execute it once again: sometimes you may get a Tesla T4, sometimes you will get some other GPU. If you really want to change the GPU, you can do a factory reset of the runtime, and after that you may get another GPU, but most of the time you will get a Tesla T4. In one of my videos where I explained RAPIDS AI, the requirement there was a Tesla T4 or some other higher GPU; with anything less it will not work. So I'm just going to click Connect to reconnect; let me see whether my drive is there. It is initializing and saying "unable to connect", but don't worry, it will mount the Google Drive automatically as soon as it is connected, and it may definitely take some time. Yes, it is connected, my Google Drive is here. I'll execute !nvidia-smi again and see whether my GPU changes: now I've got a Tesla K80. So it varies; try to find out which GPU is better for you, a Tesla T4 or a Tesla K80. I'll execute it again; right now I'm getting a Tesla K80. Suppose I again want to change: I can do another factory reset of the runtime. If you don't know about the Tesla K80, just search for "Tesla K80 vs Tesla T4" and select whichever is better.
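As a cell, the GPU check above looks roughly like this; the TensorFlow device listing is an extra sanity check I'm adding here, not something shown in the video:

```python
# Shell command: prints the allocated GPU, e.g. "Tesla T4" or "Tesla K80".
!nvidia-smi

# Optional sanity check from Python (an extra, assumed step):
import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))
```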
Suppose the Tesla K80 is not that good compared to the Tesla T4. So I've done the factory reset step again, I'm going to reconnect, and let's see which GPU we get this time. The easiest way to find out which performs better is just to compare the prices: the Tesla K80 with 24 GB costs somewhere around 32,000 INR, and the Tesla T4 is somewhere around 2,199 dollars, so it is at least better than the Tesla K80; I think the K80 will not perform as well. This is connected, so let me execute once again and see which GPU I have: now I've got a Tesla T4 (sorry, earlier I kept saying P4; I meant T4). You may also get some other GPU allocated; it depends on your luck and on how heavily Colab is being used, but every time you do a factory reset of the runtime, some other GPU may get allocated to you.

Once this is done, the first thing you have to do is install the TensorFlow GPU build; for that you can write !pip install tensorflow-gpu with the exclamation mark. For me it says "requirement already satisfied" because I had already done everything, and then it collects and installs the tensorflow-gpu package. Remember, the TensorFlow version that we'll be using is 2.3.0, the most recent one, and this executed perfectly fine; if you really want to see the version of TensorFlow, here it is, 2.3.0. One more thing I really wanted to show: this is your RAM. The overall RAM size is about 12.72 GB, the maximum disk size is somewhere around 68.40 GB, and I think 31.65 GB is getting used right now.
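A small sketch of the install-and-verify step, assuming a TF 2.3-era Colab image (on newer images plain tensorflow already ships with GPU support, so the install line may just report that the requirement is satisfied):

```python
# Install the GPU build of TensorFlow (may already be satisfied on Colab).
!pip install tensorflow-gpu

# Verify the version -- the video shows 2.3.0.
import tensorflow as tf
print(tf.__version__)
```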
After this I'm going to execute all my imports — all the libraries I want — and in this case I'm going to use InceptionV3. Next comes something pretty important: the training and the valid paths. As I told you, suppose my data is under programs > Dataset > train: if I want to give this path, I just copy it, paste it here in quotes, and the whole path gets selected; this is how you have to select your path. Similarly, find the train path and the test path just by right-clicking and copying the path. Please make sure you have this correct; if you're running locally, you just have to give your local path instead. This is a transfer learning technique, and here I'm going to use InceptionV3, so let me import InceptionV3 (sorry, I had misspelled InceptionV3 at first). Here you can see it downloading the pre-trained weights, and it executed perfectly.

The next step is to make all the layers non-trainable, which I have also explained in my previous deep learning videos: for layer in inception.layers, set layer.trainable = False, because I'm using a transfer learning technique and I want to reuse all the pre-trained weights. And remember, include_top=False basically says that I don't want the network's original input and final classification layers, since I'm supplying my own input size and output head; that is why I'm using it. Once I execute this, it's done. Next, I really want to know how many output classes I need, and for that I go and check inside my train folder to see how many classes there are: I glob the folder, get all the class subfolders, and the length of that list of folders is the number of classes. Then I flatten the output of the Inception model I downloaded, since I'm not keeping its last layer, and after that I add my own Dense layer on top, where the length of the folders list gives the number of output nodes I want; Dense and Flatten are imported above. And yes, I have to use softmax as the activation. Remember: if you have just two categories in the output you can use sigmoid, but if you have multiple categories you should go with softmax. So that is done. Then I create a Model object with inputs as inception.input and outputs as the prediction layer, which has one node per output class with the softmax activation, stacked on the flattened output of the Inception base. Once I execute this, model.summary() shows the network. You have to focus on the last layer: it has 15 output nodes, so inside the train folder we actually have 15 classes. You should also look at the first layer: InceptionV3 is a very big architecture (just go and check it), and in the input layer you can see 224 x 224 x 3.
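Putting those steps together, here is a minimal sketch of the transfer-learning setup described above, using the Keras applications API; the Drive paths are examples from the video, not fixed values:

```python
from glob import glob
from tensorflow.keras.applications.inception_v3 import InceptionV3
from tensorflow.keras.layers import Flatten, Dense
from tensorflow.keras.models import Model

IMAGE_SIZE = [224, 224]
train_path = '/content/drive/MyDrive/programs/Dataset/train'  # example path
valid_path = '/content/drive/MyDrive/programs/Dataset/test'   # example path

# ImageNet weights; include_top=False drops the original classification
# head so we can attach our own.
inception = InceptionV3(input_shape=IMAGE_SIZE + [3],
                        weights='imagenet',
                        include_top=False)

# Freeze the pre-trained layers: only the new head will be trained.
for layer in inception.layers:
    layer.trainable = False

# One class per subfolder of train/.
folders = glob(train_path + '/*')

# Flatten the Inception output and add a softmax layer, one node per class.
x = Flatten()(inception.output)
prediction = Dense(len(folders), activation='softmax')(x)

model = Model(inputs=inception.input, outputs=prediction)
model.summary()
```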
If I go and search for the InceptionV3 architecture — and let me know if you really want to understand all these architectures, I can also create a video on them — this is how an InceptionV3 architecture looks: module A, module B and module C with grid size reduction, and apart from that you can see which operations are used: convolution, average pooling, max pooling, concatenation, dropout, fully connected and softmax. All of these are used inside it. Let me create a video on all the transfer learning techniques; that will be very helpful, because you'll understand the architectures. Anyhow, apart from that you just have to use the pre-trained weights from Keras, so I think it is pretty easy.

So we have done this; the next step is to compile, and here we are going to use categorical cross-entropy along with Adam, with accuracy as the metric; I think this is pretty common for everyone. Now this is very important: for the training dataset I'm applying rescaling, a shear range and a zoom range; these are for data augmentation, and make sure you rescale, it is very important. For the test dataset we don't do all these things; there we will just do the rescaling. Don't apply data augmentation techniques to the test dataset, only apply the scaling, but on the training dataset you apply all of them. The library for all of this is ImageDataGenerator, so I'll execute that. Now I'll read all my training data with flow_from_directory from the path /content/drive/MyDrive/programs/Dataset/train, with target size 224 x 224, batch size 32 and class_mode 'categorical'. Actually, make the batch size 16, because the RAM here is less compared to the RAM in my workstation, which was somewhere around 16 GB; so I'll make this 16 as well, and the remaining settings are good to go. Executing this, you can see there are around 18,540 images in the training set and around 1,869 images in the test set.
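A hedged sketch of the compile step and the two data generators just described; the 0.2 shear and zoom values are illustrative assumptions, since the video only names the options:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# Augmentation only on the training data; the test generator just rescales.
train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,   # illustrative value
                                   zoom_range=0.2)    # illustrative value
test_datagen = ImageDataGenerator(rescale=1./255)

# Batch size 16 to fit Colab's smaller RAM, as discussed above.
training_set = train_datagen.flow_from_directory(train_path,
                                                 target_size=(224, 224),
                                                 batch_size=16,
                                                 class_mode='categorical')
test_set = test_datagen.flow_from_directory(valid_path,
                                            target_size=(224, 224),
                                            batch_size=16,
                                            class_mode='categorical')
```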
This will take time; I'm not saying two to three days, but at least half an hour to an hour. On my workstation this whole problem took me somewhere around 25 minutes for 10 epochs, but there I have 64 GB of RAM and a Titan RTX with 24 GB of VRAM, so all the processing is super fast; let's see how much time this takes, and just by watching one epoch I think you'll be able to estimate it. Yes, the training has started, the first epoch is running; I have given the number of epochs just to give you an idea. The model training will now happen, and I still have to write the further code, which will check how the training accuracy and validation accuracy behave by plotting a graph; I have done this kind of explanation in my live project playlist where I discussed deep learning projects. So this will take some time; the estimated time shown is somewhere around 1 hour 27 minutes.

You can watch whether the RAM usage increases; the batches are taken at 16, the loss is 16.46 and the accuracy is around 0.1250 so far; let's see how much time it takes. Now, when I took 16 as the batch size, why am I getting 1159 steps? That comes from the total number of images, 18,540, divided by 16. Let me just open the calculator: 18540 divided by 16 is 1158.75, and since that gets rounded up to a whole number of batches, you see 1159 steps per epoch. After some time you'll see the loss decreasing and the accuracy increasing, and yes, it will take time because there are more than 18,000 images. So just try it yourself; you only have to be patient for around an hour, keep it running, and the accuracy will keep updating. Let's see how much time it takes. Again, the overall idea was to show you how you can train deep learning models on images, because in the upcoming videos I'll be showing you BERT, transformers and advanced NLP techniques, and since many of you don't have a workstation, Google Colab will be the best option so that you can still learn efficiently. For the people who do have a workstation, you can just do the installation and all the execution will happen locally. And in the future I'm also making a Linux partition on my workstation, so that you get an idea of how to train models and work in Linux.

Here you can see that initially the accuracy was 0.18 and now it has increased to 0.26; I'll show you at least one epoch and we'll see the accuracy. Let me fast-forward this; let it run, it'll be pretty amazing. So this is what you can do with any dataset. Since I have already tried it out on my local system, let me show you that: I activate my environment, cd into my deep learning folder and open Jupyter Notebook (I had created a separate environment for this; there is also an NYC project in there which I was doing for the members). This was the tomato disease project, yes, PlantVillage. Here you can see that with InceptionV3, when I did the training locally — and there I had increased the batch size, I think I made it 32 — each epoch was taking somewhere between 163 and 481 seconds, but just after the first epoch I was already getting around 71% training accuracy and around 77% validation accuracy, and at the end of the day the training accuracy was 0.90 and the validation accuracy was 0.9005. I will be giving this whole code to you; all you have to do is change the path based on where you upload the data in your Google Drive. Until then, this will take some time; you can already see that my loss has decreased from 16 to 10 and my accuracy is also increasing.
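For completeness, a sketch of the training call and the accuracy plot the video refers to, assuming the generators defined above; the 10 epochs match the workstation run mentioned, and the rest follows standard Keras usage:

```python
import matplotlib.pyplot as plt

# Train the new head; steps per epoch come from the generator length,
# e.g. ceil(18540 / 16) = 1159 for the training set.
history = model.fit(training_set,
                    validation_data=test_set,
                    epochs=10,
                    steps_per_epoch=len(training_set),
                    validation_steps=len(test_set))

# Plot training vs. validation accuracy over the epochs.
plt.plot(history.history['accuracy'], label='train accuracy')
plt.plot(history.history['val_accuracy'], label='validation accuracy')
plt.legend()
plt.show()
```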
In short, it will look something like this. Just make sure that when you are training in Google Colab you set the batch size to 16; that is it, and then just execute it. I hope you understood this and found this video very useful. Please do subscribe to the channel if you haven't already; I'll see you in the next video. Have a great day, thank you, and bye-bye. Cheers to Google Colab, cheers to Google. Thank you.
Info
Channel: Krish Naik
Views: 97,269
Rating: 4.891892 out of 5
Keywords: data science, machine learning, deep learning, computer vision, google colab
Id: chQNuV9B-Rw
Length: 24min 25sec (1465 seconds)
Published: Sun Sep 06 2020