Bengali.AI: Handwritten Grapheme Classification Using PyTorch (Part-2)

Video Statistics and Information

Captions
So, to begin with the models, we'll create some model files. I'm going to create a file called models.py and another called model_dispatcher.py, and we'll define our models in models.py. There's a PyTorch library called pretrainedmodels that now ships a lot of pretrained models, so we're not going to build anything from scratch, obviously — we'll just use one of the pretrained models. You install it with `pip install pretrainedmodels`. As you can see, you import pretrainedmodels and you get a lot of different models, mostly trained on the ImageNet dataset. You load one like this: you give a model name and specify what kind of pretrained weights you want it loaded with — most of them are ImageNet weights, and you can also pass None if you don't want pretrained weights at all. Then you can call eval() and look at the number of output classes; here it's 1000, but in our case we have a multi-output classification problem. Let's look at some of the models. I'll start with a small model like ResNet-34 — you also have ResNeXt, Inception v3, AlexNet and so on in there, but I'm just trying to find the definition of ResNet-34. It's in torchvision.models: it loads models.resnet34(pretrained=False), and if you ask for pretrained ImageNet weights, it downloads and loads them for you. On top of that, pretrainedmodels has this function modify_resnets. What is it doing? It changes the last layer, so the final layer is called last_linear
instead of fc, which is what torchvision uses. You also have a function called features that generates the feature maps, and the forward function changes a bit. What we need to do is something similar for our specific problem, and implementing it is quite easy if you use pretrainedmodels. So let's take a look at how to implement ResNet-34 for this problem. You import pretrainedmodels and define a new class — let's call it ResNet34, why not — which inherits from torch.nn.Module, just like every model defined in PyTorch. In __init__(self, pretrained) you first call super().__init__() — whenever you define torch models you inherit from this module — and then we need a base model: self.model = pretrainedmodels.__dict__["resnet34"](...). It takes a parameter called pretrained, so I say: if pretrained is True, pass pretrained="imagenet"; otherwise pass None — I need that for model testing. We can check this, so let me open a terminal — or I can just use a Jupyter notebook for this. I'll rename the earlier notebook to dataset_view, create a new one, import pretrainedmodels, instantiate the model, and look at it. It has 2D convolution layers, batch norm — it's a big model and you know what it looks like. I'm basically interested in the final part: with torchvision you have fc, with pretrainedmodels you have last_linear, with 512 input features and 1000 output features. We need to change that, so we go back and define a few extra layers: self.l0 is an nn.Linear layer with 512 inputs and 168 outputs, and similarly we have l1 and l2, which will have 11 and 7 outputs. Why 168, 11 and 7? Because there are 168 grapheme roots, 11 vowel diacritics and 7 consonant diacritics. Then we write a forward function: def forward(self, x) takes a batch; from x.shape you take the batch size and channels — you don't need height and width. Then x = self.model.features(x) — we saw there's a features function, so we just extract the features — and then we apply adaptive average pooling: from torch.nn import functional as F, and F.adaptive_avg_pool2d with the features as input and output size 1, reshaped to (batch_size, -1). Once you have that, l0 = self.l0(x), similarly l1 and l2, and you return l0, l1, l2. That's your model — wait, we're missing something: the pretrained argument, which is used here. This is not very complicated; if you look at notebooks and discussions in this challenge you'll find a lot of people have implemented it the same way. I was implementing it a bit differently: I was taking the whole ResNet class and modifying features and forward. You can also do that, but I think this is a much simpler way to implement the model — one of my teammates from this competition told me this way is better, so I'll use it instead of the other one. It also supports all kinds of image sizes. The next thing we need to do is go to our model_dispatcher: I import models and create a dictionary that dispatches the model — why that's useful, I'll come to later. So: "resnet34" maps to models.ResNet34. We're done with this, and now we can check that it works, so I'm going to go back to my notebook.
From model_dispatcher I import MODEL_DISPATCHER, and I say model = MODEL_DISPATCHER["resnet34"](pretrained=False). It complains about an unexpected keyword argument — sure, I need to restart the kernel. Let's try again: something is still wrong, because of course I changed the class name, so let me fix that. Okay, cool — now this is our ResNet-34 for this specific problem, and you can see the l0, l1, l2 layers after last_linear, with 512 inputs each and outputs of 168, 11 and 7. So we're good to go, and we can finally create our training file, train.py, and start coding the training and validation loops — which is quite simple if you've done it a few times. A few things I always want to define when I start a training file: what device I'm using, and where my training folds CSV file is. I'm going to read these values from environment variables, so I import os, and also ast — I'll tell you why. Image height comes from an environment variable, and the same for image width, the number of epochs I want, the batch size for training and the batch size for testing; then I also want the model mean, model standard deviation, training folds, validation folds, and the model I want to use. I'm just a bit lazy, so: image width, epochs, training batch size, test batch size — those are ints. Model mean and model std won't be ints, so here I use ast.literal_eval. What it does: your environment variables are strings, and these two are going to be stringified tuples or lists, so literal_eval converts them from a string back to a normal tuple or list. Then I need the training folds and validation folds, and also the
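The top of train.py, as described, can look like this. The `setdefault` calls are my addition so the sketch runs standalone — in practice run.sh exports these variables before launching the script — and the mean/std values are the standard ImageNet statistics, which I am assuming here:

```python
import ast
import os

# fallback values so the sketch runs standalone; normally run.sh exports these
os.environ.setdefault("IMG_HEIGHT", "137")
os.environ.setdefault("IMG_WIDTH", "236")
os.environ.setdefault("EPOCHS", "50")
os.environ.setdefault("TRAIN_BATCH_SIZE", "64")
os.environ.setdefault("TEST_BATCH_SIZE", "8")
os.environ.setdefault("MODEL_MEAN", "(0.485, 0.456, 0.406)")  # ImageNet stats (assumed)
os.environ.setdefault("MODEL_STD", "(0.229, 0.224, 0.225)")
os.environ.setdefault("TRAINING_FOLDS", "(0, 1, 2, 3)")
os.environ.setdefault("VALIDATION_FOLDS", "(4,)")
os.environ.setdefault("BASE_MODEL", "resnet34")

DEVICE = "cuda"
IMG_HEIGHT = int(os.environ["IMG_HEIGHT"])
IMG_WIDTH = int(os.environ["IMG_WIDTH"])
EPOCHS = int(os.environ["EPOCHS"])
TRAIN_BATCH_SIZE = int(os.environ["TRAIN_BATCH_SIZE"])
TEST_BATCH_SIZE = int(os.environ["TEST_BATCH_SIZE"])
# env vars are strings; ast.literal_eval turns "(0.485, ...)" back into a tuple
MODEL_MEAN = ast.literal_eval(os.environ["MODEL_MEAN"])
MODEL_STD = ast.literal_eval(os.environ["MODEL_STD"])
TRAINING_FOLDS = ast.literal_eval(os.environ["TRAINING_FOLDS"])
VALIDATION_FOLDS = ast.literal_eval(os.environ["VALIDATION_FOLDS"])
BASE_MODEL = os.environ["BASE_MODEL"]
```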
base model. The same goes for the training and validation folds: I have to use ast.literal_eval for them as well. So we've completed the first part, where we define all the parameters for training the model, and now we can start writing the training and validation loop. Start with the main function — let's just call it main — the function that's called when you run the script. From model_dispatcher import MODEL_DISPATCHER, and then model = MODEL_DISPATCHER[BASE_MODEL](pretrained=True), because this is the training loop, and model.to(DEVICE). Okay, now we need training and validation datasets, right? The train dataset comes from the dataset module: from dataset import BengaliDatasetTrain, and train_dataset = BengaliDatasetTrain(folds=TRAINING_FOLDS, img_height=IMG_HEIGHT — the height you wanted after resizing — img_width=IMG_WIDTH, mean=MODEL_MEAN, std=MODEL_STD). Now we create a data loader for the training dataset: train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=TRAIN_BATCH_SIZE, shuffle=True, num_workers=4) — for training, and even for validation, you can use shuffle, it doesn't matter, but don't use it for testing, obviously; and I want four workers, not many. I'll copy the whole thing and do the same for the validation set — oh damn, I shouldn't have copied that, we'll fix it manually: valid_dataset with folds=VALIDATION_FOLDS, everything else the same, and this becomes valid_loader over valid_dataset with batch_size=TEST_BATCH_SIZE. Shuffle true or false doesn't matter here, because the dataset object returns the labels even for validation, but I'm just going to set it to False. The next thing you want is an optimizer; I'm just going to use the Adam optimizer on all the parameters with a learning rate of 1e-4 — you can experiment with that, and you can also set differential learning rates for different layers. You probably also need a scheduler, and when you use a scheduler from torch you have to remember: some schedulers need to step after every batch, some after every epoch — that's very important, keep it in mind. I want torch.optim.lr_scheduler.ReduceLROnPlateau: when my model's score plateaus, reduce the learning rate. It takes the optimizer as input, plus a mode that can be "max" or "min" — if you track a loss it should be "min", but I'm setting it to "max" because we'll watch the recall score and reduce the learning rate based on that — patience=5 and factor=0.3; I think that's good enough. Let's also set verbose to True. Great. Now comes training the model, but before that, one more thing: if you have multiple GPUs in your system, you can wrap your model in DataParallel, and that's going to make things faster. So: if torch.cuda.device_count() is greater than
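The loader/optimizer/scheduler wiring can be sketched as follows. BengaliDatasetTrain lives in the dataset module from part 1, so a small random TensorDataset stands in for it here, and a bare Linear stands in for the dispatched model — both are placeholders, not the real objects:

```python
import torch
import torch.nn as nn

# stand-in for BengaliDatasetTrain(folds=TRAINING_FOLDS, ...): a tiny random
# tensor dataset, just to exercise the loader/optimizer/scheduler wiring
images = torch.randn(32, 3, 16, 16)
labels = torch.randint(0, 168, (32,))
train_dataset = torch.utils.data.TensorDataset(images, labels)

train_loader = torch.utils.data.DataLoader(
    dataset=train_dataset,
    batch_size=8,        # TRAIN_BATCH_SIZE in the real script
    shuffle=True,        # fine for train/valid, never for test
    num_workers=0,       # the video uses 4
)

model = nn.Linear(4, 4)  # placeholder; the real model comes from MODEL_DISPATCHER
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# ReduceLROnPlateau steps on a metric once per epoch; mode="max" because we
# feed it the recall score (some other schedulers step after every batch)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="max", patience=5, factor=0.3
)

# a score that stops improving triggers a reduction after `patience` epochs
for _ in range(8):
    scheduler.step(0.5)
```

After the plateau, the learning rate drops from 1e-4 to 3e-5 (factor 0.3), which you can confirm from `optimizer.param_groups[0]["lr"]`.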
one, my model becomes nn.DataParallel(...). I didn't import that, so: import torch.nn as nn — nn has DataParallel. Okay, great. Now there are a lot of different things you could do here that I'm not going to show; you can also have early stopping. I'm not going to implement early stopping in this video, but you can easily do that — search for "early stopping pytorch" and you'll find a class you can use; I'm also using a modification of it all the time, but we're not going to do it in this video. I'll probably make another video just about early stopping and how to build a better one. So now we're ready to train the model: for each epoch, call a train function, get a validation score from an evaluate function, and the ReduceLROnPlateau scheduler steps on that score: scheduler.step(val_score). You can also save the model here if you want, and then you also need the usual if __name__ == "__main__": main() at the bottom. Now we have to write the train and evaluate loops — but let's also save the model first: torch.save(model.state_dict(), ...) with an f-string for the name: BASE_MODEL, then "_fold", then VALIDATION_FOLDS[0] — it's always a sequence with a single element — then ".bin". So it saves with the base model name and the fold; you get one file per validation fold across all the splits. Now we implement the training loop — the interesting part. It shouldn't take much time, and there are a lot of cool things you can do there, but we're not going to do many of them in this video. So you have the training loop: what do you expect from a
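The epoch loop and checkpointing can be sketched like this, with stub train/evaluate functions standing in for the real ones written later in the script, a placeholder model, and a temp directory in place of the working directory:

```python
import os
import tempfile

import torch
import torch.nn as nn

BASE_MODEL = "resnet34"
VALIDATION_FOLDS = (4,)      # always a single-fold tuple
model = nn.Linear(4, 4)      # placeholder for the real dispatched model


def train(dataset, data_loader, model, optimizer):
    pass                     # stub; the real loop comes later in train.py


def evaluate(dataset, data_loader, model):
    return 0.5               # stub; pretend validation score


optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="max", patience=5, factor=0.3
)

# optional multi-GPU: wrap the model so batches are split across devices
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

out_dir = tempfile.mkdtemp()  # the real script saves in the working directory
for epoch in range(2):        # EPOCHS in the real script
    train(None, None, model, optimizer)
    val_score = evaluate(None, None, model)
    scheduler.step(val_score)  # ReduceLROnPlateau steps on the epoch metric
    # one checkpoint per validation fold, e.g. resnet34_fold4.bin
    torch.save(model.state_dict(),
               os.path.join(out_dir, f"{BASE_MODEL}_fold{VALIDATION_FOLDS[0]}.bin"))
```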
train function? It gets the dataset and data loader, the model, and the optimizer. You don't want the scheduler here — well, sometimes you do: if you use a different scheduler that steps per batch, you probably want it inside this loop rather than after it. And I think that's all you need. First, set the model to train mode. Then: for batch_index, d in enumerate(data_loader) — I can also wrap this in tqdm to see progress, with total being len(dataset) divided by data_loader.batch_size, cast to int so tqdm doesn't complain. Now we extract all the information from the dataset items we created earlier: we have the image, the grapheme root, the vowel and the consonant — d["image"], d["grapheme_root"], d["vowel_diacritic"], d["consonant_diacritic"]. Now we put everything on our CUDA device: image = image.to(DEVICE, dtype=torch.float), and the same for the others with dtype=torch.long — grapheme root, vowel, consonant; just copy-paste. Cool, we got that. The next step is nothing but training the model: outputs = model(image). One thing you have to make sure of: in your model, the first output is the grapheme, then the vowel, then the consonant — maintain that order, otherwise it's going to be difficult and it will give you the wrong loss. So you have the outputs, and the loss is some loss function — to keep it simple I'll make it the mean — that takes your outputs and targets. We don't have targets yet, so we make them: targets = (grapheme_root, vowel_diacritic, consonant_diacritic) — the order is important. Then we do loss.backward() and step the optimizer, and we also have to zero the gradients — we should have done that before the forward pass, so optimizer.zero_grad() goes up there. So we're done here; loss_fn isn't implemented yet, but that's all you need — you don't want to return anything, and it's done. Now the same thing — well, not quite the same — for evaluate. I'll copy-paste and rename it evaluate: it doesn't expect the optimizer, the model is set to eval mode, and everything else stays the same except we drop the backward/step/zero_grad parts. I'll say final_loss = 0 and a counter starting at zero; each batch does counter += 1 and final_loss += loss, and I return final_loss / counter — the mean loss, the average loss. I've done this in a very simple way; it's better if you compute the actual metric of this competition after each epoch, because that gives you a much better view of what's happening with your model. Now we write the loss function, which is also quite easy — we're not going to do anything complicated. We have outputs and targets: o1, o2, o3 are the three outputs — grapheme, vowel, consonant — and t1, t2, t3 are the targets. l1 is just the cross-entropy loss of o1 and t1; I copy-paste for 2 and 3, and then return (l1 + l2 + l3) / 3, the average loss. You can also do a weighted averaging of these losses — that probably gives better results, but I wouldn't know. We've got outputs and targets, and everything looks good to me, so we can try to train the model and see that we have a lot of errors. To train the model I'm going to create another script, a bash script
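The three functions just described — loss_fn, train and evaluate — can be sketched together. DEVICE is set to "cpu" here only so the sketch runs anywhere; the tqdm wrapper is dropped for brevity, and the `torch.no_grad()` in evaluate is my addition, not in the video:

```python
import torch
import torch.nn as nn

DEVICE = "cpu"  # "cuda" in the real script


def loss_fn(outputs, targets):
    # average the three cross-entropy losses; a weighted average may work better
    o1, o2, o3 = outputs
    t1, t2, t3 = targets
    l1 = nn.CrossEntropyLoss()(o1, t1)
    l2 = nn.CrossEntropyLoss()(o2, t2)
    l3 = nn.CrossEntropyLoss()(o3, t3)
    return (l1 + l2 + l3) / 3


def train(dataset, data_loader, model, optimizer):
    model.train()
    for bi, d in enumerate(data_loader):
        image = d["image"].to(DEVICE, dtype=torch.float)
        grapheme_root = d["grapheme_root"].to(DEVICE, dtype=torch.long)
        vowel_diacritic = d["vowel_diacritic"].to(DEVICE, dtype=torch.long)
        consonant_diacritic = d["consonant_diacritic"].to(DEVICE, dtype=torch.long)

        optimizer.zero_grad()              # zero gradients before the forward pass
        outputs = model(image)             # order: grapheme, vowel, consonant
        targets = (grapheme_root, vowel_diacritic, consonant_diacritic)
        loss = loss_fn(outputs, targets)
        loss.backward()
        optimizer.step()


def evaluate(dataset, data_loader, model):
    model.eval()
    final_loss, counter = 0.0, 0
    with torch.no_grad():                  # my addition: skip gradient bookkeeping
        for bi, d in enumerate(data_loader):
            image = d["image"].to(DEVICE, dtype=torch.float)
            targets = tuple(
                d[k].to(DEVICE, dtype=torch.long)
                for k in ("grapheme_root", "vowel_diacritic", "consonant_diacritic")
            )
            final_loss += loss_fn(model(image), targets)
            counter += 1
    return final_loss / counter            # mean loss; the real macro recall is better
```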
called run.sh. In this script I define everything that's needed to train the model. Why do I do that? Because it becomes much easier when you want to train multiple models. So: export CUDA_VISIBLE_DEVICES=0,1 — that says which devices are visible; I have two GPUs, so I'm using 0 and 1, and if you have multiple GPUs but want only some selected ones, you can define that here. Then export IMG_HEIGHT=137 and IMG_WIDTH=236 — all the things you defined in the training script. Export EPOCHS=50; export TRAIN_BATCH_SIZE=64 — I don't know if it will fit, but let's start with that — and a TEST_BATCH_SIZE that can be something small. Export MODEL_MEAN, which is a string, and export MODEL_STD, also a string — we'll come to those. Before training, you also need to specify which folds you want to train on: export TRAINING_FOLDS as 0, 1, 2, 3 and export VALIDATION_FOLDS as just 4 — if your training folds are 0 through 3, validation should be 4. Then: python train.py. Now copy-paste that block four more times, so you don't have to come back to the script and rerun it: 0,1,2,4 with validation 3; 0,1,3,4 with 2; 0,2,3,4 with 1; and 1,2,3,4 with 0. It will train all the different folds and save a model for each. We're still missing the mean and std values, so I'll copy-paste them in — these were the means and these were the standard deviations. Okay, now let's hope it works. I start a terminal and run sh run.sh — "can't open run.sh", because it's inside src, so cd there and let's
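A sketch of the run.sh launcher described above — a config fragment, so not runnable without train.py. The mean/std values are the standard ImageNet statistics (assumed; the video copies its own values in), and the train_folds.csv path is assumed from part 1:

```shell
#!/bin/sh
# which GPUs train.py may see
export CUDA_VISIBLE_DEVICES=0,1

export IMG_HEIGHT=137
export IMG_WIDTH=236
export EPOCHS=50
export TRAIN_BATCH_SIZE=64       # the video later bumps this to 256
export TEST_BATCH_SIZE=8
# strings that train.py parses back with ast.literal_eval;
# ImageNet statistics assumed here
export MODEL_MEAN="(0.485, 0.456, 0.406)"
export MODEL_STD="(0.229, 0.224, 0.225)"
export BASE_MODEL="resnet34"
export TRAINING_FOLDS_CSV="../input/train_folds.csv"   # assumed path from part 1

# one run per validation fold: train on four folds, validate on the fifth
export TRAINING_FOLDS="(0, 1, 2, 3)"; export VALIDATION_FOLDS="(4,)"; python train.py
export TRAINING_FOLDS="(0, 1, 2, 4)"; export VALIDATION_FOLDS="(3,)"; python train.py
export TRAINING_FOLDS="(0, 1, 3, 4)"; export VALIDATION_FOLDS="(2,)"; python train.py
export TRAINING_FOLDS="(0, 2, 3, 4)"; export VALIDATION_FOLDS="(1,)"; python train.py
export TRAINING_FOLDS="(1, 2, 3, 4)"; export VALIDATION_FOLDS="(0,)"; python train.py
```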
see if it runs. It doesn't, for a reason: we didn't define BASE_MODEL, so define BASE_MODEL=resnet34 — the only model we've implemented so far — and try running again. Now we made another mistake in train.py: when defining the model, not everything is an int, so let's fix that and recheck everything else. We also didn't define this one: export TRAINING_FOLDS_CSV pointing at input/train_folds.csv — the file we created earlier — and run again. Looking at the end of the output: it should be __init__, not init — fix and rerun. Now it's downloading the pretrained model; that happens only once, and if you already have it in your cache it uses that. Next error: "__init__ missing one required positional argument: module" — it seems DataParallel didn't work, because we didn't supply it with an argument; it wants the model. Run again: obviously, we created train and evaluate but never supplied their arguments — dataset, data loader, model, optimizer — so pass train_dataset, train_loader, model, optimizer, and the same for evaluate with valid_dataset, valid_loader and model. I really hope this works this time. "tqdm is not defined" — I should be using a better IDE — from tqdm import tqdm. Now it went to the model part and gave an error: tensors on device 1 do not equal device 0. Let me get rid of DataParallel and just use GPU 0 — this should work. Yes, okay, this is working fine; it will take some time to train. We can also see how much memory is being used, and it's quite low — only about 3.5 GB, no worries. So I'm going to stop it and increase the train batch size to 128 — I think it can fit — and run it again. It's working fine and using about the same amount of memory.
Not a lot more, anyway — let's see if I can fit more, so bump the train batch size to 256 for ResNet-34. At first I misread the number, so I had to check again: now I'm using almost 7.8 GB of memory, and that looks fine; it should complete in five minutes. The next part is writing a kernel for the inference, so we need to wait for the model — let's look at how to create the inference kernel in the next part. But before moving on, I want to explain why I did it this way. The reason is pretty simple: you have the model mean and standard deviation, the name of the model, and all the parameters you can adjust in one single script, and all you have to do now is create different model definitions. You can just say, okay, I want resnet50, point the dispatcher at models.ResNet50, and then go to the models file and implement ResNet50 there — or choose any model you want to implement. It becomes very simple to train multiple models just by changing one line of code in run.sh, and that's pretty useful: this script does all the training for you. And if you're familiar with the Kaggle API, at the end you can move the .bin files that were created to a different folder, create a Kaggle dataset from them, and automate everything from model training to uploading your models as Kaggle datasets. Once that's done, the next step is obviously manual: you go and create a kernel and use that dataset to generate predictions on the test set. We're going to look at that in the next part. So, for inference, you go to this competition on Kaggle — till now I assume you have trained your models; it's going to take a while to train, by the way — I choose GPU and create a Python notebook. Let's do that. Now we need a lot of things here, so first of all you
need pretrainedmodels — we're using it and it's not installed in the kernel; import pretrainedmodels fails — so you need to add it. I've shared a pretrained-models dataset, so you can just add that one and wait until it's attached. Okay, it's done, so now we can import it: the package lives under ./input in a pretrained-models folder, and you do sys.path.insert with that path. That just adds it to your Python path so it can find pretrainedmodels, and it will work. Then you need some other imports: glob, torch, albumentations, pandas, numpy as np, tqdm for the cool progress bar, PIL's Image, joblib — it's just better to import everything beforehand — plus torch.nn as nn and from torch.nn import functional as F. Okay, then we define a few things. TEST_BATCH_SIZE: you need to make sure you don't use a very large batch size — if it doesn't fit in memory you will have problems, the model will crash and you won't know why, so better use a smaller batch size, or make it as big as possible until it doesn't fail. You need MODEL_MEAN and MODEL_STD — I just copied them over — and IMG_HEIGHT, which must be the same as what you used during training, so 137; IMG_WIDTH is 236, and DEVICE is "cuda". So we've created all the inputs and we have all the variables, and now we need our model. There's a much better way of doing this — I've explained in another video how you can create a much cleaner file for inferencing — but here I'm just going to do it fast: I copy the model class and paste it here, and you don't need to do anything else. Do you need anything else? Yes, of course: the dataset. Go to the dataset file — here is your training dataset class — copy it and paste it here, and now a lot of things are not available, obviously, so you have to do it in a different way. You can't use joblib to load individual images, you don't have the fold values, and we don't need folds at all: instead of folds, the test class needs a data frame, plus image height, width, mean and standard deviation. It's quite custom: self.image_ids = df.image_id.values — so what's happening here is we're loading the whole data frame, passing it in — and self.img_arr holds the full pixel values, so it's loading quite a lot of images into memory, but that's actually faster for us. We keep the image height, image width, mean and standard deviation — that's fine — and __len__ stays the same. Then you change this part: the image becomes self.img_arr[item, :], and everything else looks fine. Let's also pass the image ID, self.image_ids[item], and add it to the returned dict. So you've got the dataset class — a few modifications, but not so much — and now you can start generating your model predictions. You load the model the same way: ResNet34(pretrained=False) — you don't want a pretrained model here. And if you used nn.DataParallel during training, you have to do the DataParallel thing again; otherwise you'd have to change a lot of things and it's going to throw an error, so it's just better to wrap the model in DataParallel before loading the state dict. Then load_state_dict with torch.load and the path to the model — a single model, under ../input. For that you have to create a new dataset
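The modified test dataset class can be sketched like this. The video resizes and normalizes with albumentations; to keep the sketch dependency-free I do plain NumPy normalization, and the grayscale-to-3-channel stack is an assumption (the pretrained ResNet expects 3 channels):

```python
import numpy as np
import pandas as pd
import torch


class BengaliDatasetTest:
    """Serves images straight from a parquet-style dataframe:
    one row per image, first column image_id, remaining columns pixel values."""

    def __init__(self, df, img_height, img_width, mean, std):
        self.image_ids = df.image_id.values
        self.img_arr = df.iloc[:, 1:].values   # whole frame in memory: faster
        self.img_height = img_height
        self.img_width = img_width
        self.mean = mean
        self.std = std

    def __len__(self):
        return len(self.image_ids)

    def __getitem__(self, item):
        image = self.img_arr[item, :].reshape(self.img_height, self.img_width)
        image = image / 255.0
        # 1 channel -> 3 channels for the ImageNet-pretrained backbone (assumed)
        image = np.stack([image] * 3).astype(np.float32)
        # plain per-channel normalization in place of the albumentations pipeline
        image = (image - np.array(self.mean).reshape(3, 1, 1)) \
            / np.array(self.std).reshape(3, 1, 1)
        return {
            "image": torch.tensor(image, dtype=torch.float),
            "image_id": self.image_ids[item],
        }
```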
with all the folds in it. Here I'm assuming you've created a Kaggle dataset containing all the models, so you have, say, resnet34_fold0.bin; you load it like this, and then you put the model in eval mode. Once that's done, you can start generating predictions. There are four different files for the test set — what you have in public is very small, but the private set has the same format, and your code is going to run on private. So: for file_idx in range(4), read the file with pd.read_parquet — the files live inside the competition folder, so "../input/bengaliai-cv19" (that's the name of the folder) plus "test_image_data_{file_idx}.parquet". Now that you've read the data, create the dataset: dataset = BengaliDatasetTest(df=df, img_height=IMG_HEIGHT, img_width=IMG_WIDTH, mean=MODEL_MEAN, std=MODEL_STD). Then the data loader: torch.utils.data.DataLoader, passing dataset=dataset, batch_size=TEST_BATCH_SIZE, shuffle=False — very important — and num_workers. You could actually get away with shuffle=True because we're carrying the IDs along, but it's just better to keep it False. Now we load the images and get the predictions: for bi, d in enumerate(data_loader), your image is d["image"] and your ID is d["image_id"]; move the image to the CUDA device with dtype=torch.float, and your grapheme, vowel and consonant outputs come from model(image). Now you have to do some kind of argmax on these — classes always start from 0, and in this specific competition you have to predict the class, not the probabilities. So we've got g, v and c, plus the image IDs, and what we can do is build the predictions: predictions is a list, and I write a small loop — for i, img_id in enumerate(img_id), because it's also a list — appending the ID together with g[i], v[i] and c[i]. This can be done in a much better way, but I'm just being lazy right now. So you've got the predictions, and in the next step, the submission file: pd.DataFrame(predictions, columns=["row_id", "target"]) — that's the format of our predictions. Wait, I did something wrong — let's take a look at the sample submission, because I don't remember: you have test_0_consonant_diacritic, test_0_grapheme_root and so on, so you have to do it a little differently. You append f"{img_id}_grapheme_root" with g[i], then the vowel diacritic with v, and the consonant with c. Now everything is in the proper format — row_id and target — and you do sub.to_csv("submission.csv", index=False); the name is important. One of the steps I've skipped here is uploading the models: you can just go to Add Data, go to Upload, select which files you want — all the .bin files — name the dataset, and everything will be fine. Once that's done, commit. It won't take long on the very small public test set, but on private it can take a while, depending on how many models you have. One more thing I've skipped in this inference file is using all the folds: you need to modify this, put it inside a loop over the number of folds, and run the same inference
loop again and again. You also need to select the right model each time, so instead of fold 0 you use folds 1, 2, 3 and 4. When you have all of that, you take the probabilities for g, v and c from all the folds, average them, and only then do the argmax. I'll leave that as an exercise. But yeah, it's a pretty cool competition, and if you follow what I've shown you here you might as well get a very good score — something between 0.96 and 0.98, I would say. To get that kind of score you need to think about a few things — I'm not giving away any secret sauce at all — but you should try different kinds of models; with this code it's very easy to train different kinds of models, and you can also ensemble them. And with that I would like to stop. I think that's all for this competition; if you have any doubts or questions, or if I've missed something, please let me know, and let me know how I can improve. See you next time — thank you very much.
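The fold-averaging left as an exercise can be sketched in a few lines — averaging per-fold class probabilities before the argmax; random softmax outputs stand in for the five fold models' predictions:

```python
import torch


def average_folds(prob_list):
    """Average per-fold class probabilities, then take the class argmax."""
    return torch.stack(prob_list).mean(dim=0).argmax(dim=1)


# softmax outputs of the grapheme-root head from five fold models (faked here);
# the same averaging applies to the vowel and consonant heads
fold_probs = [torch.softmax(torch.randn(4, 168), dim=1) for _ in range(5)]
final_grapheme = average_folds(fold_probs)
```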
Info
Channel: Abhishek Thakur
Views: 6,474
Rating: 4.9593906 out of 5
Keywords: machine learning, deep learning, artificial intelligence, kaggle, abhishek thakur, classification, pytorch
Id: uZalt-weQMM
Length: 73min 7sec (4387 seconds)
Published: Sat Feb 22 2020