Train a custom object detection model with YOLOv5

Video Statistics and Information

Captions
Hello everyone, how are you guys? I don't usually do live coding because a lot of things can go wrong, and they have in the past, but let's do it this time. Today I'm going to show you how you can train your own YOLO model for object detection. Yeah, many things can go wrong; let's see.

I'm not going into the details of what YOLO is. Some of you already know; if you don't, go and read the paper, and maybe I'll go into the details of the paper in another video in the future, but not this one. Right now the latest YOLO version is v4, and there's a lot of controversy about whether v5 is really the next version, so you can also take a look at the GitHub issue "Is v5 a scam?". What we are going to do is train the YOLO model. And sorry, I'm not going to use TensorFlow here; yeah, it might be "fake", but PyTorch is what's being used. I'm going to use the yolov5 repository from Ultralytics. I don't know if it's fake or real, but whatever, it works and it gets our job done, and that's all I want right now. So the first step is to clone this repo: you can click here and clone it using git clone.

Now we need a problem. What kind of problem should we tackle? Let's say we want to tackle the ongoing Kaggle competition called Global Wheat Detection. You should not use YOLOv5 for this competition, because it has a GPL license and the competition requires an MIT license, so you won't be able to win anything with it and it's not allowed. I'm just using the dataset for this video, as a tutorial, so I can show you how good the model is.

What do we have in this dataset? You have different pictures of the crop, and you have to find bounding boxes. Let's look at some of the pictures from the test set. Awesome, the pictures are not loading... okay, so this is one sample picture. What you have to do is draw bounding boxes around the crops in these images; that's all you have to do. To do that, the Ultralytics yolov5 repo provides you with a training script, and that's what we are going to use. Maybe in one of the future videos I'll show you how to do it in plain PyTorch, without this repo, but this is what we are going to use right now.

YOLO expects the data in a certain format. That is very important, and it's really the only thing you have to care about. When you download the data you see that you have a train.csv file. In train.csv you have an image ID, and for every image ID (which can be repeated multiple times) you have a bounding box, which is the value of x, y, width and height, if I'm not wrong; you can also check it in the data description. You are also given the width and height of the image. One image can have multiple bounding boxes; look at the image, you have multiple bounding boxes for one image, and that's what you have to predict. So to convert this data to the YOLO format, first of all you need to download the data. I have already downloaded it, so let's see what it looks like. If I open train.csv, you see that the first column is the image ID, then you have the image width and height, then you have a bounding box, which is a string, and then you have the source column.
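For context, this is roughly what a first peek at that CSV looks like; a minimal sketch, assuming the Kaggle data has been downloaded to the input folder used in the video (use your own path):

```python
import pandas as pd

# assumption: the Global Wheat Detection files were downloaded here
DATA_PATH = "/home/abhishek/workspace/wheat/input"

df = pd.read_csv(f"{DATA_PATH}/train.csv")
print(df.columns.tolist())  # image_id, width, height, bbox, source
print(df.bbox.iloc[0])      # each bbox is a string like "[x, y, w, h]"
```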
For this video I'm just going to ignore the source column; you might need it for building a proper cross-validation setup, but here I'm going to ignore it. Now what we have to do is go and write a little bit of code. I have cloned the repository, so here it is, and inside it I'm going to create a file called munge_data.py; that's what I'm going to call it. Where is my input located? My input is at /home/abhishek/workspace/wheat/input. I've cloned the repository and I'm creating munge_data.py inside it, but you can create it anywhere you want. And yeah, you need annotations if you are training a new model.

Now I'm going to write some code. First of all I will import pandas, and then let's write the main function. If you want, you can also code along, although I guess you don't have the dataset downloaded right now, so you can't code along, but anyway. Shubham is asking what the object in the image is: it's the crops, like this one, the big ones that you can see, and similarly in the other images that don't load for me. Yeah, wheat heads. Thank you.

First let's define a data path, i.e. where the data is located. All my data is in my home folder under workspace/wheat/input; that's where I have everything I downloaded from the Kaggle dataset. (And yeah, YOLOv5, or any kind of YOLO, is not valid for winning this competition or even legally taking part with it.) I'm also going to import os, and now let's read the data: os.path.join(data_path, "train.csv") passed to pd.read_csv will read it. Now, I showed you that the bounding boxes here are strings, so they need to be converted to lists, and you can do that quite easily using the ast module. I'll import it, and then I'll say df.bbox, the bounding box column, is df.bbox.apply(ast.literal_eval). This converts those stringified lists into proper lists.

This is pretty easy to see, so I can show you quickly: conda activate ml, then import ast, and let's say your "list" is actually a string, "[1, 2, 3, 4]". When you do type(l) it's a string, but when you do ast.literal_eval(l) it becomes a list. You can see that l is a string but ast.literal_eval(l) is a list; that's why we use it. So now we have converted the column to lists.

Next, we group by the image_id column, take the bbox column, and apply list, so for every image ID I get a list of lists. Then let's reset the index and drop it. If you want, I can print this for you, so let's do that: print the data frame, and I hope this works; sometimes a lot of things don't work for me. First I activate the ml environment and then I run python munge_data.py. It reads the CSV file, and now see what you get. Since we dropped the index... actually, instead of dropping the index I can just name the new column, so I say name="bboxes", as in bounding boxes, and let's try that. Okay, so what you see now is an image_id column, which holds the IDs of the different images, and a bboxes column with all the bounding boxes of each image, which is literally a list of lists.
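A rough sketch of this first part of munge_data.py, under the same path assumption as before (the column names are the ones from train.csv described above):

```python
import ast
import os

import pandas as pd

# assumption: the Kaggle data lives here, as in the video
DATA_PATH = "/home/abhishek/workspace/wheat/input"

if __name__ == "__main__":
    df = pd.read_csv(os.path.join(DATA_PATH, "train.csv"))

    # bbox is a stringified list like "[x, y, w, h]";
    # ast.literal_eval turns it back into a real Python list
    df.bbox = df.bbox.apply(ast.literal_eval)

    # one image can have many boxes, so collect them into a
    # list of lists per image_id
    df = df.groupby("image_id")["bbox"].apply(list).reset_index(name="bboxes")
    print(df.head())
```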
You can also convert it to a numpy array if you want. Okay, but we need to do a few more things. First and foremost, we have to create some kind of validation set, so from sklearn I import model_selection. I'm not doing anything fancy here, just a simple split: you get df_train and df_valid from model_selection.train_test_split, where the input is df, test_size (the size of the validation set) is 10%, random_state is 42, and let's also shuffle. Now we have all of that, and we reset the index: df_train.reset_index(drop=True), and the same for df_valid.

So now we have everything, but we still have not formatted our data properly, and for YOLO you need to format your data properly: you need to create some folders and lay the data out in a specific structure. I'm going to create a new directory and call it wheat_data, and inside wheat_data I create a directory called images and another directory called labels. (Someone points out that YOLOv5 is not compliant with the competition rules; yes, you are right, so you should not use it there, but it's good to learn.) Inside images I create a new folder called train and another called validation, and I do the same thing inside labels: mkdir train and validation. So when you look at wheat_data, this is how it looks: a folder called images with train and validation inside it, and a folder called labels, also with train and validation inside it. (Yeah, as I said before, you should not use this for the competition.) You can also create these folders using Python.

How YOLO expects the data is: inside images/train you should have the training images (JPEGs or whatever image format you have), and inside images/validation the validation images (let me fix this thing). Inside labels you have text files, one per image with the same name as the image, and each training label line has the format: class, then x center, then y center, then width and height. That's how the data should be formatted: text files inside labels, and the corresponding train and validation images inside images, with matching names.

Now we go back to munge_data.py and create a new function called process_data. We have already created the training and validation sets. Let's also define an output path; the output path is /home/abhishek/workspace/yolov5/wheat_data, the wheat_data folder we just created. process_data should take a data frame, let's call it data, and a data_type, which can be train or validation; let's say by default it's train. Now we are going to iterate through all the rows of the data frame, and we can use tqdm just to monitor the progress, so let's import that. So we have tqdm around data.iterrows(), and you can say the total number of rows is len(data). That's done, and now we go through each and every row of the data.
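Before getting into process_data itself, here is a sketch of the split and folder creation described above. It continues from the previous snippet (it needs the grouped df from there), and the output path is an assumption, so point it at wherever you created wheat_data:

```python
import os

from sklearn import model_selection

# assumption: the YOLO-format data will be written here
OUTPUT_PATH = "/home/abhishek/workspace/yolov5/wheat_data"

# df is the grouped data frame (image_id, bboxes) from the previous snippet
df_train, df_valid = model_selection.train_test_split(
    df,
    test_size=0.1,
    random_state=42,
    shuffle=True,
)
df_train = df_train.reset_index(drop=True)
df_valid = df_valid.reset_index(drop=True)

# the folder layout YOLOv5 expects: images and labels, each with a
# train and a validation split, and every image gets a label .txt
# file with the same name
for folder in ("images", "labels"):
    for split in ("train", "validation"):
        os.makedirs(os.path.join(OUTPUT_PATH, folder, split), exist_ok=True)
```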
We say image_name is row.image_id, that's the name of the image, and the bounding boxes are row.bboxes. So you've got the image name and the bounding boxes; now create a new, empty list called yolo_data for every row. Then go through each bounding box in the bounding boxes: you get the value of x from bbox[0], and bbox[1], bbox[2] and bbox[3] give you y, width and height. So you've got everything. Now x_center will be x plus width divided by 2, and y_center will be y plus height divided by 2.

How YOLO expects these centers to be formatted is that you have to divide them by the corresponding image width and height, and in this dataset that's easy for us because all the images are 1024 x 1024. So I divide x_center by 1024 in place, the same for y_center, and then width and height as well. Now you append everything to yolo_data as a list. The first value you append is the class of the bounding box; in this particular problem we have only one class, so the first value is 0, and then come x_center, y_center, width and height. Then you convert yolo_data to a numpy array (we also need to import numpy, I think), and it's all going to be floats anyway, since we divided by 1024, the size of the image. That's how YOLO expects the data to be formatted.

Now we can save this with np.savetxt. I use an f-string together with os.path.join: this is where the output will be saved, so it's the output_path, then labels/ followed by the data_type (train or validation), and then the image name with a .txt extension; remember, inside wheat_data you have images and labels. You are saving yolo_data, which holds the different bounding boxes with the class (here the class is only 0), and I'm also formatting each column, so the class is %d and the others are %f.

We also need to copy the image to the images folder, which I can do with shutil.copyfile, so let's import shutil. You copy from origin to destination, obviously: the origin is os.path.join of the data path and the f-string train/{image_name}.jpg, because the source images are inside the train folder, and the destination is the output path, then images/ followed by the data_type, either train or validation. And that's it, this should create the data for us.

Let's try to run it and see if it works: python munge_data.py. Okay, I have an invalid syntax error; I'm missing a comma. Okay, now again a syntax problem; I forgot to close a bracket here. I should really use pylint or something like that. Okay... but I didn't tell it to do anything, so it won't do any work; I have to call this process_data function. So after we have created df_train and df_valid, we can call the process_data function.
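Putting that together, a sketch of process_data as described, reusing the DATA_PATH and OUTPUT_PATH assumptions from the earlier snippets:

```python
import os
import shutil

import numpy as np
from tqdm import tqdm


def process_data(data, data_type="train"):
    # data has one row per image: image_id plus a list of [x, y, w, h] boxes;
    # data_type is either "train" or "validation"
    for _, row in tqdm(data.iterrows(), total=len(data)):
        image_name = row.image_id
        bounding_boxes = row.bboxes
        yolo_data = []
        for bbox in bounding_boxes:
            x, y, w, h = bbox
            # YOLO wants the box centre, normalized by the image size;
            # every image in this dataset is 1024 x 1024
            x_center = (x + w / 2) / 1024
            y_center = (y + h / 2) / 1024
            w = w / 1024
            h = h / 1024
            # class id comes first; there is only one class here, so 0
            yolo_data.append([0, x_center, y_center, w, h])
        yolo_data = np.array(yolo_data, dtype=np.float64)

        # one .txt label file per image, same name as the image
        np.savetxt(
            os.path.join(OUTPUT_PATH, f"labels/{data_type}/{image_name}.txt"),
            yolo_data,
            fmt=["%d", "%f", "%f", "%f", "%f"],
        )
        # copy the jpg into the matching images folder
        shutil.copyfile(
            os.path.join(DATA_PATH, f"train/{image_name}.jpg"),
            os.path.join(OUTPUT_PATH, f"images/{data_type}/{image_name}.jpg"),
        )
```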
We call the process_data function with df_train and df_valid: for the validation frame the data_type will be validation, and for the other one it's train, obviously. And there we go again; let's see what happens now. Okay, it's working: around 3,000 images and 338 images. It seems like it's done, and we can obviously go to wheat_data and check. We have all the images here, which is good, and if we go to validation we have some images there too, which is fine. Then we go to labels and pick any file; so for this particular image, this is the file that was created: the normalized x center, the normalized y center (again, everything is normalized), and you have the label, 0. If your dataset has multiple labels, they will be 0, 1, 2, 3 and so on, and you have to specify them in a config file.

That's the next thing: we have to create a config file, so let's create one, and I'll call it wheat.yaml. This will become our config file, and inside it you have to specify a few things: where the training data is, which is inside wheat_data/images/train, and the validation data, which is inside wheat_data/images/validation (I think something weird is happening and the editor is stuck). Then you specify the number of classes, which here is one, and the class names; you can just write wheat, or wheat head, whatever you want. That's all.

Now we can go and train the model. Let me go over to this other terminal, where I can show you the YAML file; this is what it looks like. What else do we have? Inside the models folder you will find the different YOLO model configs: large, medium, small. If you want, you can change them and play around with the model architecture, but we are going to use them as they are.

Now you just have to run the training command, and that's all, really. You do python train.py and specify the image size, which in our case is 1024, the batch size, the number of epochs you want to train for, the data, which in our case is wheat.yaml, the config you want to use (we'll just use YOLOv5 small), and the name of the model, wm for wheat model. Then your training will start, if everything is fine. Let's see: it prints the model architecture, which looks good, it's scanning the images, which also looks fine to me, and for now everything looks okay; it seems like it has started training the model. It's going to take a while. On my machine I'm using two RTX 2080 Tis, and I think it's using only one... let's see how many it's using... now it's using both GPUs, but only a little, so I could probably increase the batch size and it would be a bit faster; right now it's taking about one and a half minutes.

I have already trained one model before, so I can show you what inference looks like. You can also start TensorBoard here; the log directory is the runs folder, where it creates all the models, and you can visualize everything there. You see, I just started this experiment, but I have an experiment from before, and there you can see the mAP went to around 0.94, and you can also take a look at the validation loss.
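For reference, the wheat.yaml data config described above might look roughly like this. The train/val/nc/names keys follow the YOLOv5 data-config convention, and the paths are an assumption (they should point at your wheat_data folder, relative to where you run the training script):

```yaml
# wheat.yaml: data config for YOLOv5
train: wheat_data/images/train
val: wheat_data/images/validation

# number of classes
nc: 1

# class names
names: ["wheat"]
```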
The GIoU loss is decreasing, so yeah, it looks good; I mean, I didn't train it for very long. One more thing to remember is that you can start from an already trained model and fine-tune it: just specify the weights parameter. The weights are provided in a Google Drive folder; you can find the link to it in the repo, and maybe I'll put it in the description box. If you provide the pretrained COCO weights, your model will probably perform a little bit better.

Let me copy a trained model and show you some inference. What I'm going to do is copy the wheat model from a previous run: you go inside its weights folder, where you have the best weights, and I'm just going to copy them here. With the best weights from a previously trained model I can do inference: you just run python detect.py and specify the source, which can be a single image, a folder, or anything you want (I'm going to run it on all the test images, and we don't have a lot of them), and you specify the weights, which is the path to your weights file. Okay, inference is done, and now we can take a look at what has been generated. Let's see if I can open it from here... output... yeah, okay, looks good. You can also specify another parameter for the detection threshold. You can see it's detecting the different crop heads, and that's what we had to do.

This is a very easy way of using YOLO, and I think you should try it. In one of the next videos, maybe, I'm going to show you how to do it using PyTorch, just using the weights in Python. If you have any questions you can ask me now, or you can also message me later; this is where I end today's tutorial. It's also easy to grab the bounding boxes themselves; you just need to look at the source code a little bit and you'll be able to do that yourself.

Let's see if I have anything else I want to share today... well, not really. It seems like my VS Code is also getting stuck, maybe because of the huge number of files. There are a lot of things you can do to improve on this model: for example, if a box is inside another box, you have to decide which one you should use. And this YOLO model is GPL, so you cannot use it in this competition, but you can obviously use it for other purposes. The model takes a little while to train; I have only trained the small model, and see how good the results are. I haven't trained for long, I think around nine epochs, but I used the pretrained weights, and the results look quite good.

How to deploy it? I will talk about that in another video, otherwise this one is going to get quite long. Thank you very much for joining. Obviously you can use Colab. How can one detect objects in a video? Well, that's basically what I just showed you. Can you use this model for detecting signatures? You can try; it might work. Detectron2 is better than YOLO but it's slow? Okay, I'm not sure, I still have to use it. So that's where I'm going to end the video today. Thank you everyone for joining, I hope you liked this tutorial, and if you did, don't forget to click the like button, subscribe, and share with your friends. I will come up with a few more videos about YOLOv5.
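Before wrapping up, here is roughly what the training and inference commands from this walkthrough look like. The flag names are the ones the YOLOv5 repo used around this time, so double-check them against python train.py --help and python detect.py --help in your copy of the repo; the batch size, epoch count, threshold and the bracketed paths are placeholders:

```bash
# train the small YOLOv5 model on the wheat data, starting from the
# pretrained COCO weights
python train.py --img 1024 --batch 8 --epochs 10 \
    --data wheat.yaml --cfg models/yolov5s.yaml \
    --weights yolov5s.pt --name wm

# run inference on a folder of test images with the best checkpoint
python detect.py --source <path/to/test/images> \
    --weights <path/to/best/weights.pt> --conf-thres 0.4
```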
Some of you have asked about deployment, so I will talk about deployment, and some of you have asked about real-time detection; that will also be covered in the deployment video, where I'll probably try to use just PyTorch and see how it works. So thank you very much, have fun, and see you next time. Bye.
Info
Channel: Abhishek Thakur
Views: 39,814
Rating: 4.9427609 out of 5
Id: NU9Xr_NYslo
Length: 38min 42sec (2322 seconds)
Published: Tue Jul 14 2020