How to Train YOLOv8.1 on Custom Dataset with Oriented Bounding Boxes

Captions
So in this video I'm going to show you how we can train a custom YOLOv8 model, specifically the new YOLOv8.1, where we can do oriented bounding boxes. In the previous videos I showed you how to set it up; I basically went through the documentation and we used the code snippet from the Ultralytics documentation. In this video we're going to grab a custom dataset. We're going to have some cups, you're probably familiar with that dataset that I've created myself, and I'm going to show you how we can use Roboflow to annotate our images and also how we can export them. Roboflow lets us export directly, and I've created a custom script that formats the annotations into the correct format for the new YOLOv8.1 oriented bounding box models. So let's jump straight into it: we're going to cover how to annotate images, export them, convert them to the correct format, how you can train your own custom model, and then we're going to run inference on a live video stream to see the results of our oriented bounding boxes.

Let's now jump straight into Roboflow. This is the platform that we're going to use to annotate our images. I already have an annotated dataset; you can use that directly or you can use your own custom dataset. I'm basically going to show you the whole flow. First of all we can see the results; these are some of the test-set predictions. I'm just going to zoom out a bit and run another example, so you can see that I've tried to train a model here on Roboflow. Basically, you can create a new project on Roboflow, upload your images, and then do the annotation. There's one trick with Roboflow to be able to do oriented bounding boxes: when we want to create oriented bounding boxes for YOLOv8, for now, we actually need to go in and choose instance segmentation instead of object detection, because for object detection Roboflow currently doesn't support oriented bounding boxes. If you're using an annotation tool that already supports oriented bounding boxes you can just use that, but here we need to do instance segmentation and then export it in another format. I'm fairly sure Roboflow will change this over time, but for now we choose instance segmentation, then set up the project name, what you're going to detect, and so on. I have already done that, so I have this cup segmentation dataset, and we can take a look at some of the images. Once you have uploaded your dataset to Roboflow you get into this annotation tool, where you can add new classes and annotate the images, but to do this with oriented bounding boxes we need the polygon tool, and then you just draw a mask around the object that you want to detect with the new YOLOv8.1 oriented bounding boxes. The traditional bounding box is just an axis-aligned rectangle or square, so it's fixed and doesn't rotate with the object. Let's say we want to detect our cup here, and let's take an example: right now we draw a bounding box around the cup like this, but if we actually want the best detections possible, then we should probably have our bounding box, in this example, rotated a bit to the left so that it fits the object better.
That is really useful in a lot of different applications and projects. Let me go back and delete this annotation, but this is the tool that you can use; it's really easy to work with, and we'll see in just a second how we can export the dataset. So you just go in and label the images using the polygon tool: you draw a polygon around your object, going around the edges like this. I've already done that for all the images. Once you have labeled your whole dataset, you're ready to go.

Let's go back again. We have different versions, and we can also generate a new dataset. With Roboflow we can do preprocessing: we can specify the size of the images we want to use, so right now we're just going to use 640x640 and auto-orient. There are other preprocessing options as well, but don't play around too much with those; just using the raw images is often the best thing to do. We can also apply augmentation steps. If you only have a few examples, say 50 images or so, you can apply data augmentation on top of your dataset to generate more data that varies a bit from sample to sample. We can do horizontal flips, vertical flips, and zoom, where we crop into an image to get a zooming effect, and this helps our model generalize better. We can also apply rotation and a lot of other augmentation options in here, but I've just chosen these three, which is what I normally use. We hit continue, and then we can create a new version, specifying whether we want to 2x, 3x, 4x, or 5x our dataset; right now we have 5x, I think. I've already generated a couple of versions here. If you go inside the versions, this is the model that I showed you at the start, and we can try it out directly in Roboflow. Once we have labeled our dataset we go inside our versions again. This is still instance segmentation, as you're familiar with, even though we're trying to do object detection with oriented bounding boxes; that's just how we have to do it with Roboflow for now. You can use whatever other annotation or labeling tool is out there; if it supports oriented bounding boxes you can probably export directly in that format, though you may still need to do some conversion, because YOLOv8 uses its own specific format for training these models.

Now let's export our dataset. We can select the format we want: previously, for standard object detection, you'd pick YOLOv5, YOLOv7, or YOLOv8, but now we need to choose this "YOLOv5 Oriented Bounding Boxes" format. This is not quite the format we can use directly, so I'm going to create a custom formatting script that takes the exported text files and converts them into the specific format that Ultralytics uses. You can either download a zip folder to your computer or show the downloadable code that we can copy-paste directly into our Google Colab notebook. This is what you will get; you will also need to get the API key for your Roboflow account. You can also use the terminal or a raw URL, but we just copy this and throw it directly into our Google Colab notebook. Then we can export our dataset, run our training script, export our model, and use it in our own custom Python scripts for our own applications and projects.
So let's jump straight into the Google Colab notebook and I'll walk you through it. First of all, make sure that you are actually connected to a GPU: go up to Runtime, Change runtime type, and specify that you want to use a GPU; we want that when we're training these deep learning models. Let's run nvidia-smi here to start with, just to get some information about the graphics card we're using and to make sure that we're actually utilizing the GPU for training. Here we can see that we have a Tesla T4, the number of gigabytes of GPU RAM, and all that information, which is always useful to check. Then we import os to set up a home directory that we can use later on, because we're going to do a lot of work with our directories when we do the mapping and the conversion formatting of our dataset. Then we install YOLOv8: we just pip install ultralytics, it downloads the newest version, and that is pretty much everything we need to install; it takes care of all of it. We just need some additional modules for displaying our images after training is done, and we also need to be able to import YOLO as the architecture. So let's start the installation, and then we import the different modules and dependencies that we need later on.

Ultralytics also has some CLI basics that we can use. We can call this yolo command with different tasks: we have detect, classify, and segment, but now we also have obb for oriented bounding boxes, together with the modes train, predict, val, and export. We're going to use these different modes throughout the notebook. You can also specify the model: instead of specifying, for example, a segmentation model, we can now specify an OBB model, and that is also how you can use already pretrained models and do inference on videos, YouTube videos, webcam streams, images, and so on. I already have a video about that, so definitely check that out, but we're also going to do it later in this video, so stay tuned: we're going to run it on a live video stream.

Now we have the setup complete, so we can go down and set up our custom dataset. First of all we make a directory for the dataset and cd into it; this is how we can call shell commands from the notebook. We import roboflow, and this is just the code snippet that I copy-pasted from Roboflow. Right now I'm not showing my API key, but I'm just going to copy-paste that in, run this block of code, and then we can see the directory over here on the left and check the file structure. We can see that it is downloading roboflow, setting that up, and exporting our dataset, which should be done in just a second; right now it's just setting up some OpenCV stuff. You're going to see it loading the Roboflow workspace, loading the project, and then saving this dataset to our environment here in the Google Colab notebook. Let's go inside our directory: we have the dataset that we created, cup-segmentation-2, and we can see our test, train, and validation splits, plus the data.yaml file down here, which we also need. If we go inside each of these individual folders we have all the raw images.
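For reference, here is a minimal sketch of those setup and download cells. The API key, workspace and project slugs, and version number are placeholders for your own Roboflow project, not values from the video:

```python
# Run once in Colab (shell cell): !pip install ultralytics roboflow

import os
from roboflow import Roboflow

HOME = os.getcwd()  # home directory used for dataset paths later on

# Code snippet as copied from Roboflow's export dialog (placeholders here)
rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace("your-workspace").project("cup-segmentation")
dataset = project.version(2).download("yolov5-obb")  # "YOLOv5 Oriented Bounding Boxes"

print(dataset.location)  # folder containing the train/valid/test splits and data.yaml
```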
Then we have our annotations in that YOLOv5 oriented bounding box export format. To start with, I'm just going to rename these folders to "images" and "labels"; I like to do that, and we need to do it for our validation split as well. Now we have our images and our labels. If we go inside and look at one of the examples, we can see it over here on the left: this is the format that Roboflow uses when exporting, and we need to convert it to the correct Ultralytics format for training the YOLOv8 model. We have x1 y1, x2 y2, and so on, all the way up to x4 and y4; these are the corners of our oriented bounding box, and then there is also the label. This is not the correct format for training the Ultralytics model.

If we go inside the documentation for oriented bounding boxes, this is the documentation you'll go through. They have some examples, they list the different models (nano, small, medium, large, and extra-large), and they show how to train them, which is what we're going to use in just a second. But here we have the dataset format, which is really important: the OBB dataset format is detailed in the dataset guide, so let's go into that one, the oriented bounding box datasets overview. This is the supported OBB dataset format, the YOLO OBB format: we have a class index, where before we had a class label, in a different order, and then all the corners of our bounding boxes. YOLOv8 also uses coordinates normalized to the image dimensions, so we need to take care of that as well, and I've created a block of code where we can do all of that conversion. We can see the different representations here, so this is how it should ideally look afterwards to be able to train these models. DOTA-v2 is the dataset that they pretrained their models on, and they also ship a script for converting from the DOTA dataset to the YOLO oriented bounding box format. Since this is a custom format from Ultralytics, we need to create our own script that converts and formats our files. Even if you're using another annotation or labeling tool, you'll most likely need to export and write your own custom script to do this formatting of your data. If you're using the DOTA format you can just call their function directly, but here we're going to write a custom script that creates this format instead of the one I showed you a moment ago.
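To make the difference concrete, here is a sketch of what a single label line looks like before and after the conversion; the numbers and class name are made up for illustration:

```
# Roboflow "YOLOv5 Oriented Bounding Boxes" export: pixel corners,
# then the class name, then a difficulty flag (10 fields in total)
418 269 533 281 512 408 401 396 hand-painted-cup 0

# Ultralytics YOLO OBB training format: class index first,
# then the same corners normalized to the image dimensions
1 0.653125 0.420313 0.832813 0.439063 0.800000 0.637500 0.626563 0.618750
```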
So here in the guide we see the target format. Let's close that, and let's go down and take a look at the script I've written. First of all, we can't really have any spaces in the class names, so let's make sure we don't have any spaces in our data. Right now I'm going to go inside our labels: we have this dash in the names here, so we need to make sure we use it consistently throughout the whole dataset, and also in our YAML file. We have our data.yaml, and here we actually need to add this dash, so let's do that to start with. Again, you can follow this through exactly step by step, and this is how you can train the models. Then I also like to set each of these paths individually, so let's copy the path for train, validation, and test, because sometimes it doesn't know that it's inside /content in a Google Colab notebook. Here we have our train split, our validation split, which will be the valid folder, and then our test images as well. This is basically the data file that specifies the config for our training; after the edits it looks roughly like the sketch below.
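A sketch of the edited data.yaml; the three class names are the cups mentioned in the video, while nc and the exact spellings are assumptions:

```yaml
train: /content/datasets/cup-segmentation-2/train/images
val: /content/datasets/cup-segmentation-2/valid/images
test: /content/datasets/cup-segmentation-2/test/images

nc: 3
names: ['halloween-cup', 'hand-painted-cup', 'white-cup']
```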
Now we have that, we have all the labels, and we have the images, so I'm going to create this block of code that does the conversion. It doesn't really do too much: it takes the class names, which you can take directly from the data.yaml file you got from Roboflow (just make sure you add the dashes), and then we specify the image width and the image height of our images, because we're going to normalize all the values to be able to train our models, or else it won't work. Then we specify the folder path; you could also write a loop running through all the folders, but right now I'm just going to do it individually for each of the three splits. We also need our class indices: we have class labels, but we actually need class indices, so we create a dictionary with the class names mapped to specific values, which is the format that Ultralytics needs for YOLOv8 oriented bounding boxes. We specify the folder path, /content/datasets/cup-segmentation-2/valid/labels; we only have to convert the labels, since the images are just raw images and we don't have to do anything with them. We have a function for normalizing the values: we take the bounding box coordinate and divide it by the maximum value, which is the image dimension. Then we have a loop running through all the files in the folder we specified, taking each text file and its file path, opening the file, and reading in all the lines. For each of the lines we have an inner loop doing the conversion and saving the result into a new text file in the correct format; that's pretty much everything it's doing. If the length of the parts is equal to 10, which means we have a valid annotation in our labels, we extract the label, which is the second-to-last element, and the coordinates, which are the first eight elements, and then we do the mapping from our dictionary from class label to class index. We check that the class index is valid, normalize our coordinates, and then create a new line: we take the class index first, since that is the first value we want in the YOLO OBB format, and then join all the normalized coordinates, x1 y1 all the way up to x4 y4, with spaces in between. We append that to the new-lines list, then open a file and write out each of the lines we've just extracted and converted, and then we have everything; a sketch of that conversion block follows below.

Let's now try to run this, starting with our validation split. We run the block of code, and it ran. Let's go down and check our validation labels: if we take the first file, we now have our class index and then all the values for our oriented bounding box right after it, so this is the correct format, and now we can actually train our model. But first we need to make sure we do this for each of the folders, so test and then train as well. Again, you could write a loop over all of this, but this took around two seconds and writing the loop would take me five, and I'm trying to optimize my time. So let's check the other ones: train is done, and the last one has its labels in the correct format too. Now we have everything: we've annotated our dataset, exported it, and done the formatting, and now we can train it directly with the commands from Ultralytics.
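Here is a minimal sketch of that conversion block, following the steps just described; the class mapping and folder path are assumptions based on the video's dataset:

```python
import os

# Map class names (as they appear in data.yaml) to class indices
class_mapping = {"halloween-cup": 0, "hand-painted-cup": 1, "white-cup": 2}
IMG_W, IMG_H = 640, 640  # export size chosen in Roboflow preprocessing
folder_path = "/content/datasets/cup-segmentation-2/valid/labels"

def normalize(value, max_value):
    """Scale a pixel coordinate into the 0-1 range that YOLO OBB expects."""
    return value / max_value

for filename in os.listdir(folder_path):
    if not filename.endswith(".txt"):
        continue
    file_path = os.path.join(folder_path, filename)
    with open(file_path) as f:
        lines = f.readlines()

    new_lines = []
    for line in lines:
        parts = line.split()
        if len(parts) != 10:  # 8 corner values + class name + difficulty flag
            continue
        coords = [float(p) for p in parts[:8]]
        class_name = parts[-2]  # class label is the second-to-last field
        class_index = class_mapping.get(class_name, -1)
        if class_index == -1:
            continue  # skip labels that are not in the mapping
        # x values are normalized by the width, y values by the height
        # (the export is square here, so both are 640 anyway)
        normalized = [normalize(c, IMG_W if i % 2 == 0 else IMG_H)
                      for i, c in enumerate(coords)]
        new_lines.append(str(class_index) + " " +
                         " ".join(f"{c:.6f}" for c in normalized))

    # Write the converted lines back out (here, overwriting the file in place)
    with open(file_path, "w") as f:
        f.write("\n".join(new_lines) + "\n")
```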
Let's close this and go inside custom training. We have our task, obb, for oriented bounding boxes, we set the mode to train, and we specify what type of model we want to use; again, they have five different models depending on the size. We set data to {dataset.location}/data.yaml, which just maps to the directories with our images and our labels; that's also why I renamed the other folders to "labels", because that's what it's going to look for. We can specify the number of epochs, the image size that we exported our dataset at, and the batch size. I'll just change the batch size to eight, which should be fine with our 640x640 images. Now we can run this block of code and it will start the training: it sets up the whole model, downloads the weights, starts the training loop, and then we can follow the metrics and the progress through all the epochs.
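For reference, the same training cell can be written with the Ultralytics Python API instead of the CLI; the epoch count below is a placeholder, since the video doesn't pin one down:

```python
from ultralytics import YOLO

model = YOLO("yolov8n-obb.pt")  # nano OBB model; s/m/l/x variants also exist
model.train(
    data="/content/datasets/cup-segmentation-2/data.yaml",
    epochs=50,   # placeholder value
    imgsz=640,   # matches the 640x640 Roboflow export
    batch=8,
)
```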
Here we can see that it has loaded 330 images for training and 19 for validation, with zero corrupt labels, so it is actually able to extract all the information correctly. That verifies that our annotations and our formatting to the correct format were done right; otherwise you'd get a ton of warnings here, so definitely make sure you check that. If it looks like this, you're good to go and your model will start training. Even after the first epoch we have a mean average precision at 0.5 (mAP50) of 0.583, which is already pretty good; then again, we're training on cups, which appear in most pretrained models and are a relatively easy task. Already after the second epoch we have a mAP50 of about 0.98, close to one, which is the ideal value, and we can also see the mAP at IoU thresholds from 0.5 to 0.95 in intervals of 0.05. Our model has pretty much converged, so it doesn't matter too much to train for longer; our class loss is still moving, though, as we can see here, so it kind of makes sense to keep going until the class loss actually decreases. Let's just let it run, and then we can take a look at the results afterwards.

Our model is now done training. We can see the validation results down at the bottom: the number of images in our validation set, the inference time, and so on, and we can also see all the different metrics. Again, this looks very good; our model has pretty much converged and is very accurate at predicting the cups in our dataset. If we go over to the left we have our runs directory, now with obb/train, and under weights we can extract the best weights; let's download best.pt so we can use it in our own custom script. We could also take our last model, but in this example they're probably both pretty similar. We can also look at the precision curve, the recall curve, the F1 curve, the confusion matrix, the label statistics, and the results plots, or use these files directly. Looking at the plots: the losses are decreasing over time, the class loss has pretty much converged at the end, the precision and recall metrics are going up, the precision-recall curve looks very nice, the class loss is going down for the validation set as well, and the mean average precision is going up. So we have a really good model, with precision close to one and recall close to one; we probably can't get a much better model than this.

We can take a look at a validation batch of predictions: these are the oriented bounding boxes that we get for our cups, and now we can see how the boxes rotate and orient themselves around our objects, compared to the traditional bounding boxes that we know. This is very useful in different applications: say you have cars in aerial images driving around, now we can follow the cars better with these fitted bounding boxes. It's kind of like segmentation, but fitting a bounding box around the object, and it's easier to label if you have an annotation tool that can do oriented bounding boxes directly, compared to setting up instance segmentation. The validation results look very good, with close to 100% accuracy and confidence scores of 70% to 90%, and here we can see all the different labels: white cup, Halloween cup, hand-painted cup, and so on. These are the cups that we're going to use.

Now let's go down and validate the custom model. We can run that directly; we just change the task to obb, set the mode to val, and specify the directory of our best trained weights, and it will use that model for validation. We can do the same with predict: we set the mode to predict, change the task to obb as well, point it at the train run (just "train", because we only have one training run), set a confidence score, set the source to the test images, and choose whether to save the outputs into a directory so we can take a look. Validation gives the same numbers we got right after training; then we do inference with our custom model, running through all of our test images so we can visualize everything, with a for loop over all the predictions. I usually like to do that before exporting the model and using it in my own custom scripts, just to make sure the model actually generalizes and we get good enough predictions, or whether we want to iterate back, fine-tune the dataset, retrain the model, and tune some parameters before actually using it in our own applications and projects. Let's run this over all the test images, and let me zoom out. Now we can see that these oriented bounding boxes look very good; we're detecting basically every single cup. Here we can see that, because we actually annotated the handle as well, it fits the oriented bounding box around the full segmentation mask, which is why we get a somewhat weird bounding box in this example; again, you can use your own dataset with whatever labels you want. We also get a double label here, both hand-painted cup and white cup, which is a bit odd, but in general the model is pretty good across all our test images. So we downloaded the model, took the best weights, and I've imported them into my new directory for a custom project that we're going to set up.
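A sketch of those validation and prediction cells in Python-API form, assuming the default runs/ layout from training:

```python
from ultralytics import YOLO

model = YOLO("runs/obb/train/weights/best.pt")  # best checkpoint from training

# Re-run validation on the split defined in data.yaml
metrics = model.val(data="/content/datasets/cup-segmentation-2/data.yaml")

# Run inference over the test images and save the annotated outputs
results = model.predict(
    source="/content/datasets/cup-segmentation-2/test/images",
    conf=0.25,  # confidence threshold; placeholder value
    save=True,  # writes the visualized predictions under runs/obb/predict*
)
```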
This is all the code that we need from Ultralytics: we import YOLO and create an instance of our YOLO model, and I've renamed my best model to cup-obb.pt for oriented bounding boxes. This is how you can use a custom model; you can also specify one of the pretrained YOLO models and use that directly, and I have videos about that here on the channel. You only need these three lines of code: import YOLO, set up an instance of a model, and run predictions. We can throw in a YouTube URL, a URL to whatever image on the internet, a video, a webcam stream, a NumPy or PIL image, and so on, and there are a number of different parameters and arguments that you can find in the Ultralytics documentation, but right now I'm just going to show the results and also save them for later use. Everything we need is this file, our custom trained model, and a video to run through it. This is the video I've created: it's not the exact same background that I trained on, and the camera is probably a bit closer than in the dataset, but the model still generalizes pretty well. Let's open up a terminal and run the program and see the results: we open a new terminal with a command prompt, conda activate our environment, and run python obb.py. It creates an instance of our custom trained model and then runs the video; I'll drag it over here. We have the hand-painted cup, and it also detects the Halloween cup here as well. Again, this isn't the exact same background, distance, and camera it was trained on, which is probably why we get these false predictions for some of them, but for the hand-painted cup it does a really good extraction: it draws the bounding box and rotates it fairly nicely. We're not doing any tracking at all here; if I put the cup down on the table, the results get a bit better. It does detect something on the left with my AirPods, thinking that's a hand-painted cup, but that is very far from the dataset we annotated, which is also a bit old by now. When I move the camera further away it detects the cup, but if it gets too close it no longer recognizes it as the hand-painted cup. We can also see that the oriented bounding box is following the object around, which is very nice. We can also go inside our runs directory, under the predict folder, and the results are saved there as well, so you can use that video later on, because we set the save argument to true. So this is very awesome: we have the oriented bounding boxes rotating around, and you can extract all the results. Again, this is just an arbitrary video; we could fine-tune our dataset and make it more general, with different backgrounds and distances, because sometimes the camera was simply too close to the cup for it to be detected, since the model was trained from further away and with a different background and orientation, as we saw in the validation results when we trained the model.
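Putting the pieces together, the standalone script run from the terminal (the video calls it obb.py, with the checkpoint renamed to cup-obb.pt) might look roughly like this; the video filename is a placeholder:

```python
from ultralytics import YOLO

model = YOLO("cup-obb.pt")  # custom-trained OBB weights, renamed from best.pt

# source can be a video file, an image, a URL, or a webcam index such as 0
results = model.predict(source="cups.mp4", show=True, save=True)
```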
So that's pretty much everything; we've been through the whole pipeline for training these oriented bounding box models with YOLOv8. I hope you guys have learned a ton. Definitely try it out: I'll throw the Google Colab notebook in the description under the video, so check it out and try it on your own custom dataset. This is really useful for object detection applications where you want bounding boxes that orient and rotate together with your objects, instead of the static, non-rotating bounding boxes that we've been used to with YOLOv8 and a ton of other object detection models out there. I hope to see you in one of the upcoming videos; until then, happy learning.
Info
Channel: Nicolai Nielsen
Views: 6,964
Keywords: yolov8, yolov8 neural network, yolov8 custom object detection, yolov8 object detection, yolov8 tutorial, object detection yolo, object detection python, opencv yolov8, opencv python yolov8, object detection yolov8, opencv, detect objects with yolov8, yolov8 opencv, opencv dnn, deploy yolov8 model, object orientated bounding box, yolov8.1, yolov8 obb, how to train yolov8.1, custom dataset yolov8.1, deploy yolov8.1, train yolov8.1, oriented bounding boxes, train custom yolov8
Id: UG4W3yBTJzM
Length: 25min 46sec (1546 seconds)
Published: Sat Jan 13 2024