YOLOv8: How to Train for Object Detection on a Custom Dataset

Video Statistics and Information

Captions
YOLOv8 is out! [Laughter] The latest installment of the highly influential You Only Look Once algorithm has just been released, and the team behind it claims that it is the new SOTA when it comes to real-time object detection. I guess we just need to wait for Papers with Code to confirm that. However, we internally benchmarked YOLOv8 against YOLOv5 and YOLOv7 using our Roboflow 100 dataset, and the results suggest that the model fine-tunes much faster than its predecessors. So if you want to stay on the bleeding edge of computer vision, sit down, relax, and let me show you how to train a YOLOv8 object detection model on a custom dataset. If you want to learn more about Roboflow 100, click the link in the top right corner or in the description below.

In this video I will take you through the process of training your own YOLOv8 model on a custom dataset. This time we'll focus only on object detection, but no worries: instance segmentation and classification videos are coming quite soon. Maybe by the time you watch this video they are already there; if that's the case, you'll most likely see their cards in the top right corner. If you want to stay up to date with those videos, make sure to like and subscribe — this way you will be the first to know when a new video gets released. I will show you how to train, validate, predict, and deploy the model, and we'll use a football player detection dataset available at Roboflow Universe to do all of that. So if you are here mostly to learn about the training process, feel free to use the chapters in the description to navigate to the part of the video that is most interesting to you.

Apart from talking about the model itself, we will also talk a little bit about the changes that were introduced in the codebase, because let me tell you, it is most likely the biggest engineering jump since the migration from Darknet to PyTorch. No more train.py, no more forking the repository to make it work with any tracker. Apparently, from now on we'll be using CLIs and SDKs, and we'll take a look at that new API in just a second.

YOLOv8 is created by the Ultralytics team, the company behind YOLOv3 and YOLOv5 — two repositories that collectively have almost 45,000 stars on GitHub on the day of recording, so pretty serious contributors to open source if you ask me. Of course, both of those projects had their issues: the lack of a paper, which irritated a large part of the data science community, and the lack of a pip package, which made it really hard to deploy the model and irritated a large part of the engineering community. Apparently both of those issues will be solved in the YOLOv8 project. Time will tell, but in the meantime let's just jump into the code and train some models.

Before we start, just a quick disclaimer: the whole library feels a little bit underbaked, so I fully expect some small API changes to be rolled out in the upcoming weeks or months. The materials that we prepared for the YOLOv8 video series, however, are part of the Roboflow Notebooks repository, and we will update them after any major change gets released by the YOLOv8 team. That's why, if you see any difference in the API between what you see in the video and what you see in the notebook, keep in mind that the video was released right after the launch and the notebooks are updated on a daily basis.

Without further ado, let's jump into the code. Here we are in the notebook. Let's start by confirming that our runtime is GPU-accelerated, and indeed that's the case — to confirm it, let's run the nvidia-smi command. If we indeed have access to the GPU, we should see a similar result after the command is run. The initialization takes a little bit of time, but yes, we have access to a Tesla T4 GPU; we are ready to go. Now let's create a small helper variable called HOME that will allow us to manage paths to datasets and other images more easily, and at this point we are basically ready to install the actual YOLOv8 package. According to the documentation there are two ways to do that: we can either use pip — as I said, this is the first iteration of YOLO that has an official pip package — or we can go the usual route of cloning the repository and installing the dependencies. I myself select the first one, so we are going for pip install ultralytics, but in the notebook you can pick whichever you want; just make sure to comment out the other cell — comment out the pip cell if you go for git clone — so that you don't run both of them in a single environment.
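Put together, the setup cells from this part of the notebook look roughly like the sketch below. It assumes a Colab-style notebook, where lines prefixed with `!` run as shell commands, and uses the current working directory for the HOME helper:

```python
# Confirm the runtime is GPU-accelerated.
!nvidia-smi

# Helper variable for easier management of dataset and image paths.
import os
HOME = os.getcwd()
print(HOME)

# Option 1: install YOLOv8 from the official pip package.
!pip install ultralytics

# Option 2 (instead of the cell above): clone
# https://github.com/ultralytics/ultralytics and install its dependencies.
```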
Now we can simply import YOLO from ultralytics, so let's do that; if we don't see any exceptions, everything works properly. As I always say, the best way to make sure that everything was installed properly is to run inference. We can use a pretrained COCO model to do that, and it's actually a great opportunity for us to get used to the new API that I was talking about in the intro.

Like I said, we have two new ways to interact with the YOLO codebase: the CLI and the SDK. The CLI is quite similar to what we are used to, in the sense that we run it from the command line — from the terminal — but instead of running a Python script directly, we now have a tool that we can run from basically any directory. Previously that was possible, but it was a little bit of a headache. And, like I said, we have the SDK. The SDK is the Pythonic way of using YOLO: typically we will just import YOLO from ultralytics, create an instance of the model, and do something with it.

Let's compare those two APIs using prediction as an example, starting with the CLI. It has the concepts of task and mode, and this is very important, because that's how they compress everything that previously was a separate script into a single command. First we select the task — the task is basically detection, segmentation, or classification, so for this tutorial we select detection — and then the mode, which can be train, val, or predict; for this particular example we'll use predict. The next thing is obviously to load the model. You can do that by providing a path to the weights that you want to use; in this particular example we are using the YOLOv8n model. The rest is quite typical: we provide the confidence and any other hyperparameters that we would like to use, and at the very end the source. I'm using a typical image from my private gallery, which is me and my dog.

So let's try to run that command. As expected, the first thing that happens is downloading the image, then downloading the weights. Now we load the model into memory — that can take a few seconds — and after that it displays what was detected, and the results are saved into the runs directory, exactly as it used to happen with the Python scripts. You can see over here we have runs/detect/predict, and over there we will be able to find our result image. So now I can basically load that image and display the result; the name of the image in the results directory is the same as in the source directory. I'm just pressing Shift+Enter, and we can see the results.

Now we can get exactly the same result using the Pythonic API. We already imported YOLO a few cells earlier; now we can create an instance of the model and run prediction on the same image. Like before, the model is loaded into memory and the results are displayed, and we can compare them: one person, one car, one dog — and over here, one person, one car, one dog.
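As a sketch of the two flavors side by side — the image name is a stand-in for your own file, and exact argument names may shift as the library matures:

```python
# CLI flavor - run as a shell command (here via a notebook cell):
# !yolo task=detect mode=predict model=yolov8n.pt conf=0.25 source="dog.jpeg"
# Results land in runs/detect/predict/.

# SDK flavor - the same prediction, the Pythonic way:
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # pretrained COCO checkpoint
results = model.predict(source="dog.jpeg", conf=0.25)
```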
Cool. Let's switch topics a bit and discuss the dataset that you need in order to train YOLOv8. The latest iteration of YOLO supports exactly the same data format as YOLOv5, so if you already have a dataset that you used previously and you would just like to retrain the model, feel free to use it. But if you don't, let me quickly show you how you can create that dataset using Roboflow.

Let's go to roboflow.com. If you don't have an account, you obviously need to create one; if you already have one, just sign in. Let's create a new project: in the first dropdown we select object detection, and since we will detect football players, let's call the project "football players detection". The license can stay as it is, and let's create a public project. Now, if you have both images and annotations, feel free to drag and drop them over here. I do have annotations, but for the sake of the example I will drag and drop only the images, and we will go through part of the annotation process together. Now you can press the "save and continue" button, which uploads all of those images to the server. Let's use the magic of cinema and just move forward.

After a few minutes — depending on your internet connection and the size of your dataset — you will end up on this screen. From here you can assign labeling tasks. If you work independently, like myself, you have no other choice: you need to assign them to yourself. If you have friends, you can invite them into the workspace and they can help you out with the annotation. Like I said, in my case I just select myself and click "assign images". When the labeling job is created, you get redirected to this screen, and from here you just click on the first image and start labeling.

The annotation process itself is pretty straightforward: you draw bounding boxes around the objects that you would like the model to detect. The work is pretty tedious, but the good news is that you don't need to annotate the whole dataset. You can annotate just part of it, train the model, and use that model to annotate the rest of the images; with every iteration the quality of the predictions should get better and better. We already have tutorials on how to do that, so look for the links in the description below and in the card in the top right corner. When the annotation process is done, you just submit your images for review, and here you can approve or reject the annotations. Right now I have all my images annotated, so I'll just approve them all and add them to the dataset.

Now I'm ready to generate the first version of the dataset. I got redirected to the "Generate New Version" tab, and over here I can apply all sorts of transformations, for example resizing. Now let's add augmentations — I don't want to go too hard, so I'll just add a horizontal flip for the images. I press "continue" and "generate", and after just a few minutes my dataset should be ready. As we already have the dataset, there is nothing left to do other than download it into our environment and start training. Now we can use one of the features that we added specifically for the YOLOv8 launch: we can press that button over here, and it will generate a code snippet that allows us to download our freshly created dataset in YOLOv8 format. We can just copy it and paste it into the notebook.
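The generated snippet looks roughly like the sketch below — a hypothetical example, since your API key, workspace, project name, and version number will all differ (it also assumes the roboflow pip package is installed):

```python
# Hypothetical example of the auto-generated download snippet.
from roboflow import Roboflow

rf = Roboflow(api_key="YOUR_API_KEY")  # keep this key secret
project = rf.workspace("your-workspace").project("football-players-detection")
dataset = project.version(1).download("yolov8")  # YOLOv8-format export
```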
So let's do it right now. I just highlight that text, paste it here, press Shift+Enter, and the download starts immediately. Keep in mind that the code snippet contains your API key; it is not visible in the UI, but when you paste it, it will be there in plain text, so watch out — make sure not to accidentally commit it to your repository, especially a public one. It looks like the download of the dataset has completed, and now the only thing left to do is execute the training itself. Let me just adjust a single hyperparameter and lower the number of epochs from 100 to something around 25 — this is a pretty large dataset, so the training will take about an hour to complete. Shift+Enter, and the training starts.

All right, the training has completed, and now the time has come for us to review the results. The first thing we'll do is take a look at some charts produced by YOLOv8 and saved into the runs directory; they will tell us a little bit more about the training process and the expected accuracy of our model.

First, let's take a look at the confusion matrix. The confusion matrix is a chart that shows us how our model handles different classes. For example, if we talk about the goalkeeper: 74% of the time it is detected correctly and classified as a goalkeeper, but 26% of the time we get the bounding box and the goalkeeper is incorrectly classified as a player. On the other hand, the ball — which is quite a small object and hard for the model to detect — is only detected 26% of the time; the other 74% of the time the ball is there, we are simply not detecting it. All classes apart from the ball are detected correctly most of the time, and given the fact that we only trained for 30 to 40 minutes, I am more than satisfied with the results.

Now let's take a look at the charts displaying the key metrics tracked by YOLOv8 for object detection. Over here I'm usually mostly interested in the box loss and class loss for the train and validation datasets. We see that the behavior is correct: the model is converging, and it's clear that the training could run even longer and we would get even better results — the curves on those charts are still pretty steep, which means we would still be able to get a lot more out of the model. The usual sign that you are training for too long is a very flat part at the end of the chart; it means you are just burning time on your GPU while your metrics are not improving. The last image contains examples of model predictions on a validation batch. Those images are not used for training, so it's always nice to take a look and visually confirm that the model is behaving the way you expect.

Now we can take our freshly trained weights and use them to validate the model. Similarly to before, we will use the CLI to do that; the difference is that this time we are using the val mode, as opposed to the predict or train modes. The validation script uses the part of the dataset that was not used before — the test dataset — and those are the true mAP metrics that we should care about. Finally, let's take our model and just use it for inference. I will use the same test subset of our dataset, but this time I just want to see the results. As usual, the model is loaded into memory, and because we are using a GPU, the inference is pretty fast. Over here we can plot a few of those images to see how the model is performing.
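Putting the three modes together, the cells from this section look roughly like this. It's a sketch, not the exact notebook: the model size (yolov8s) is an assumption, the weights path follows the default runs/ directory layout, and `{dataset.location}` relies on Colab's curly-brace variable interpolation inside shell commands, with `dataset` coming from the download snippet above:

```python
# Train on the custom dataset (epochs lowered from 100 to 25, as in the video).
!yolo task=detect mode=train model=yolov8s.pt data={dataset.location}/data.yaml epochs=25

# Validate the freshly trained weights - these are the mAP metrics to care about.
!yolo task=detect mode=val model={HOME}/runs/detect/train/weights/best.pt data={dataset.location}/data.yaml

# Run inference on the test split of the dataset.
!yolo task=detect mode=predict model={HOME}/runs/detect/train/weights/best.pt conf=0.25 source={dataset.location}/test/images
```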
This particular result looks really good: the referees are detected, the goalkeeper is detected. The only thing that I don't see over here is the ball, but that may be because it's heavily occluded by one of the players. Obviously, you can use YOLOv8 to run inference not only on images but also on videos, so I decided to upload one of the videos that I had on my machine and run the inference on it. The only difference between this command and the previous one is that this time I passed the path not to the images directory but to an MP4 file; the rest is exactly the same. I'm just pressing Shift+Enter, and we need to wait for the inference to complete. As Google Colab is notoriously terrible at displaying videos in the UI, I decided to download the result, and here you have it: a short snippet of what is possible with YOLOv8. Keep in mind that the model was trained for only 30 minutes and it is actually one of the smallest YOLOv8 architectures; if we were optimizing for accuracy, we would most likely select a larger version of YOLO. However, if you take a look at the logs, you will see that inference on a single frame takes between 12 and 13 milliseconds; that means inference at around 80 FPS is well within our reach.

Now, before we wrap up, let me show you one more thing: how to deploy your freshly trained model and use it to infer over the API. All of that magic can happen with just a single line of code. After we trained our model, we just call the dataset version's deploy method, passing the model type and the path to our training directory, hit Shift+Enter, and that will basically upload our weights to the server. After the upload from the Jupyter notebook is completed, we can go into our Roboflow account; next to the version of the dataset that we are using there will be a green checkmark that lets us know the model was successfully uploaded. And the best thing about it is that when I now go into the deploy tab, I can use that model to infer using our hosted API. One of the ways I can do it is by dragging and dropping an image into the browser — let me do that right now — but besides that, I can now infer using the API from my own codebase (we provide plenty of snippets in different languages), or I can even spin up my own inference server, for example on an NVIDIA Jetson.

That's all for today. This video is just part one of the whole series that we plan to release about YOLOv8, so if you are interested in instance segmentation, classification, or in general in comparing YOLOv8 to previous object detection models, stay tuned for our future videos. And obviously, the best choice is to like and subscribe — this way you will get notified about those videos as soon as they are released. My name was Peter, see you next time, bye!
Info
Channel: Roboflow
Views: 177,467
Keywords: yolo, yolov8, object detection, computer vision tutorial, roboflow, ultralytics, yolo object detection, deep learning, yolo object detection tutorial, machine learning, yolo deep learning, object detection classifier, yolo python, object detection deep learning, yolov8 object detection demo, object detection using yolo, object detection using opencv python, yolo object detection pytorch, yolo object detection video
Id: wuZtUMEiKWY
Length: 20min 31sec (1231 seconds)
Published: Tue Jan 10 2023