YOLOv8 custom object detection | Faster inference with OpenVINO

Video Statistics and Information

Captions
Hey guys, welcome to my channel. In this video I'm going to show you how to train your own custom object detection model using YOLOv8. Let's begin.

The first thing you need to install is the Ultralytics library. I already have it installed; just type pip install ultralytics and it should install on your system. I have version 8.0.131; yours could be different.

For this tutorial we'll be using a parking-lot dataset that is available on Roboflow. We'll download it in YOLOv8 format, since this tutorial is based on YOLOv8, and click Continue to get the images. Once you've downloaded and extracted the zip file, you'll find three folders inside: train, test and validation. The dataset contains more than 12,000 images in total, but if you have limited computational power you can use just a part of it. For that I've written a short script to post-process the dataset: we only need training and validation images while training the YOLO model, and I've split them into 5,000 images for training and 300 for validation. You can use as many images as you want, but if you're computationally limited you can cut that number down. This part is optional, so feel free to skip it.

The next part is training our YOLO model, for which we'll write this piece of code. The data argument takes the data.yaml file, which I'll explain shortly. The imgsz argument resizes your input images to this size, which is usually a multiple of 32; you can keep it larger if you have more GPU power.
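As a rough sketch, the training call described here might look like the following. Only the meaning of the data and imgsz arguments comes from the video; the concrete epoch, batch and name values are placeholder assumptions:

```python
# Hypothetical training settings; the values below are placeholders,
# not the ones actually used in the video.
train_args = dict(
    data="data.yaml",   # dataset description file (paths + class names)
    imgsz=640,          # training image size; should be a multiple of 32
    epochs=50,          # how many passes over the dataset (placeholder)
    batch=16,           # images per weight update (placeholder)
    name="park_model",  # folder name the run is saved under (placeholder)
)

# With the ultralytics package installed, training is launched like this:
# from ultralytics import YOLO
# model = YOLO("yolov8n.pt")  # start from a pretrained checkpoint
# model.train(**train_args)
```

Larger imgsz values cost more GPU memory; that is why the video suggests dropping to around 320 on weaker hardware.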
If you have less GPU or CPU power, you can keep it a little smaller, around 320. Then we have epochs and batch size, basic machine-learning terms you probably already know: epochs is how many times you want the model to go through the dataset, and batch size is after how many images you want the model to update its weights. Finally, there is the name under which the trained model will be saved.

Now for the data.yaml file I mentioned earlier. First you define the path: the data.yaml file and the rest of the code live in this park_count folder, so you write two ../ followed by the name of the folder where your code is and the datasets folder where the dataset is stored. Then come the training and validation paths: the train and validation folders sit inside that datasets folder, so you give the paths to their images, and YOLO uses those two paths to reach the training and validation images. Next is the number of classes, which is two, and then the class names: index 0 stands for space-empty and index 1 for space-occupied.

This is our datasets folder; let's go through one of the label text files. The first number is the class index: 1 stands for space-occupied and 0 for space-empty. There is an images folder, a labels folder with the corresponding label files, and the validation folder is organized the same way.

Now we'll run the training file by typing python3 followed by the name of the training file, which automatically starts the training process. I've already trained a model; this might take a few hours depending on the GPU you have.
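Putting that description together, the data.yaml might look roughly like this. The folder layout is an assumption reconstructed from the narration; only the class count, indices and names are stated explicitly in the video:

```yaml
path: ../../park_count/datasets  # two ../, the code folder, then the datasets folder
train: train/images              # training images, relative to path
val: valid/images                # validation images (folder name assumed)
nc: 2                            # number of classes
names: ['space-empty', 'space-occupied']  # index 0 and index 1
```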
With a better GPU it will take less time to train. Once that's done, go to the runs folder. There you'll see multiple folders, because I've tried this several times before; go into the weights folder and you'll find the best.pt file. This is the checkpoint whose weights gave the least loss, and it's the one we'll use for inference.

Next we'll use our model for inference, so first we import some libraries: cv2, numpy and a few functions from the Ultralytics library. I've renamed the model and kept it in this models folder as park_best.pt. We call the YOLO function to load the model, and model.names gives us the list of class names. Then we read the image with the cv2 library and use the model to run predictions on it. We convert the image to RGB format, because PIL uses RGB while cv2 uses BGR. The results object is an array, so we iterate over it: for each element we call the Annotator function and take the element's boxes, which is again an array. Looping over the boxes, box.xyxy[0] holds the coordinates of the box and box.cls holds the class index. We decode the class name from det_model.names and use annotator.box_label to label each box, passing the coordinates and the class name. The colors come from the colors function we imported; with it, each class automatically gets a unique color, so you don't have to assign colors manually.
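The inference loop just described can be sketched as follows. The drawing part needs cv2 and ultralytics installed, so it is shown as comments; the two small pure-Python helpers illustrate the coordinate and class-index handling. The model path, image name and class-name mapping are assumptions based on the video:

```python
def box_to_ints(xyxy):
    """Round float box corners (x1, y1, x2, y2) to integer pixel coords."""
    return [int(round(float(v))) for v in xyxy]

def class_name(cls_id, names):
    """Map a predicted class index to its label, e.g. 0 -> 'space-empty'."""
    return names[int(cls_id)]

# With the libraries installed, the loop looks roughly like this:
# import cv2
# from ultralytics import YOLO
# from ultralytics.utils.plotting import Annotator, colors
#
# model = YOLO("models/park_best.pt")   # trained weights (path assumed)
# img = cv2.imread("parking.jpg")       # cv2 loads images as BGR
# results = model.predict(img)
# for r in results:
#     annotator = Annotator(img)
#     for box in r.boxes:
#         xyxy = box.xyxy[0]            # box corner coordinates
#         c = int(box.cls)              # class index
#         annotator.box_label(xyxy, model.names[c], color=colors(c, True))
# cv2.imshow("annotated", img); cv2.waitKey(0)

names = {0: "space-empty", 1: "space-occupied"}
print(class_name(1, names))  # space-occupied
```

The colors(c, True) call is what gives each class its own color automatically, as mentioned in the video.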
Finally we display the image, and here we have the annotated result. Of course it's not very accurate, because I wasn't able to train it with a large number of images or a high number of epochs due to computational limitations.

Once your model is ready, the next step is to convert it into the OpenVINO intermediate representation format. For that you can refer to my repository, specifically the file with the model-conversion code; it also contains a few helper functions with descriptions you can refer to. Once you've converted the model to OpenVINO format, you can use this piece of code for inference, and as you can see the OpenVINO model works pretty well. It has missed a few empty parking spaces here, because I trained the original YOLO model for fewer epochs due to computational limitations, but if you can make it more accurate it will surely work even better.

I hope you found this video helpful. Don't forget to subscribe. Thank you very much!
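For reference, the ultralytics package also exposes OpenVINO conversion as a one-line export; the video instead uses a separate conversion script from the author's repository, and the paths and output folder name below are assumptions:

```python
export_format = "openvino"  # Ultralytics export target for OpenVINO IR

# With ultralytics installed:
# from ultralytics import YOLO
# model = YOLO("models/park_best.pt")        # trained weights (path assumed)
# model.export(format=export_format)         # writes an OpenVINO IR folder
# ov_model = YOLO("models/park_best_openvino_model/")  # reload the IR
# results = ov_model.predict("parking.jpg")  # inference on the converted model
```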
Info
Channel: ComputerVisionPro
Views: 132
Keywords: computer vision, computervisionpro, opencv, image processing, python, computervision pro, computer vision pro, computer vision tutorial, artificial intelligence, yolo, yolov8, openvino, object detection, inferencing, real time yolo
Id: 6dzMN72Lc8I
Length: 10min 34sec (634 seconds)
Published: Mon Jun 17 2024