Hey guys, welcome to my channel. In this video, I'm going to show you how to train your own custom object detection model using YOLOv8. So let's begin. The first thing you need to install is the Ultralytics library. I already have it installed; just type pip install ultralytics and it should install on your system. I have version 8.0.131, yours may be different. For the sake of this tutorial, we'll be using
a parking-lot dataset, which is available on Roboflow. We'll click here and download it in YOLOv8 format, since this tutorial is based on YOLOv8, and then click on Continue. Once you click Continue, you'll get the download. When you've downloaded the zip file and extracted it, you'll find three folders inside: train, test and validation. The dataset contains more than 12,000 images in total, but if you have limited computation power, you can use just a part of it.
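If you do want to subsample before training, a small helper along these lines can do it. This is a sketch, not the exact script from the video; the function name and the folder layout (images/ and labels/ subfolders, as in a typical Roboflow YOLO export) are assumptions.

```python
import random
import shutil
from pathlib import Path

def subsample(src_dir, dst_dir, n, seed=0):
    """Copy a random subset of n images (and their label files) into dst_dir."""
    images = sorted(Path(src_dir, "images").glob("*.jpg"))
    random.seed(seed)  # fixed seed so the subset is reproducible
    Path(dst_dir, "images").mkdir(parents=True, exist_ok=True)
    Path(dst_dir, "labels").mkdir(parents=True, exist_ok=True)
    for img in random.sample(images, min(n, len(images))):
        shutil.copy(img, Path(dst_dir, "images", img.name))
        label = Path(src_dir, "labels", img.stem + ".txt")
        if label.exists():  # each image should have a matching label file
            shutil.copy(label, Path(dst_dir, "labels", label.name))

# Matching the split used in the video:
# subsample("datasets/train", "datasets/train_small", 5000)
# subsample("datasets/valid", "datasets/valid_small", 300)
```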
For that, I have written this script for preprocessing the dataset. We need training and validation images during YOLO training, and I have split the images as follows: 5,000 for training and 300 for validation. You can use as many images as you want, but if there's a computational limitation, you can cut down on the number. So this part is basically optional, and you can skip it if you want. The next part involves
training our YOLO model, and for that we'll write this piece of code. The data argument takes in the data.yaml file; I'll explain shortly what that file is. The imgsz argument resizes your input images to this particular size, which is usually a multiple of 32. You can keep it larger if you have more GPU power; with less GPU or CPU power, keep it a little smaller, around 320 or so. Then we have epochs and batch size, basic machine-learning terms you probably know: epochs is how many times the model goes through the dataset, and batch size is how many images it processes before updating its weights. And finally, this is the name under which the model
will be saved. Now, this is the data.yaml file I was talking about earlier. First you have to define the path: the data.yaml file and the other code live in this park_count folder, so you write two ../, then the name of the folder where your code is, and then the datasets folder where the dataset is stored. Then we mention the training and validation paths: the train and validation folders sit inside this datasets folder, so we give the paths to their images, and the model uses those two paths to reach the training and validation images. Next we have the number of classes, which is two, and then the class names: index 0 stands for space-empty and index 1 stands for space-occupied.
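Put together, the data.yaml just described might look something like this. Treat it as an illustration: the exact path depends on where your code and datasets folders live, and the folder names follow a typical Roboflow export.

```yaml
path: ../../park_count/datasets   # two ../, then the code folder, then datasets
train: train/images               # training images, relative to path
val: valid/images                 # validation images, relative to path

nc: 2                             # number of classes
names:
  0: space-empty
  1: space-occupied
```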
Now let's open our datasets folder and look through one of the label text files. As you can see, the first number on each line is the class index: 1 stands for space-occupied and 0 stands for space-empty. This is the images folder, these are the corresponding labels, and we similarly have a validation folder as well. So now we'll run the training file: we just type python3 followed by the
name of our training file, and this will automatically start the training process. I've already trained a model; this can take a few hours depending on the GPU you have, and with a better GPU it will train faster. Once that's done, go to the runs folder, then into this folder; you'll see multiple subfolders here because I've run it multiple times before. Go into the weights folder and there you'll find the best.pt file. This is the model whose weights gave the lowest loss.
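For reference, the whole training step we just ran can be sketched like this. The checkpoint, epoch count, batch size, and run name below are assumed values, not necessarily the ones used in the video:

```python
# Training-call arguments as discussed above (values are assumptions).
train_args = dict(
    data="data.yaml",   # dataset description file shown earlier
    imgsz=640,          # input size, a multiple of 32; try ~320 on weaker hardware
    epochs=50,          # how many passes the model makes over the dataset
    batch=16,           # how many images are processed per weight update
    name="park_model",  # run name; best.pt lands under runs/detect/park_model/weights/
)

# Uncomment to launch training (needs `pip install ultralytics`; can take hours):
# from ultralytics import YOLO
# YOLO("yolov8n.pt").train(**train_args)   # fine-tune a small pretrained checkpoint
```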
Okay, so you can use this model for inference, which is what we'll do next. First we import some libraries: here we have cv2, numpy, and a few functions from the Ultralytics library. I have renamed the model and kept it in this models folder, so park_best.pt is the model we'll be using. Here we call the YOLO function to load the model, and model.names gives us the names of the classes. Then we read the image with the cv2 library and use our model to run predictions on it. We also convert the image to RGB format, because PIL uses RGB while cv2 uses BGR. The results returned here
consist of a list, so we iterate over it. Here we create the Annotator, and for each element in results we take its boxes attribute. This is again an array, so we loop over it: box.xyxy[0] holds the coordinates of the box, and box.cls holds the class index. We decode the class name from det_model.names and then use annotator.box_label to label each box, passing it the coordinates, the class name, and a color from the colors function we imported earlier. Since we're using this colors function, each class automatically gets a unique color, so you don't have to assign colors manually.
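The inference and annotation steps just described can be sketched as a single function. The function name, image path, and model path are assumptions, and the import location of the plotting helpers can differ between Ultralytics versions (older releases keep them under ultralytics.yolo.utils.plotting):

```python
def annotate_parking(image_path, model_path="models/park_best.pt"):
    """Run the parking-space model on one image and draw labeled boxes."""
    # Imports live inside the function so the sketch loads without the packages.
    import cv2
    from ultralytics import YOLO
    from ultralytics.utils.plotting import Annotator, colors

    det_model = YOLO(model_path)        # load the trained weights
    names = det_model.names             # class-index -> class-name mapping
    img = cv2.imread(image_path)        # cv2 reads images in BGR order
    results = det_model.predict(img)    # one Results object per input image

    # Convert to RGB before annotating, matching the BGR/RGB note above.
    annotator = Annotator(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    for r in results:
        for box in r.boxes:
            xyxy = box.xyxy[0]          # box corners: x1, y1, x2, y2
            cls_idx = int(box.cls)      # predicted class index
            # colors() hands each class its own color automatically.
            annotator.box_label(xyxy, names[cls_idx], color=colors(cls_idx, True))
    return annotator.result()           # annotated image as a numpy array

# annotated = annotate_parking("parking.jpg")   # hypothetical test image
```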
Then we finally display the image, and here we have the annotated result. Of course, it's not very accurate, because due to computational limitations I wasn't able to train it on a large number of images or for many epochs. So once your model is ready, the next step is to convert it into an OpenVINO intermediate representation format.
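As one way to do the conversion, Ultralytics also ships a built-in exporter that writes an OpenVINO IR (an .xml graph plus .bin weights). This may differ from the conversion code in the repository, so treat it as an alternative sketch:

```python
export_format = "openvino"   # OpenVINO intermediate representation (IR)

# Uncomment to run (needs `pip install ultralytics openvino` plus the trained weights):
# from ultralytics import YOLO
# YOLO("models/park_best.pt").export(format=export_format)
# # The exported .xml/.bin pair is written next to the original weights.
```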
For that, you can refer to this repository of mine; in particular, refer to this file for the model-conversion code. That code also contains a few helper functions, with descriptions you can consult. Once you have converted the model to OpenVINO format, you can use this piece of code for inference, and as you can see, the OpenVINO model works pretty well. It has missed a few empty parking spaces here, because I trained the original YOLO model for fewer epochs due to computational limitations, but if you can make it more accurate, it will surely work even better. So yeah, I hope you found this video helpful, and don't forget to subscribe. Thank you very much!