YOLOv8 is the latest addition to the
YOLO family that Ultralytics has released. Not only can YOLOv8 detect the objects in an
image, it can also segment and classify the objects. Hey there, welcome to LearnOpenCV. In this video
we will execute a notebook and see how to fine-tune the YOLOv8 models. YOLOv8 has 5 models.
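Each of the five variants we discuss next ships as its own pre-trained checkpoint. As a small sketch of the naming scheme (the .pt file names are the standard Ultralytics release names; the chooser function is just our own illustration):

```python
# Standard Ultralytics YOLOv8 checkpoint names, one per model size.
VARIANTS = {
    "nano": "yolov8n.pt",
    "small": "yolov8s.pt",
    "medium": "yolov8m.pt",
    "large": "yolov8l.pt",
    "extra-large": "yolov8x.pt",
}

def pick_checkpoint(variant: str) -> str:
    """Map a human-readable variant name to its checkpoint file name."""
    key = variant.lower()
    if key not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}; choose from {sorted(VARIANTS)}")
    return VARIANTS[key]

print(pick_checkpoint("nano"))  # yolov8n.pt
```

These same file names are what get passed as the model when we train later on.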
It ranges from the Nano model through the Small, Medium, and Large models up to the Extra-large model. All of these models
use a 640 input image size, but as the model size increases, the number of parameters and
the mAP value increase while the inference speed decreases. We can infer that the Nano and Small models are good
for edge or mobile deployment, whereas the Large and Extra-large models are better when accurate
detections matter. The Medium model is a balance between speed and accuracy. To fine-tune these models,
we have created a custom pothole dataset for a use case. We have collected the images from the
Roboflow pothole dataset, which has 665 images, a paper published by Stellenbosch University
in 2015, the road damage detector paper published in 2022, and a few manually annotated YouTube
videos. This is what the dataset looks like. There are two main folders, the Train folder and the Valid
folder. The images in the Train directory are used to train the model whereas the images in the
Valid directory are used to evaluate the model. If you want to know what to be careful about when
preparing a dataset, check out the video "Common pitfalls to avoid in object detection dataset". Now
let's execute this notebook and check how to fine-tune the YOLOv8 models. We will be downloading,
unzipping, and visualizing the pothole dataset. We'll also create a dataset YAML file, which
is used for training. Then we'll install the Ultralytics package to train the YOLOv8 Nano
model. We'll show both methods, the CLI method and the Python API method. Then we'll evaluate
the model, and test and visualize the results. Now let's import some libraries. We'll use these
libraries to download and visualize the dataset. First, we create a datasets directory
and change into it. Now we have two helper functions with us. The first
is used to download the dataset file. It accepts the URL and the name to save the file as. If the
file does not already exist, we use the Requests library to download and save it. Then we simply pass the URL and the
file name and execute this function. Once we have downloaded the dataset, we
unzip it using the zipfile library. Now let's check the number of training and
validation images that we have downloaded. We have 6962 images in our training
set and 271 images in our validation set. Let's go back to our previous folder and continue
from there. Now, to visualize the images in the dataset, we first need to convert the YOLO annotations
to bounding box annotations. Currently the dataset is in YOLO format; that is, the annotations are
in center-x, center-y, width, height format. We use the yolo2bbox utility function to convert
the bounding boxes to xmin, ymin, xmax, ymax. The function plot_box will plot the boxes on the
potholes, but our annotations are in YOLO format, and YOLO annotations are normalized, which
means they are in the range of 0 to 1. So first we need to denormalize each annotation to its
image resolution, that is, its width and height, and then we draw rectangles over the
potholes using the OpenCV function cv2.rectangle. Finally, we use the function plot to
use Matplotlib to plot a few samples. It takes as input the image paths, the label paths, and
the number of samples we'll be plotting. Finally, when we call this plot function, it will
plot a few samples from the dataset for us. Next we'll write a dataset YAML file, which
is needed for training the models. We need to save the paths to the datasets
as well as all the labels in our dataset. You can see we are putting in the train directory, the
validation directory, and our single label, pothole. Then we install the Ultralytics package using pip. Now let's actually train the YOLOv8 Nano
model. We'll first look at the CLI method. Let's set up a few parameters. We
need these parameters to train our model. First, epochs: epochs are training cycles, or the
number of times we iterate over our training data. Then we have batch size: batch size is how many images we train on at a time. If you set this value too low or too high, your model might overfit or underfit. Finally, we set the image size to 1280. Now, if you remember the Ultralytics GitHub page,
all the YOLOv8 models have a 640 input image size, whereas here we are specifying a 1280 input
image size. This is because our dataset has very small objects, so with a higher input
image size we might get a bit better results. So let's go with this configuration. This one-line
command is all we need to fine-tune our YOLOv8 model. Let's check it out in a bit more detail. We
use the yolo command followed by a few switches. First is the task. Now, you must already know that YOLO
can perform many tasks, like object detection, object classification, and object segmentation.
Since we are only concerned with detection, let's go with the detect task. Then, if you are training
or fine-tuning the model, go with train mode; otherwise, we also have validation or prediction.
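Putting the task and mode switches together with the remaining options, the command string can be assembled with a tiny helper. This is purely our own illustration, not Ultralytics code, and the option values below (data file name, epoch count, batch size, run name) are placeholders rather than the exact ones from the video:

```python
def yolo_cli(task="detect", mode="train", **opts):
    """Assemble a YOLOv8-style CLI string from switches.

    The switch names (task, mode, model, imgsz, data, epochs, batch, name)
    follow the command shown in the video; this helper only builds the
    string -- it does not run anything.
    """
    parts = ["yolo", f"task={task}", f"mode={mode}"]
    parts += [f"{key}={value}" for key, value in opts.items()]
    return " ".join(parts)

# Placeholder values for illustration only.
print(yolo_cli(model="yolov8n.pt", imgsz=1280, data="pothole.yaml",
               epochs=5, batch=8, name="yolov8n_pothole"))
```

The same helper also covers the validation and prediction runs later on, by swapping the mode.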
We'll look into those later. Then we need to provide a model; we are using the YOLOv8 Nano pre-trained
model. Then we pass the input image size, the dataset YAML file, the number of epochs, the batch
size, and finally the name. This is the directory name where all our outputs will be saved. Now let's execute
this cell and check what the output looks like. First and foremost, Ultralytics will download the
YOLOv8 Nano model, then it prints the system and environment configuration, and then it prints the
training configuration. These are advanced options; we explain these and a lot more in detail in
our courses Deep Learning with TensorFlow and Keras and Deep Learning with PyTorch, so do check
those out. It then reads the dataset YAML file to get the number of classes. This is the
model architecture. We can skip this because we did not alter the model at all. This model has
225 layers and around 3.01 million parameters. The optimizer that we'll be using is SGD; this
has already been configured by Ultralytics. This library also has many utility functions.
Do you see these warnings? This is because the dataset has a lot of duplicate labels.
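Ultralytics performs this cleanup internally; purely as an illustration of what removing duplicate labels means, here is a hypothetical sketch (not the library's actual code):

```python
def dedupe_labels(lines):
    """Drop exact duplicate YOLO label rows ("class cx cy w h"), keeping order.

    This only mimics, in spirit, the duplicate-label cleanup Ultralytics
    performs; it is not the library's actual implementation.
    """
    seen = set()
    unique = []
    for line in lines:
        row = tuple(line.split())  # compare token-wise, ignoring extra spaces
        if row not in seen:
            seen.add(row)
            unique.append(line)
    return unique

labels = ["0 0.5 0.5 0.2 0.1", "0 0.5 0.5 0.2 0.1", "0 0.3 0.4 0.1 0.1"]
print(dedupe_labels(labels))  # the repeated second row is dropped
```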
The Ultralytics library removes these duplicates, which makes the training better. It also uses
the Albumentations library for augmentations. This is the training log. During training, it outputs
quite a few metrics. Let's check this out first. For each epoch, it specifies the GPU memory used, the box loss, the classification loss, the DFL loss, and the number of instances. Box loss is nothing
but localization loss, whereas classification loss specifies whether the object is classified
correctly or not. DFL loss stands for Distribution Focal Loss; it's a new type of loss
for anchor-free object detectors. The number of instances is the number of labels in the
current batch. The size is the input size to the model; if you remember, it's 1280. This is the
verbose output; it shows the training steps taken. After each epoch is complete, the model will
be evaluated over the validation set. So there are 271 images and 716 instances, and we also get the
box precision, the recall value, and the mAP value. Once the evaluation of the fifth epoch is complete,
Ultralytics will take the best model and evaluate it on the validation set. Finally, all the results
will be saved to this directory. The same command can be converted to the Python API like this. Now to
run evaluation, we'll use the same yolo command. This time, our options will differ. As
usual, our task is going to be detect, but this time our mode is going to be validation
mode, or val. We pass the best model, and we give a new name for the results directory. Finally, we pass
the dataset YAML file, and it outputs the same thing. To run inference on the validation images, we use the
same yolo command, but this time the mode is going to be predict. We pass the best model and the
source image directory. Moreover, we can set show_labels to False because we only have one label
for this dataset. Let's run both these commands. Finally, it saves the output in this directory. To
visualize these images, we have created a helper function, visualize, which takes the results
directory as input and the number of samples to be plotted. It will pick the result images
at random and display them. So that's how you fine-tune YOLOv8 models on a custom dataset.
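As a quick recap of the annotation handling from earlier, the YOLO-to-corner conversion and the denormalization step can be sketched like this (yolo2bbox mirrors the utility described in the video, while denormalize is our own hypothetical helper for the scaling step):

```python
def yolo2bbox(bbox):
    """Convert normalized (cx, cy, w, h) to normalized (xmin, ymin, xmax, ymax)."""
    cx, cy, w, h = bbox
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

def denormalize(box, img_w, img_h):
    """Scale a normalized corner box up to pixel coordinates,
    ready for drawing with e.g. cv2.rectangle."""
    xmin, ymin, xmax, ymax = box
    return (int(xmin * img_w), int(ymin * img_h),
            int(xmax * img_w), int(ymax * img_h))

corners = yolo2bbox((0.5, 0.5, 0.25, 0.5))  # (0.375, 0.25, 0.625, 0.75)
print(denormalize(corners, 1280, 720))       # (480, 180, 800, 540)
```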
If you liked this video, check out our YOLO Masterclass playlist on YouTube, where we explore the workings of other YOLO models. Do comment on what you would like to see next, and don't
forget to subscribe. Thanks for watching!