YOLOv8 is the latest addition to the
YOLO family that Ultralytics has released. Not only can YOLOv8 detect the objects in an
image, it can also segment and classify the objects. Hey there, welcome to LearnOpenCV. In this video
we will execute a notebook and see how to fine-tune the YOLOv8 models. YOLOv8 has 5 models.
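Each of the five variants we discuss next ships as its own pre-trained checkpoint. As a small sketch of the naming scheme (the .pt file names are the standard Ultralytics release names; the chooser function is just our own illustration):

```python
# Standard Ultralytics YOLOv8 checkpoint names, one per model size.
VARIANTS = {
    "nano": "yolov8n.pt",
    "small": "yolov8s.pt",
    "medium": "yolov8m.pt",
    "large": "yolov8l.pt",
    "extra-large": "yolov8x.pt",
}

def pick_checkpoint(variant: str) -> str:
    """Map a human-readable variant name to its checkpoint file name."""
    key = variant.lower()
    if key not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}; choose from {sorted(VARIANTS)}")
    return VARIANTS[key]

print(pick_checkpoint("nano"))  # yolov8n.pt
```

These same file names are what get passed as the model when we train later on.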
It ranges from the Nano model through the Small, Medium, and Large models up to the Extra-large model. All of these models
use a 640 input image size, but as the model size increases, the number of parameters and
the mAP value increase while the inference speed decreases. We can infer that the Nano and Small models are good
for edge or mobile deployment, whereas the Large and Extra-large models are better when accurate
detections matter. The Medium model is a balance between speed and accuracy. To fine-tune these models,
we have created a custom pothole dataset for a use case. We have collected the images from the
Roboflow pothole dataset, which has 665 images, a paper published by Stellenbosch University
in 2015, the road damage detector paper published in 2022, and a few manually annotated YouTube
videos. This is what the dataset looks like. There are two main folders, the Train folder and the Valid
folder. The images in the Train directory are used to train the model whereas the images in the
Valid directory are used to evaluate the model. If you want to know what to be careful about when
preparing a dataset, check out the video "Common pitfalls to avoid in object detection dataset". Now
let's execute this notebook and check how to fine-tune the YOLOv8 models. We will be downloading,
unzipping, and visualizing the pothole dataset. We'll also create a dataset YAML file, which
is used for training. Then we'll install the Ultralytics package to train the YOLOv8 Nano
model. We'll show both methods, the CLI method and the Python API method. Then we'll evaluate
the model, and test and visualize the results. Now let's import some libraries. We'll use these
libraries to download and visualize the dataset. First, we create a datasets directory
and change into it. Now we have two helper functions with us. The first
is used to download the dataset file. It accepts the URL and the name to save the file as. If the
file does not already exist, we use the Requests library to download and save it. Then we simply pass the URL and the
file name and execute this function. Once we have downloaded the dataset, we
unzip it using the zipfile library. Now let's check the number of training and
validation images that we have downloaded. We have 6962 images in our training
set and 271 images in our validation set. Let's go back to our previous folder and continue
from there. Now, to visualize the images in the dataset, we first need to convert the YOLO annotations
to bounding box annotations. Currently the dataset is in YOLO format; that is, the annotations are
in center-x, center-y, width, height format. We use the yolo2bbox utility function to convert
the bounding boxes to xmin, ymin, xmax, ymax. The function plot_box will plot the boxes on the
potholes, but our annotations are in YOLO format, and YOLO annotations are normalized, which
means they are in the range of 0 to 1. So first we need to denormalize each annotation to its
image resolution, that is, its width and height, and then we draw rectangles over the
potholes using the OpenCV function cv2.rectangle. Finally, we use the function plot to
use Matplotlib to plot a few samples. It takes as input the image paths, the label paths, and
the number of samples we'll be plotting. Finally, when we call this plot function, it will
plot a few samples from the dataset for us. Next we'll write a dataset YAML file, which
is needed for training the models. We need to save the paths to the datasets
as well as all the labels in our dataset. You can see we are putting in the train directory, the
validation directory, and our single label, pothole. Then we install the Ultralytics package using pip. Now let's actually train the YOLOv8 Nano
model. We'll first look at the CLI method. Let's set up a few parameters. We
need these parameters to train our model. First, epochs: epochs are training cycles, or the
number of times we iterate over our training data. Then we have batch size: batch size is how many images we train on at a time. If you set this value too low or too high, your model might overfit or underfit. Finally, we set the image size to 1280. Now, if you remember the Ultralytics GitHub page,
all the YOLOv8 models have a 640 input image size, whereas here we are specifying a 1280 input
image size. This is because our dataset has very small objects, so with a higher input
image size we might get a bit better results. So let's go with this configuration. This one-line
command is all we need to fine-tune our YOLOv8 model. Let's check it out in a bit more detail. We
use the yolo command followed by a few switches. First is the task. Now, you must already know that YOLO
can perform many tasks, like object detection, object classification, and object segmentation.
Since we are only concerned with detection, let's go with the detect task. Then, if you are training
or fine-tuning the model, go with train mode; otherwise, we also have validation or prediction.
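Putting the task and mode switches together with the remaining options, the command string can be assembled with a tiny helper. This is purely our own illustration, not Ultralytics code, and the option values below (data file name, epoch count, batch size, run name) are placeholders rather than the exact ones from the video:

```python
def yolo_cli(task="detect", mode="train", **opts):
    """Assemble a YOLOv8-style CLI string from switches.

    The switch names (task, mode, model, imgsz, data, epochs, batch, name)
    follow the command shown in the video; this helper only builds the
    string -- it does not run anything.
    """
    parts = ["yolo", f"task={task}", f"mode={mode}"]
    parts += [f"{key}={value}" for key, value in opts.items()]
    return " ".join(parts)

# Placeholder values for illustration only.
print(yolo_cli(model="yolov8n.pt", imgsz=1280, data="pothole.yaml",
               epochs=5, batch=8, name="yolov8n_pothole"))
```

The same helper also covers the validation and prediction runs later on, by swapping the mode.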
We'll look into those later. Then we need to provide a model; we are using the YOLOv8 Nano pre-trained
model. Then we pass the input image size, the dataset YAML file, the number of epochs, the batch
size, and finally the name. This is the directory name where all our outputs will be saved. Now let's execute
this cell and check what the output looks like. First and foremost, Ultralytics will download the
YOLOv8 Nano model, then it prints the system and environment configuration, and then it prints the
training configuration. These are advanced options; we explain these and a lot more in detail in
our courses Deep Learning with TensorFlow and Keras and Deep Learning with PyTorch, so do check
those out. It then reads the dataset YAML file to get the number of classes. This is the
model architecture. We can skip this because we did not alter the model at all. This model has
225 layers and around 3.01 million parameters. The optimizer that we'll be using is SGD; this
has already been configured by Ultralytics. This library also has many utility functions.
Do you see these warnings? This is because the dataset has a lot of duplicate labels.
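Ultralytics performs this cleanup internally; purely as an illustration of what removing duplicate labels means, here is a hypothetical sketch (not the library's actual code):

```python
def dedupe_labels(lines):
    """Drop exact duplicate YOLO label rows ("class cx cy w h"), keeping order.

    This only mimics, in spirit, the duplicate-label cleanup Ultralytics
    performs; it is not the library's actual implementation.
    """
    seen = set()
    unique = []
    for line in lines:
        row = tuple(line.split())  # compare token-wise, ignoring extra spaces
        if row not in seen:
            seen.add(row)
            unique.append(line)
    return unique

labels = ["0 0.5 0.5 0.2 0.1", "0 0.5 0.5 0.2 0.1", "0 0.3 0.4 0.1 0.1"]
print(dedupe_labels(labels))  # the repeated second row is dropped
```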
The Ultralytics library removes these duplicates, which makes the training better. It also uses
the Albumentations library for augmentations. This is the training log. During training, it outputs
quite a few metrics. Let's check this out first. For each epoch, it specifies the GPU memory used, the box loss, the classification loss, the DFL loss, and the number of instances. Box loss is nothing
but localization loss, whereas classification loss specifies whether the object is classified
correctly or not. DFL loss stands for Distribution Focal Loss; it's a new type of loss
for anchor-free object detectors. The number of instances is the number of labels in the
current batch. The size is the input size to the model; if you remember, it's 1280. This is the
verbose output; it shows the training steps taken. After each epoch is complete, the model will
be evaluated over the validation set. So there are 271 images and 716 instances, and we also get the
box precision, the recall value, and the mAP value. Once the evaluation of the fifth epoch is complete,
Ultralytics will take the best model and evaluate it on the validation set. Finally, all the results
will be saved to this directory. The same command can be converted to the Python API like this. Now to
run evaluation, we'll use the same yolo command. This time, our options will differ. As
usual, our task is going to be detect, but this time our mode is going to be validation
mode, or val. We pass the best model, and we give a new name for the results directory. Finally, we pass
the dataset YAML file, and it outputs the same thing. To run inference on the validation images, we use the
same yolo command, but this time the mode is going to be predict. We pass the best model and the
source image directory. Moreover, we can set show_labels to False because we only have one label
for this dataset. Let's run both these commands. Finally, it saves the output in this directory. To
visualize these images, we have created a helper function, visualize, which takes the results
directory as input and the number of samples to be plotted. It will pick the result images
at random and display them. So that's how you fine-tune YOLOv8 models on a custom dataset.
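As a quick recap of the annotation handling from earlier, the YOLO-to-corner conversion and the denormalization step can be sketched like this (yolo2bbox mirrors the utility described in the video, while denormalize is our own hypothetical helper for the scaling step):

```python
def yolo2bbox(bbox):
    """Convert normalized (cx, cy, w, h) to normalized (xmin, ymin, xmax, ymax)."""
    cx, cy, w, h = bbox
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

def denormalize(box, img_w, img_h):
    """Scale a normalized corner box up to pixel coordinates,
    ready for drawing with e.g. cv2.rectangle."""
    xmin, ymin, xmax, ymax = box
    return (int(xmin * img_w), int(ymin * img_h),
            int(xmax * img_w), int(ymax * img_h))

corners = yolo2bbox((0.5, 0.5, 0.25, 0.5))  # (0.375, 0.25, 0.625, 0.75)
print(denormalize(corners, 1280, 720))       # (480, 180, 800, 540)
```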
If you liked this video, check out our YOLO Masterclass playlist on YouTube, where we explore the workings of other YOLO models. Do comment on what you would like to see next, and don't
forget to subscribe. Thanks for watching!