Unlocking Animal Pose Estimation with YOLOv8: Fine-tuning for Dogs

Video Statistics and Information

Captions
YOLOv8 can seemingly perform any task, be it classification, detection, segmentation, tracking, or pose. Previously we trained detection and segmentation models on a custom dataset (links to those videos in the description), and today we'll see how to train the YOLOv8 models on a custom pose dataset. Hey there, welcome to LearnOpenCV.

We will be using the Stanford Dogs dataset for our training. It consists of 12,538 images, split into 6,773 train, 4,062 validation, and 1,703 test images. The keypoint annotations cover 24 keypoints of the dog body: three for each leg (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11), two for the tail (12 and 13), two for each ear (14, 15, 18, and 19), one for the nose (16), one for the jaw (17), one for each eye (20 and 21), one for the withers (22), and one for the throat (23). The dataset also contains the bounding-box coordinates of the dog.

That being said, the dataset is riddled with anomalies: if there are multiple dogs, only single-instance annotations are given; sometimes the bounding box and the keypoints are annotated for two different object instances; and some keypoints are incorrectly annotated. Check out the blog post to learn how to handle the annotation anomalies (link in description).

Here is a sample dataset annotation, which is provided in JSON format. Each entry is a dictionary containing the image and annotation information. The image information is the image path, image width, and height. Then come the bounding-box coordinates, a Boolean flag stating whether there are multiple dogs in the image, and the keypoint data, which has x, y coordinates and a visibility flag for each keypoint (0 for not visible and 1 for visible). The annotation also has a segmentation mask in RLE format, but we'll skip it.

Now, YOLOv8 does not take JSON input; it expects data in YOLO format. That means each image has a corresponding text file of the same name with a .txt extension, each row in the text file corresponds to one object instance in the image, and each row contains the object class index, the object center coordinates xc and yc, the object width and height, and the keypoint coordinates p1x, p1y with the visibility flag.

Now let's take this dataset to Google Colab and train the YOLOv8 models. To get the code and follow along with me, open the blog post link from the description, then click on the Download Code banner, fill in your details, and hit Enter. While you're here, open and fill in this Google Form as well; it will give you access to the annotations. You'll receive an email with a GitHub link; open the link and click on the Download Code button.

Let's get started. We connect the notebook and check which GPU we are getting; I do hope I get an A100 GPU this time. After the GPU gets initialized, upload the annotations and install Ultralytics. Here are the import commands, and then we have a function to download the dataset files and extract the tarball. This is the images URL and this is the annotations URL; we'll call the function twice, passing each URL in turn. After extracting all the files, this is what the directory structure should look like: the images folder will have a dogs folder, inside which all the dog images will be present.

Now we load the train and test image indices, which we will use for training and validation, and restructure our dataset so that it will be accepted by YOLOv8. Here we collect all the image paths and create folders for them, then we get the train and validation indices and simply copy the indexed images into the newly created directories.

But we're not done yet. Next, we need to convert the annotations to the YOLO format. The create-YOLO-boxes-and-keypoints function modifies the visibility flag, normalizes and converts the bounding box, and normalizes the keypoint coordinates; it returns the bounding box and the keypoints. With the next function we create the .txt files: it calls the function defined above and finally writes out the YOLO text files. There, now we have a dataset in YOLO format.

Next, the data-visualization functions. The CSV file that we downloaded earlier maps the keypoints and gives us the color for each point. Now we need to draw the keypoints and the bounding boxes on the image, and we write two functions for this: draw_landmarks and draw_boxes. These are basic OpenCV functions and Python code; if you're finding them difficult to follow and understand, you should check out OpenCV University's free OpenCV Bootcamp and Python Bootcamp (go to opencv.org, University, Free Courses, to know more). Next, visualize_annotations is a driver function which calls draw_boxes and draw_landmarks, and here are a few visualizations.

Now on to the training configurations. We need the dataset YAML file; the model we'll be training; the number of epochs to train for, which we will reduce to one for now; our keypoint shape, which is 24 keypoints with x, y, and visibility for each; then the project name, the experiment name, and the class names. Next is the data configuration: the input image size to be fed to the model, the batch size, and the transformations such as mosaic and flip. Now create the train-config and data-config objects. And remember, we provided a dataset YAML file in the training config? Yeah, we actually need to create that file, so pass the dataset path, the train and validation image directories, the class names, and the keypoint shape, and dump it all to the YAML file.

Now we start training: we load the model and call model.train, passing all the configurations to it. Run this code and get yourself a coffee, or don't, and just stare at the screen. To learn more about this console log, you should check out our previous videos on custom training.

Now, to evaluate, we load our best trained model, which is from the first epoch, and check the model metrics. Keep in mind that we have only trained for one epoch, so it is bound to give bad results; in reality you are supposed to train the model until you reach your desired accuracy. But anyway, let's run a few predictions: this function runs the prediction, whereas this is the driver code. Let's check our results, and they are horrible, absolutely horrible. I'll post better results on the screen, predicted with a model that has been trained for 100 epochs.

That's all about pose estimation with YOLOv8 models. If you liked this video, why don't you check out "AI for Ocean Cleanup" with the YOLOv8 segmentation models, or other videos in the YOLO Masterclass playlist. Do comment on what you would like to see next, and don't forget to subscribe. Thanks for watching, and until next time.
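The JSON-to-YOLO conversion step the video walks through can be sketched in a few lines. This is a minimal illustration, not the code from the blog post: the entry field names (img_width, img_height, img_bbox, joints) are assumptions based on how the annotation dictionary is described, and mapping the dataset's 0/1 visibility flag to YOLO's 0/2 convention is likewise a simplifying assumption.

```python
def to_yolo_pose_line(entry, class_idx=0):
    """Convert one annotation entry (dict) to a single YOLO-pose label row.

    Assumed entry layout (per the description in the video, not verified
    against the real annotation files):
      entry["img_width"], entry["img_height"]  - image size in pixels
      entry["img_bbox"]  - [x, y, w, h] in pixels (top-left corner + size)
      entry["joints"]    - list of 24 [x, y, visibility] keypoints in pixels
    """
    W, H = entry["img_width"], entry["img_height"]
    x, y, w, h = entry["img_bbox"]

    # YOLO format stores the box centre and size, normalised to [0, 1]
    xc, yc = (x + w / 2) / W, (y + h / 2) / H
    wn, hn = w / W, h / H

    parts = [class_idx, xc, yc, wn, hn]
    for kx, ky, vis in entry["joints"]:
        # Keypoints are normalised too. YOLO pose uses v=2 for visible and
        # v=0 for missing; here the dataset's 0/1 flag is mapped to 0/2
        # as a simplifying assumption.
        parts += [kx / W, ky / H, 2 if vis else 0]
    return " ".join(f"{p:.6g}" for p in parts)
```

Each returned string would be one row in the image's .txt label file. The keypoint shape the video configures (24 keypoints, each with x, y, and visibility) corresponds to 24 triplets after the five box fields.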
Info
Channel: LearnOpenCV
Views: 3,827
Keywords: Computer Vision, Deep Learning, Artificial Intelligence, AI, ML, YOLOv8, Animal Pose, Pose Estimation, Ultralytics, Keypoint Detection, Dogs, Stanford Dogs Dataset, Fine-tuning, Wildlife Monitoring, Body Part Detection, computer vision tutorial, learn computer vision, computer vision basics, what is computer vision
Id: kb03ufEkOdA
Length: 8min 11sec (491 seconds)
Published: Mon Oct 09 2023