How to Train YOLOv8 Pose Estimation on a Custom Dataset for Real-time Performance

Captions
Hi everyone, I'm Rama, co-founder and CEO of Theos AI, and in today's video we're going to take a look at how you can train YOLOv8 on your own custom dataset, but this time for pose estimation instead of the usual object detection.
Alright, so the first step is to sign up to Theos, obviously, so go to the Theos AI website and click the Sign Up button here in the top right corner, fill in this form, and you will have a free account on Theos to follow this tutorial.
Alright, here we are in the library section of Theos, and if you go here to pose estimation, you will see the leaderboard with all the algorithms we have now. We currently have YOLOv8 in all of its versions: nano, small, medium, large and extra large. Which one you choose depends on your use case. We're going to use YOLOv8 pose-nano because it's the fastest one for processing videos in real time. If you care more about the precision of the keypoint detection, you would go for the extra large version, but you will lose a bit of speed, so it will be a bit slower. For us, the nano will be okay.
Now let's go to the Dataset section and create a new one. Let's call it People. Okay, so now we're going to drop a single image here to upload. I'm going to tell you why in a few seconds. The reason I uploaded only one image is that we now have to create the skeleton class of the person. Every new person that we label will start with the same skeleton, and then we will have to move the keypoints around to match the pose of that particular person. To get the first skeleton right, I wanted a picture like this one, with a person standing straight without moving much.
So yeah, let's do that. Let's click Start Labeling right here on the top right and create a new class called Person. Let's change the color here. And now we have to select the Pose Estimation tool. Let's go here and click the plus button.
Now let's add the keypoints. Let's add one here. One here. Here and here. Alright, so now here on the ears. The eyes. And finally the nose.
Alright, now we have to go here and add connections between the keypoints. So let's do that. Let's go ahead and change the color for this new connection. I'll use this one. Let's add yellow for the arms. To remind you, you click the first keypoint, then move the mouse to the other keypoint of the connection and click again, and that forms the connection between those two keypoints. Okay, now we have the first one labeled, so let's go ahead and confirm the skeleton.
Now it has disappeared, because next you have to draw a bounding box around the person. And as you can see, the keypoints are already here. Let's move them a bit because they are not in the right place. I'll do it like that. Alright, perfect. Now we can go ahead and submit this.
Let's go ahead and drop a few more images. Okay guys, so now I'm uploading 999 images to reach 1000 images in my dataset. Alright guys, let's now label them. You have to label all of the images in the dataset. If you want to label fewer than 1000 images, I recommend you just upload 100 or 500 images, but always label all of them before training. If you don't, the AI will be confused, because some of the images where people are indeed present won't have any labels. It will try to predict the person there, but you are telling it that there is no person there. So it will be confused, literally. So yeah, label all of the images.
If you want to see all of the shortcuts, you can click here on this button. It will show you the list of all the shortcuts we have for the dataset labeling tool. So, I'm going to press the E key now.
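Editor's note: the "label all of the images" rule above can be checked mechanically if the dataset is available on disk. The video does all labeling inside the Theos web interface and does not show an export, so the following is only a minimal sketch under an assumed layout: a YOLO-style folder of images plus a folder with one `.txt` label file per image. The function name `find_unlabeled_images` and the folder layout are hypothetical. An empty label file is legitimate for an image that really contains no people, so the result is a list to review, not a list of errors.

```python
from pathlib import Path

# Image file types to consider (assumption: adjust to your dataset).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_unlabeled_images(images_dir, labels_dir):
    """Return sorted file names of images whose label file is missing or empty.

    Assumes a YOLO-style layout where image "abc.jpg" is described by
    "abc.txt" in labels_dir. This layout is an assumption, not something
    shown in the video.
    """
    unlabeled = []
    for image_path in sorted(Path(images_dir).iterdir()):
        if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue  # skip non-image files
        label_path = Path(labels_dir) / (image_path.stem + ".txt")
        # Flag the image if there is no label file or it has no content.
        if not label_path.is_file() or not label_path.read_text().strip():
            unlabeled.append(image_path.name)
    return unlabeled
```

Calling `find_unlabeled_images("dataset/images", "dataset/labels")` (hypothetical paths) before training gives the images to double-check: either they still need labels, or they really contain no people.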
You can press the space key to go into hand mode while you are labeling. When I release the space key, I can make another bounding box if I want to. But let's stay here to move the keypoints around.
Alright guys, so I finished labeling all of my dataset images, and now I'm here in the training section of Theos. Let's create a new training session called pose estimation. Let's select our dataset. Now let's click here on pose estimation and the nano version, and click Next.
This is the machine that will do the network training. If you're not on the paid plan of Theos, you can either use Google Colab, which offers you a free GPU from Google, or connect your own GPU machine, which at the moment has to run Ubuntu with Python 3.10. By running these three commands, your computer will be connected to Theos and you can use it for training. I'm going to use Google Colab, so let's go ahead and click this.
Now we have to run the very few commands we have here. Let's try the first one. This is going to tell us if we have an NVIDIA GPU available here in the notebook, and yes indeed, we have one Tesla T4 GPU. Now I'm installing the Python package from Theos AI. You can ignore this error message here, there's no problem. Now let's run the Theos setup. Here you have to select, obviously, YOLO pose estimation.
Alright, so now we have to go ahead and restart the Colab runtime, as it says here: restart runtime. This is because we installed a few packages that were already installed in the Colab notebook, but we updated them to another version, so we have to refresh the runtime. Alright, now let's log in. Now let's go here, copy the project key and paste it here. Perfect. So now we have the Colab notebook connected to Theos. Let's click it and click Create to create the training session.
Okay, so the epochs are the number of times that the model will see all of the images in the dataset and try to predict their bounding boxes and keypoints.
This means that the model will see all of the images in the dataset 300 times, and with each iteration it will get better and better at predicting those keypoints and bounding boxes. The batch size is the number of images that the model will see in parallel. A larger batch makes the model train faster, but it also requires more GPU memory. I will leave it like that. If you want to try different values, feel free to do it, but make sure the batch size is not so big that it overflows the GPU memory. So be careful with that. Let's go ahead and click Start Training now.
Okay guys, so I stopped the training here because the fitness wasn't getting much better, as you can see here. The fitness is the main metric that you look at to see the overall performance of the model. So yeah, now let's go ahead and deploy this model to try it out.
Alright guys, so now I'm here in the deploy section of Theos. I created a new deployment called pose estimation with the best weights of the experiment that we made. Now let's drop some images here to test the model with images that it has never seen. Here you can see the JSON output that you will receive if you use the API. In the overview you can see this URL, and you will be able to send images to it with any programming language that you choose. If you go to the docs, you will be able to see all of the parameters that you can send to the API. Here you have a few examples of how to do it with the command line. Here you have a Python example of how to call the API with an image you have on your disk, and also how to use it in React if you're making a web application. You can also use this in React Native.
So yes, you can see it's working quite well. We can see that the model detects the person in the image and also the keypoints that we labeled. These are images of people the model has never seen before, so we can see that it generalizes very well.
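Editor's note: to put numbers on the epochs and batch size settings described in the training step, here is a minimal arithmetic sketch. The 1000 images and 300 epochs come from the video; the batch size of 16 is an assumption, since the video leaves the default value unstated. The helper name `training_iterations` is hypothetical, and the calculation ignores any train/validation split and the fact that the training was stopped early.

```python
import math


def training_iterations(num_images, batch_size, epochs):
    """Return (batches per epoch, total batches over all epochs)."""
    # The last batch of an epoch may be smaller, so round up.
    batches_per_epoch = math.ceil(num_images / batch_size)
    return batches_per_epoch, batches_per_epoch * epochs


# 1000 images and 300 epochs as in the video; batch_size=16 is an assumed value.
per_epoch, total = training_iterations(num_images=1000, batch_size=16, epochs=300)
print(per_epoch, total)  # 63 18900
```

Doubling the batch size roughly halves the number of batches per epoch, which is why training gets faster, but every batch then has to fit in GPU memory at once.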
Let's try a few more examples. Alright, working quite well, as you can see.
Alright guys, with that we've finished the tutorial. I hope you liked it. If you like this content, please like and subscribe if you want to see more of it, and also consider joining the WhatsApp group or the Discord server we have, called Theos AI University, where the Theos AI team and all the community members will be able to help you succeed in creating your own videos. Thank you very much for watching, and I'll see you in the next video. Bye bye.
Info
Channel: Theos AI
Views: 4,198
Id: H7gwTJzVzVk
Length: 12min 22sec (742 seconds)
Published: Fri Nov 03 2023