Yolov8 FULL TUTORIAL | Detection | Classification | Segmentation | Pose | Computer vision

Captions
YOLOv8 is one of the most powerful computer vision technologies ever made. In this video I'm going to show you how to do object detection, image classification, image segmentation and pose detection on your own custom data using YOLOv8. I'm going to show you the entire process: how to annotate your data, how to do the training, how to analyze the performance of the model you trained, and how to use the model to make predictions. This is a full, comprehensive tutorial on YOLOv8, and now let's get started.

Hey, my name is Felipe and welcome to my channel. In this video we are going to train an object detector using YOLOv8, and I'm going to walk you step by step through the entire process: how to collect the data you need in order to train an object detector, how to annotate the data using a computer vision annotation tool, how to structure the data into the exact format you need in order to use YOLOv8, how to do the training (I'm going to show you two different ways to do it, from your local environment and also from a Google Colab), and how to test the performance of the model you trained. This is going to be a super comprehensive, step-by-step guide to everything you need to know in order to train an object detector using YOLOv8 on your own custom dataset. So let's get started.

The first thing we need to do is collect data; data collection is the first step in this process. Remember that if you want to train an object detector, or any type of machine learning model, you definitely need data. The specific algorithm you are going to use, in this case YOLOv8, is very important, but the data is just as important as the algorithm: if you don't have data, you cannot train any machine learning model. So let me show you the data I am going to use in this process. These are some images I have downloaded and which I'm going to use in order to train this object detector; let me show you a few of them. These are images of alpacas: an alpaca dataset I downloaded for today's tutorial, and you can see these are all images containing alpacas in different postures and in different situations. This is exactly the data I am going to use, but obviously you could use whatever dataset you want: you could use exactly the same dataset I am going to use, or you can collect the data yourself. You could just take your cell phone or your camera and take the pictures you are going to use, doing your own data collection. Something else you could do is use a publicly available dataset. Let me show you one: this is the Open Images Dataset V7, which is publicly available, and you can definitely use it to work on today's tutorial and train the object detector we are going to train today. Let me show you what it looks like. If I go to "explore", select "detection", and unselect all these options, you can see this is a huge dataset containing many, many categories; I don't know how many, but there are a lot. It contains millions of images, hundreds of thousands if not millions of annotations, and thousands of categories: a super huge dataset.
You can see there are many different categories. Right now we are looking at "trumpet": these are different images with trumpets, and in each one of these images we have a bounding box around the trumpet. If I show you another one, for example "beetle", in this category we have many different images of many different types of beetles. Or if I show you this one, "bottle", we have many different images containing bottles; there you can see many different types of bottles, and in all cases we have a bounding box around the bottle. I could show you I don't know how many examples, because there are many, many different categories.

So remember, the first step in this process is data collection. The data I am going to use in this project is a dataset of alpacas, and you can use the exact same data if you want to; or you can collect your own dataset using your cell phone, your camera, or something like that; or you can download the images from a publicly available dataset, for example the Open Images Dataset V7. If you decide to use the Open Images Dataset V7, let me show you another category, which is "alpaca": this is exactly where I downloaded all of the images of alpacas from. In case you decide to use this publicly available dataset, I can provide you with a couple of scripts I used to download all this data, to parse through all the different annotations, and to format this data into the exact format we need for today's tutorial; they will be super useful for you. That's all I can say about data collection: remember you need to collect data if you want to train an object detector, and you have all those different ways to do it.
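The download scripts themselves aren't shown in the video; as one hedged illustration of the same idea, the FiftyOne library has built-in support for pulling Open Images V7 by class and exporting to a YOLO-style layout. This is a minimal sketch, not the author's scripts; the class name "Alpaca", the sample count, and the export path are assumptions:

```python
# Sketch (not the scripts from the video): download Open Images V7
# alpaca images with detection boxes via the FiftyOne dataset zoo.
# Assumes: pip install fiftyone; "Alpaca" / max_samples are illustrative.
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset(
    "open-images-v7",
    split="train",
    label_types=["detections"],   # we only need bounding boxes
    classes=["Alpaca"],           # assumed Open Images class name
    max_samples=500,              # keep the download small
)

# Export to a YOLO-format dataset (images + one .txt per image),
# the same structure we build by hand later in the tutorial.
dataset.export(
    export_dir="./alpaca-dataset",        # placeholder output path
    dataset_type=fo.types.YOLOv5Dataset,
    label_field="detections",             # field used by the zoo loader
)
```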
So now let's move on to the next step: data annotation. You have collected a lot of images, as I have over here, either yourself or downloaded from a publicly available dataset, and now it's time to annotate this dataset. Maybe you were lucky when you were creating the dataset and the data you are using is already annotated; maybe you already have all the bounding boxes for all of your objects and all your categories, in which case you don't really need to annotate your data. But in any other case, for example if you are using a custom dataset you collected yourself with your own cell phone or camera, you definitely need to annotate your data. So, in order to make this process more comprehensive and show you the entire pipeline, let me show you how to annotate data as well.

We are going to use CVAT, a labeling tool I have used many times in many projects; I would say it's one of my favorite tools. I have used pretty much all the object detection and computer vision related annotation tools, or at least many of them, and if you are familiar with annotation tools, you will know there are a lot of them and none of them is perfect: they all have their advantages and disadvantages, and for some situations you prefer one while for others another is better. CVAT has many advantages and a few disadvantages; I'm not saying it's perfect, but it is a tool I have used in many projects and I really like it. So let me show you how to use it.

You have to go to cvat.ai and then select "try for free". There are different pricing options, but if you are going to work on your own or in a very small team, you can definitely use the free version. I am already logged into my account, but if you don't have an account you will have to create a new one: you will see a sign-up page where you can create an account and then log into it. Once you are logged into this annotation tool, go to "projects" and create a new one. I'm going to create a project called "alpaca detector", because this is the project I am going to be working on, and I'm going to add a label; in my case there is only one label, "alpaca". That's pretty much all: submit and open. I have created the project and it has one label, alpaca; remember, if your project has many different labels, add all the labels you need. Then I will go here, to "create a new task": I am going to create a new annotation task and call it something like "alpaca detector annotation task 001". It belongs to the project "alpaca detector", and it will take all the labels from that project.

Now you need to upload all the images you are going to annotate. In my case I'm obviously not going to annotate all of them, because there are too many (452 images) and it doesn't make sense to annotate them all in this video; I'm going to select a few, just to show you exactly how this annotation tool works and how you can use it in your project. Also, since I downloaded these images from the Open Images Dataset V7, I already have the annotations, all the bounding boxes, so I don't really need to annotate this data; but I'm going to pretend I don't, so I can label a few images and show you how it works. So I go back here, select about this many images, open them, and click "submit and open". This will create the task and open it at the same time, so we can start working on our annotation process.

Okay, this is the task I have just created. I'm going to click on the job number, which will open all the images, and now I'm going to start annotating. We are working on an object detection problem, so we are going to annotate bounding boxes. We need to go here, and if we were detecting many different categories we would select which category we are going to label now; in my case I'm always labeling the same category,
alpaca, so I don't really need to do anything here. I'm going to select "shape", and let me show you how I do it: I click on the upper-left corner and then on the bottom-right corner. The idea is to enclose the object, and only the object; the idea is to draw a bounding box around the object. You can see we have other animals in the back, other alpacas, so I'm going to label them too. There is a shortcut, pressing the letter N, to create a new bounding box. So that's another one, this is another one, this is another alpaca, and this is the last one. That's pretty much all. Once you're ready, press Ctrl+S to save the annotations; I recommend pressing Ctrl+S as often as possible, it's always good practice. Now everything is saved, and I can continue to the next image.

Now we are going to annotate this alpaca, with exactly the same process. I can start here; obviously you can start at whatever corner you want, and I'm going to do something like this. Okay, this image is completely annotated; on to the next one. In this case I am going to annotate this alpaca too: it's not a real alpaca, but I want my object detector to be able to detect this type of object as well, so I'm going to annotate it.

This is a very good exercise, because if you want to work as a machine learning engineer or a computer vision engineer, annotating data is something you have to do very often; actually, training machine learning models is something you have to do very often. Usually the data annotation is done by other people, by annotators; there are different services you can hire to annotate data. But whatever service you use, it's always very good practice to annotate some of the images yourself, because then you become more familiar with the data, and you also become better at instructing the annotators on how to annotate this particular data.

For example, in this case it's not really challenging, we just have to annotate these two objects; and this one isn't confusing either, I just have to label that object. But there will always be situations that are a little confusing. A few images ago, when we were annotating that image with the fake alpaca, an annotator working on it would ask you: what do I do here, should I annotate this or not, is this an alpaca or not? If the instructions you provide are not clear enough, the annotator is going to come back with exactly those questions. Another situation was the image with many alpacas in the background, some of them, like this one, a little occluded: an annotator could ask you, do you want me to annotate absolutely every single alpaca, or can I just draw one huge bounding box over the background and say everything in the background is alpaca? When annotators work on images, they are going to have many different questions about how to annotate the data.
And they are all very good questions, because this is exactly what annotation is about: when you are annotating data, you are defining exactly what the objects you are going to detect are. What I'm saying is that if you annotate some of the images yourself, you will be more familiar with all the different situations and with what exactly is going on in your data, so you will be clearer about exactly which objects you want to detect.

Let's continue; this is only to show a few examples. Here is another situation: in my case I want to say that both of them are alpacas, so I'm going to do something like this. But another person could say no, this is only one annotation, and draw one bounding box enclosing both of them. That would be a criterion too, and I guess it would be fine; but whatever your criterion is, you need one. While you annotate some of the images yourself, you are going to further understand what exactly an alpaca is, what exactly the object is that you want to consider an alpaca.

I'm going to continue. This is another case which may not be clear, but I'm going to say this is an alpaca: this black one where we can only see this part and don't really see the head, I'm going to say it's an alpaca anyway. This one too, this one too, this one too. Also, something that always happens to me when I am annotating images is that I become more aware of the diversity of the images. For example, this is a perfect example: we have an alpaca being reflected in a mirror, and it's only a very small section of the alpaca, a very small piece of the alpaca's face. So what do we do here? I am going to annotate this one too, because that's my criterion; but another person could say no, this is not the object I want to detect, only this one is; and maybe another person would say no, this is not an alpaca, alpacas don't apply makeup, this is not real, so I'm not going to annotate this image. You get the idea: there can be many different situations, and the only way to get familiar with all of them is to annotate some of the images yourself.

So now let's continue. In my case I'm going to do something like this, because I would say the most important object is this one, and for the others it's not really that important whether we detect them or not. Okay, let's continue; this is very similar to another image. I don't know how many I selected, but I think we only have a few left. I don't know if this type of animal is natural... I'm very surprised by this one: it has a lot of hair on the head, and then the entire body is completely hairless. I don't know, maybe they are groomed like that, or maybe it's a natural alpaca. Who cares...
Let's continue and see how many we have left; only a few, so let's see if we find any other strange situation where we have to define whether something is an alpaca or not. Let me show you an additional example: when you are annotating, you can define your bounding box in many different ways. For example, in this case we could make it super tight, fitting exactly around the object, or we could be a little more relaxed; something like this would be okay too, and doing it like this would be okay too. You don't have to be super accurate; you can be a little more relaxed and it's going to work anyway. Now, this last one... I'm going to do something like this. I'm also going to take this; I think this is also an alpaca, but anyway, I'm just going to annotate this part. That's pretty much all; I'm going to save, and those are the few images I selected in order to show you how to use this annotation tool.

So that's all for the data annotation, and remember this is a very important step, a very important task in this process: if we want to train an object detector, we need data, and we need annotated data. Remember, this tool, CVAT, is only one of many available image annotation tools; you can definitely use another one if you want, it's perfectly fine, it's not like you have to use this one at all. But this is a tool I think is very easy to use; I like that it's very easy to use, and it's also a web application, so you don't need to download anything to your computer, you can just use it from the web; that's also one of its advantages. So this is the tool I showed you in this video for annotating the data for this object detector. That's all for this step; now let's continue with the next part of this process.

Now that we have collected and annotated all of our data, it's time to format this data, to structure it into the format we need in order to train an object detector using YOLOv8. When you are working in machine learning and training a machine learning model, every single algorithm you work with has its own requirements for how to input the data; that happens with absolutely every algorithm, it happens with all the different YOLO versions, and specifically YOLOv8 needs the data in a very specific format. So I created this step in the process so we can take all the data we have generated, all the images and all the annotations, and convert everything into the format we need to input into YOLOv8.

Let me show you exactly how we are going to do that. If you have annotated data using CVAT, go to "tasks", then select this option, "export task dataset". It's going to ask you for the export format: you can export this data into many different formats. You are going to scroll all the way down and you're
going to choose "YOLO 1.1". You can also save the images, but in this case it's not really needed: we already have the images. So just click OK. If you wait a few seconds, or a few minutes if you have a very large dataset, you will download a file like this, and if I open it, you will see four entries: actually three files and a directory. If I open the directory, you see many different file names, and if I go back to the images directory, you will see that the file names all look pretty much the same: the structure of these file names looks the same as the ones we just downloaded from CVAT. Basically, the way it works is that when you download your data in the YOLO format, every single annotation file is downloaded with the same name as the image you annotated, but with a different extension: if you have an image called something.jpg, then the annotation file for that specific image will be something.txt. That's the way it works.

If I open one of these files, you will see something like this: in this case only one row, but let me show you another one that contains more than one annotation; I remember there were several, for example this one with two rows. Each one of these rows is a different object; in my case, as I only have alpacas in this dataset, each row is a different alpaca. This is how you make sense of this information: the first number is the class, the class you are detecting (I wanted to enlarge the entire file and I don't know what I'm doing there... okay). The first number is the class you are detecting; in my case I only have one, so it's always a zero, because it's my only class. Then there are four numbers which define the bounding box. This is encoded in the YOLO format, which means the first two numbers are the position of the center of the bounding box, then you have the width of the bounding box, and then the height. You will notice these are all float numbers; this basically means they are all relative to the size of the image.

So these are the annotations we downloaded, and this is exactly the format we need to train this object detector. Remember, when I was downloading these annotations, we noticed there were many different options; all of those options are different formats in which we could save the annotations, and this is very important: you definitely need to download YOLO, because we are going to work with YOLO, and then everything is pretty much ready to input into YOLOv8. If you have your data in a different format, maybe because you had already collected and annotated your data some other way, please remember you will need to convert those annotations into the YOLO format.
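As a concrete illustration, a label file for an image with two alpacas would contain two rows like the following, where each row is `class_id x_center y_center width height` and the last four values are fractions of the image width/height. The numbers here are made up for illustration, not taken from the actual dataset:

```
0 0.481250 0.557292 0.304688 0.426042
0 0.827344 0.513542 0.201562 0.262500
```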
That is one part of getting the data ready for YOLOv8. The other thing we need to do is create two very specific directories containing this data. We are going to need two directories: one of them must be called "images" and the other must be called "labels". You cannot choose whatever names you want; you need these two names, because that's the way YOLOv8 works: the images must be located in a directory called images, and the labels in a directory called labels. Within your images directory is where you will have your images: if I click here, you can see these are all my images; they are all within the train directory, which is within the images directory. This extra directory is not strictly needed: you could perfectly well take all your images and paste them directly into the images directory, and everything would be just fine. But if you want, you can do exactly what I did here and have an additional directory between "images" and your images, named whatever you want. This is a very good strategy if you want, for example, a train directory containing all the training images, another directory, called validation for example, with the images you will use to validate your training process, and perhaps another called test; or you can use these directories to organize different versions of your data, which is also very commonly done. So you can create many directories for many different purposes, and that is perfectly fine, or you can just paste all the images directly; that's also perfectly fine.

You can see that for the labels directory I did exactly the same: we have a directory called train, and within it all these different files. And for each one of these files (let me show you like this, it's going to be much better), for each one of these .txt files, we have an image in the images directory that is called exactly the same: exactly the same file name, but a different extension. In this case this one is called .txt and this one is called .jpg, but it's exactly the same file name: for example, the first label file is called 0a2ea8f... and so on, and that's exactly the same name as the first image in the images directory. So basically, for absolutely every image in your images directory, you need a file in the labels directory that is named exactly the same, but with a different extension: if your images are .jpg, your annotation files are .txt.
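A quick way to sanity-check this one-to-one pairing is a small script like the following. This is just a sketch: the `data/images/train` and `data/labels/train` paths assume the layout described above, so adjust them to your own dataset:

```python
# Sketch: verify every image in images/train has a matching label file
# in labels/train (same file stem, .txt extension).
from pathlib import Path

images_dir = Path("data/images/train")
labels_dir = Path("data/labels/train")

missing = []
for image_path in images_dir.glob("*.jpg"):
    label_path = labels_dir / (image_path.stem + ".txt")
    if not label_path.exists():
        missing.append(image_path.name)

print(f"{len(missing)} image(s) without a label file")
for name in missing:
    print("  missing label for:", name)
```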
Also, the labels should be expressed in the YOLO format: as many rows as there are objects in that image, and every row with the same structure. You have five values: the first is the class ID (in my case I only have one class, I'm only detecting alpacas, so this number will always be zero, but if you are detecting more than one class you will have different numbers); then the X and Y position of the center of the bounding box; then the width; and then the height. And everything is expressed in relative coordinates. So basically this is the structure you need for your data, and this is what this step is about. That's pretty much all about converting, about formatting the data; now let's move on to the training, where we take all this data and train our object detector using YOLOv8.

So, now that we have the data in the format we need, it's time for the training: time to take this custom dataset and train an object detector using YOLOv8. This is the official YOLOv8 repository. One of the things I like most about YOLOv8 is that in order to train an object detector, we can either do it with Python, with only a few Python instructions, or we can use a command line utility: we can execute a command like this in our terminal, something that looks like this, and that's pretty much all we need to do. That's something I really liked, and something I'm definitely going to use in my projects from now on, because I think it's a very convenient and very easy way to train an object detector, or a machine learning model. So that's the first thing to notice about YOLOv8: there are two different ways to train an object detector, either in Python, as we usually do, or by running a command in the terminal. I'm going to show you both, so you are familiar with both ways. And as I mentioned, I'm going to show you the entire process both in a local environment, in a Python project, and in a Google Colab: I know there are people who prefer to work in a local environment (I am one of those people), and others who prefer a Google Colab, so depending on which group you are in, you can choose the one you like the most.

So let's start, and now let's go to PyCharm. This is a PyCharm project I created for this training, and this is the file we are going to edit in order to train the object detector. The first thing I'm going to do is copy a few lines from the repository and remove everything we don't need: we want to build a new model from scratch, so we keep this sentence; and then we want to train a model, so we remove everything but the first sentence of the training example. That's all: these are the two lines we need in order to train an object detector using YOLOv8.
Now we are going to make some adjustments. Obviously, the first thing we need to do is import ultralytics, the library we need in order to import YOLO and train a YOLOv8 model. This is a Python library we need to install as we usually do: go to the terminal and run pip install with the library name. In my case nothing happens because I have already installed it, but please remember to install it. Also, please mind that this library has many, many dependencies, so you are going to install a lot of different Python packages: it will take a lot of disk space, so be ready for that and make sure you have space available; it will also take some time, because you are installing many packages. Anyway, remember to install the library; then these are the two sentences we need in order to run this training from a Python script.

The first sentence we leave as it is: this is where we load the specific YOLOv8 architecture, the specific YOLOv8 model we are going to use. You can see we can choose from any of these models; they are different sizes of YOLOv8: nano, small, medium, large and extra large. We are using the nano version, the smallest and lightest one: YOLOv8n. Then, for the training, for the other sentence, we need to edit this argument: we need a YAML file that contains all the configuration for our training. I have created this file and named it config.yaml; I'm not sure it's the most appropriate name, but it's the one I chose. So I'm just going to edit this parameter and input config.yaml; the config.yaml is located in the same directory as main.py, so if I do this it will work just fine.

Now let me show you the structure of this config.yaml. You can see it's a very, very simple configuration file; we only have a few keys: path, train, val, and then names. Let's start with names: this is where you set all your different classes. You are training an object detector, you may be detecting many different categories, many different classes, and this is where you type all of them. In my case I'm only detecting alpacas, so I have only one class: it's number zero and it's called alpaca. But if you are detecting additional objects, please remember to include the full list of all the objects you are detecting. Then, about the other three arguments: path is the absolute path to your directory containing images and annotations, and please remember to make it an absolute path.
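Put together, the file looks something like this minimal sketch; the absolute path is a placeholder you would replace with your own:

```yaml
# config.yaml — a minimal sketch; replace 'path' with your own
# absolute path to the directory containing images/ and labels/.
path: /home/user/alpaca-dataset   # placeholder absolute path
train: images/train               # images, relative to 'path'
val: images/train                 # same data reused for validation here

names:
  0: alpaca
```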
I ran into some issues when I was trying to specify a relative path: relative from this directory, my current directory where this project is created, to the directory where my data is located. When I was using a relative path I had some issues, and then I noticed other people had issues as well: in the issues section of the YOLOv8 GitHub repository, there were other people reporting problems when specifying a relative path. The way I fixed it, and it's a very easy way to fix it, is to just specify an absolute path. Remember: this should be an absolute path, the path to the directory containing the images and labels directories. Then you have to specify the relative path from that location to where your images are located; in my case they are in images/train relative to that path: if I show you this location, which is my root directory, and I go to images/train, this is where my images are located. That's exactly what I need to specify, and this is the train data, the data the algorithm will use as training data. Then we have another keyword, val, for the validation dataset: in this case we are going to specify the same data we used for training. The reason I'm doing this is that we want to keep things simple in this tutorial; I'm just going to show you the entire process of training an object detector using YOLOv8 on a custom dataset, so I'm just going to use the same data. That's pretty much all for this configuration file.

Now, going back to main.py: that's pretty much all we need in order to train an object detector using YOLOv8 from Python. That's how simple it is.
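For reference, the whole script comes down to a few lines along these lines (the yolov8n.yaml model definition and the config.yaml data file are the ones discussed above; a single epoch is only a smoke test):

```python
# main.py — a minimal sketch of the training script described above.
from ultralytics import YOLO

# Build a new YOLOv8 nano model from scratch (architecture definition).
model = YOLO("yolov8n.yaml")

# Train on the dataset described in config.yaml; one epoch is just a
# smoke test to confirm everything runs end to end.
results = model.train(data="config.yaml", epochs=1)
```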
So now I'm going to execute this file. I'm going to change the number of epochs and run it for only one epoch, because the only thing I want to show you for now is how it is executed; once we see that everything is up and running and working fine, we can continue. So let's do this training for only one epoch. You can see it's loading the data; it has already loaded the data, and you can make use of all the debugging information you see here: we were loading 452 images, and we were able to load all of them, 452 out of 452. If I scroll down, you can see additional values related to the training process: this is how the training is going, and this is the information we are given through the process. For now the only thing we have to do is wait until the process is completed, so I am going to stop the video here and fast forward to the end of this training, and let's see what happens.

Okay, the training is now completed, and you can see we have an output that says "results saved to runs/detect/train39". If I go to that directory, runs/detect/train39, you can see we have many different files, and these files describe how the training process went. For example, these images are a few batches of images that were used to train this algorithm; you can see the names are train_batch0 and train_batch1, and I think we have a train_batch2: a lot of different images of a lot of different alpacas used for training, all concatenated into these huge mosaics, so we can see exactly which images were used for training, with the annotations, the bounding boxes, drawn on top of them. We also have similar images for the validation dataset (remember, in this case the validation data is the same data we used for training, not different data): these were the labels in the validation set, and these were the predictions on the same images. You can see we are not detecting anything: we don't have a single predicted bounding box. That's because we did a very shallow, very dummy training, only one epoch; this was only an example to show you the process and what the output looks like, not a real training. Nevertheless, these are files I'm going to show you in more detail in the next step.

For now, let me show you how the training is done from the command line, from the terminal, using a command like the one I showed you earlier. Going to the terminal, we type something like: yolo detect train, then data, where I specify the configuration file, config.yaml; then model, yolov8n.yaml; and then the number of epochs. This is exactly the same as we did in Python, and it will produce exactly the same output. I'm going to change the number of epochs to one so it matches, and let's see what happens. You can see we get exactly the same output: we have loaded all the images, and now we are starting a new training process; after it, we will have a new run, a new directory, train40, where all the information related to this training process will be saved. I'm not going to run it to the end, because it will be exactly the same as before, but this is exactly how you can use the command line utility to do this training from the terminal. You can see how simple it is; it's just amazing.
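Spelled out, the command is something like this (same config and model as the Python version; epochs kept at 1 for the test run):

```bash
# Same training as the Python script, but via the ultralytics CLI.
yolo detect train data=config.yaml model=yolov8n.yaml epochs=1
```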
And now let me show you how everything is done from a Google Colab. Let's go back to the browser so I can show you the notebook I created in order to train YOLOv8 from a Google Colab. If you're not familiar with Google Colab: the way you create a new notebook is to go to Google Drive, click New, then More, and select Google Colaboratory; this creates a new Colab notebook, and you can just use that notebook to train this object detector. Now let me show you my notebook: you can see it contains only five cells; that's how simple this will be.

The first thing you need to do is upload the data you are going to use to train this detector: exactly the same data as we used before, the same images and labels directories. Then the first cell we execute mounts Google Drive into this instance of Google Colab: the only thing I do is press Enter on this cell, and it may take some time, but basically all it does is connect to Google Drive so we can access the data we have there. I select my account, then Allow, and that's pretty much all. Then it all comes down to where you have the data in your Google Drive, the specific directory where you uploaded it: in my case my data is located at this path; this is my home in Google Drive, and this is the relative path to where I have the data and all the files related to this project. So remember to set this root directory to the directory where you uploaded your data, and then I execute this cell to save that variable.

Next, I execute this other cell, which is pip install ultralytics, the same command I ran from the terminal in my local environment; now I'm running it in Google Colab. Remember you have to start this command with an exclamation mark, which means you are running a command in the terminal of the machine where this notebook is being executed. Everything seems to be ready, so we can continue to the next cell, and you can see it has exactly the same structure, exactly the same lines as in our local environment: we import ultralytics, define the YOLO object, and call model.train, exactly as we did before. Obviously we need another YAML file, a YAML file in our Google Drive, and this is the file I have specified: it has exactly the same configuration as the YAML file I showed you in my local environment, the same idea; the only difference is that now you should specify an absolute path to your Google Drive directory.

And I see I have a small mistake: I have "data" here, but I just uploaded images and labels directly, not within another directory called data. So let me fix that: I'm going to create a new directory called data and put images and labels inside it, so everything is consistent. Now everything is okay: images, then train, and the images are within this directory. Let's go back to the Google Colab. Every time you make an edit or change something in Google Drive, it's always a good idea to restart your runtime, so that's what I'm going to do, and then execute the cells again. I don't need to pip install the library again, because it's already installed in this environment. And I think I have to make one additional edit: the config file is now called google_colab_config.yaml. That's pretty much all; I'm just going to run it for one epoch, so everything is exactly the same as we did in our local environment.
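As a sketch, the notebook cells amount to something like the following; `ROOT_DIR` and the project path in Drive are placeholders matching the description above, not the exact paths from the video:

```python
# Cell 1: mount Google Drive into this Colab instance.
from google.colab import drive
drive.mount('/content/gdrive')

# Cell 2: where the data and config live in Drive (placeholder path).
import os
ROOT_DIR = '/content/gdrive/MyDrive/alpaca-project'  # assumed location

# Cell 3 (shell cell): install the library; the leading '!' runs it in
# the machine's terminal.
# !pip install ultralytics

# Cell 4: same training code as the local script, pointing at the
# config file stored in Drive.
from ultralytics import YOLO

model = YOLO('yolov8n.yaml')
results = model.train(
    data=os.path.join(ROOT_DIR, 'google_colab_config.yaml'),
    epochs=1,
)
```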
Now let's see what happens. You can see we are doing exactly the same process; everything looks pretty much the same as it did before: we are loading the data, loading the model, everything is going fine, and this will be pretty much the same process as before. You can see it now takes some additional time to load the data: you are running this notebook in one environment and taking the data from your Google Drive, so it's a slower process, but it's definitely the same idea. The only thing we need to do now is wait for the process to complete, and I don't think it makes sense to wait, because it will be exactly the same process we ran in our local environment. At the end of this execution, all the results will be in a given directory of the machine running this notebook, so at the end of the process, please remember to execute this command, which takes the runs directory (containing all the runs you have made and all the results you have produced) and copies it into the directory you have chosen for your files and your data in your Google Drive. Please remember to do this, because otherwise you will not be able to access the results of everything you have just trained.
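The exact command isn't spelled out here, but copying the results back to Drive can be done with something along these lines; both paths are assumptions to adjust to your own notebook and Drive layout:

```python
# Sketch: copy the 'runs' directory (all training outputs) into Drive
# so it survives the Colab session. Both paths are placeholders.
import shutil

shutil.copytree(
    '/content/runs',                                  # where runs were written
    '/content/gdrive/MyDrive/alpaca-project/runs',    # assumed Drive location
    dirs_exist_ok=True,
)
```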
And that's it: that's how you can train an object detector using YOLOv8 in a Google Colab. You can see the process is very straightforward, and it's pretty much exactly the same idea as in our local environment. So that's how easy it is to train an object detector using YOLOv8 once you have done everything we did with the data: once you have collected the data, annotated it, and put everything into the format YOLOv8 needs, running the training is super straightforward. That's all for the training process; now let's continue with the testing, and see how the models we trained perform.

This is the last step in this process: we are going to take the model we produced in the training step and test how it performs. This is how we complete this training of an object detector using YOLOv8. Once we have trained a model, we go back to the directory I showed you before, the one where all the information about the training process was saved. Obviously I'm not going to show you the training we just did, because it was a very shallow, very dummy training; instead I'm going to show you the results from another training I did when I was preparing this video, where I ran exactly the same process but trained for 100 epochs, a deeper training. Let me show you all the files we produced, so you know what tools you have to test the performance of the model you trained.

Basically, you have a confusion matrix, which gives you a lot of information about how the different classes are predicted, or how the different classes are confused with each other. If you are familiar with how a confusion matrix should look, you will know how to read this information. In my case I only have one class, alpaca, but you can see it generates another, default category, "background", and we have some information here. It doesn't really say much: it says how the classes are confused, but given that this is an object detector, I think the most valuable information is in other metrics, in other outputs, so we are not really going to mind this confusion matrix. Then you have some plots, some curves; for example, this is the F1-confidence curve. We are not going to mind this plot either: remember, we are just starting to train object detectors with YOLOv8; the idea of this tutorial is to be a very introductory process, and it takes a lot of knowledge and expertise to extract all the information from these plots, so that's not the idea here.

Let's do things differently and focus on this plot, which is also available in the results saved to this directory. You can see we have many, many different plots: ten of them. You could definitely go crazy analyzing all the information you have here, you could knock yourself out extracting everything from these plots, but again, the idea is to keep this a very introductory video and tutorial. So, long story short, I'm going to give you one tip, the one thing you should focus on in these plots for now. If you are going to take one thing from this video about testing the performance of a model you trained with YOLOv8, it's this: make sure your loss is going down. You have many plots, and some of them are related to the loss function: this one, this one and this one for the training set, and these for the validation set. Make sure all of your losses are going down.
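If you prefer inspecting the numbers directly rather than the pre-rendered plots, the run directory also contains a results.csv you can plot yourself. A sketch, assuming the run path from a training like the ones above; column names can vary between ultralytics versions, hence the strip():

```python
# Sketch: plot training/validation box loss from the results.csv that
# ultralytics writes into the run directory. Adjust the path to your run.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('runs/detect/train/results.csv')
df.columns = [c.strip() for c in df.columns]  # headers may be padded

plt.plot(df['epoch'], df['train/box_loss'], label='train box loss')
plt.plot(df['epoch'], df['val/box_loss'], label='val box loss')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend()
plt.show()
```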
Making sure the loss goes down is a very simple way to analyze these plots, but I would say it's more powerful than it appears, because given the loss function we could be in several different situations. We could have a loss that is going down, which I would say is a very good situation. We could have a loss that started going down and then flattened out into something like a flat line: if we are on a flat line, it means the training process has plateaued. That could actually be a good thing, because maybe the model really has learned everything it had to learn from this data; a flat line is not necessarily bad, but you would have to analyze other things to know. Or we could have a loss function that is going up: and if you, my friend, have a loss function that is going up, then you have a huge problem; something is obviously not right with your training. That's why I'm saying that analyzing what happens with your loss gives you a lot of information: ideally it should go down, and if it's going down then most likely everything is going well; if it's something like a flat line, it could be a good thing or a bad thing, we could be in different situations; but if it's going up, something is clearly wrong, in your code or in your training process. It's a very simple, even naive, way to analyze all this information, but trust me, it gives you a lot to start with when testing the performance of a model.

That said, I would say that looking at the plots and analyzing all this information is more of a research thing; it's what people who do research like to do, and I'm more of a freelancer, I don't really do research. So I'm going to show you another way to evaluate the performance of the model we just trained, one that from my perspective makes more sense.
It involves seeing how the model performs on real data: on the data you run your inferences on, to see what happens. The first step in this more practical, more visual evaluation is looking at these images. Remember that before, when we looked at these images, we had this one showing the labels in the validation set, and this other one with the predictions, which was completely empty. Now you can see the predictions we produced are no longer empty, and we are detecting the position of our alpacas very accurately. We have some mistakes: for example, here we are detecting a person as an alpaca, here we are also detecting a person as an alpaca, and we have some misdetections; for example, this should be an alpaca and it's not being detected. But you can see the results are pretty much okay. Same over here: we are detecting pretty much everything; we have a misdetection here, and an error over here, because we are detecting an alpaca where there is actually nothing. So things are not perfect, but everything seems pretty much okay.

That's the first way we are going to analyze the performance of this model, and it's worth a lot, because it's a very visual way to see how it performs: we are not looking at plots, we are not looking at metrics; we are looking at real examples and seeing how the model performs on real data. Maybe I am biased towards analyzing things this way because I'm a freelancer, and the way it usually works as a freelancer is: if you are building a model to deliver a project for a client, and you tell your client "the model was perfect, take a look at all these plots, all these metrics, everything was just amazing", and then the client tests the model and it doesn't work, the client will not care about all the pretty plots. That's why I don't really mind the plots that much; I prefer a more visual evaluation.

So that's the first step, and we can already see we are getting a better, an okay performance. But the data we are looking at right now, remember, is the validation data, which was pretty much the same data we used for training, so this doesn't really say much. Now I'm going to show you how the model performs on data the algorithm has never seen, completely and absolutely unseen data, which is a very good practice if you want to test the performance of a model. I have prepared a few videos, so let me show you.
remember, this is completely unseen data, and this is the first video. You can see this is an alpaca which is just being an alpaca, just walking around, doing its alpaca stuff, having an alpaca everyday life; it's walking from one place to the other, doing nothing, or rather doing its alpaca stuff, which is a lot. This is one of the videos I have prepared; this is another video, also an alpaca doing alpaca-related stuff; and I also have another video over here. So I'm going to show you how the model performs on these three videos. I have made a script in Python which loads these videos and just calls the predict method from yolo V8: we are loading the model we have trained, applying it to these videos, and seeing how it performs (you can see a sketch of this script below). This is the first video I showed you, and these are the detections we are getting; you can see we are getting a nearly perfect detection. Remember, this is completely unseen data; I'm not going to say the detections are 100% perfect, because they're not, but I would say they are pretty good as a starting point in this training process. That's one of the examples; now let me show you another one, which is the other video I showed you, and you can see we are also detecting exactly the position of the alpaca. In some cases the text goes outside the frame because we don't really have space, but everything seems to be okay in this video too: we are detecting exactly the position of this alpaca, and in some cases the bounding box is not really tight around the alpaca's face, but everything seems to be working fine. Then in the third video you can see the detection is a little broken at first and we have many misdetections, but then everything gets much better, so it's working reasonably well in this case too. I would say that of these three examples, this one performs best, and I also really like how it performed in the one where the alpaca was starting its alpaca journey.
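For reference, the script I'm describing is very short; here is a minimal sketch of it, assuming a hypothetical video file name and an example path to the trained weights (your run directory may differ):

```python
import cv2
from ultralytics import YOLO

# Load the detector we trained (example path; use your own run directory)
model = YOLO('./runs/detect/train/weights/last.pt')

cap = cv2.VideoCapture('./alpaca_video.mp4')  # hypothetical video file
while True:
    ret, frame = cap.read()
    if not ret:
        break
    results = model(frame)         # run the detector on this frame
    annotated = results[0].plot()  # draw the predicted boxes and labels
    cv2.imshow('detections', annotated)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
```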
In that one we have a very good, very stable detection; then it breaks a little, but nevertheless I would say it's okay, it's also detecting this alpaca over here, so it's working pretty much okay. That's how we are going to do the testing in this phase. Remember that if you want to test the performance of the model you have just trained using yolo V8, you will have a lot of information in the directory which is created at the end of your training process: you will have all of these files and plenty of information to knock yourself out, to go crazy analyzing all these different plots. Or you can just keep it simple and take a look at what happened with the training loss, the validation loss and so on; make sure all the loss functions are going down, that's the very least thing you need to check, and then see how the model performs on a few images or a few videos, take a look at how it performs on unseen data, and make decisions from there. Maybe you can use the model as it is, or you can decide to train it again. In this case, if I analyze all this information, I see that the loss functions are going down, and not only are they going down, but there is a lot of room to improve this training, to improve the performance, because we haven't reached that moment where everything appears to be stuck, that flat line; we are very far away from there. So that's something I would do: a new, deeper training, so we can continue learning from this process. I would also change the validation data for something that's completely different from the training data, so we have even more information. That's pretty much what I would do in order to iterate, in order to make a better, more powerful model. Hey, my name is Felipe and welcome to my channel. In this video I'm going to show you how to make an image classifier using yolo V8 on your own custom data. I'm going to show you every single step of this process: how to organize the data so it complies with yolo V8, how to do the training on your local computer and also from a Google Colab, how to validate the performance of the model you trained, and finally how to take the image classifier and make new predictions. I'm going to show you the entire process; this is going to be an amazing tutorial, and now let's get started. So, in today's tutorial I'm going to show you how to train an image classifier using yolo V8 on your own custom dataset. The first thing I'm going to do is show you the data I am going to use in this tutorial, which is a weather-related dataset. Let me show you the different categories we have and how the images look. We have four different categories, and they are cloudy, rain, shine and sunrise. Now let me show you each one of these categories. For example the cloudy category: this is how the images look, and you can see that in each one of these images we have a sky which is completely cloudy, many different clouds in each image. The sunrise category is basically many different pictures of sunrises, so that's how that category looks. And for the shine category we have a sky which is completely clear, with a super bright sun; you have
the sun in each one of these images, and it's super bright. And this is the rainy category; you can see these are many different pictures of very rainy days. So this is basically the dataset I am going to use in this tutorial, but obviously you can apply absolutely everything I'm going to show you today to any type of dataset; you are going to be able to build any type of image classifier with everything I'm going to say in this tutorial. Now let me show you the structure you need for your data, because if you're going to train an image classifier with yolo V8, yes, the data is super important, but you also need to structure it, to give a format to all of your data, so it complies with the way yolo V8 expects your data to be. yolo V8 requires your data to be in a given format, in a given structure, so I'm going to show you exactly how to structure your file system so everything looks the way it should in order to train an image classifier using yolo V8. If I show you, I have a directory which is called weather dataset; this is going to be the root directory. You can call this directory whatever you want, but you need a directory which is going to be your root directory, and inside it you can see we have two different folders: one of them is called train and the other one is called val, and this is exactly where you are going to have your training dataset and your validation dataset. It's very important you name these directories exactly like this: one of them should be called train and the other one val. Now, within the train directory is where we are going to have our four directories containing all the images for all of our categories; basically you need as many directories as categories you want to classify with your model. In my case I want to classify an image into four different categories, and this is why I have four different directories, each one named after the category I want to classify my images into: one of them is called cloudy, the other one rain, then shine, and then sunrise, and these are the categories I want to classify all my images into. Then within these directories, these folders, is where I have all my data: within cloudy is where I have all the data related to the cloudy category, and the same happens for the rain, shine and sunrise categories. This is basically the structure you need for your data, the structure you need for your file system in order to comply with what yolo V8 expects. Then if I go to the val folder, you can see I have exactly the same structure: four different directories, named after the categories I want to classify my images into, and if I open one of these directories you can see I only have images for that specific category. Now, this is very important, because from now on everything is going to be super straightforward if you have created this structure for your file system; if your data is exactly in the structure I showed you, it's going to be super simple to train an image classifier with yolo V8.
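To make the expected layout concrete, here is the directory tree I'm describing; the root name and the category folders are the ones from this weather dataset, yours will be different:

```
weather_dataset/          <- root directory (any name you want)
├── train/
│   ├── cloudy/           <- all images of the cloudy class
│   ├── rain/
│   ├── shine/
│   └── sunrise/
└── val/
    ├── cloudy/
    ├── rain/
    ├── shine/
    └── sunrise/
```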
Now I'm going to show you three different ways in which you can train an image classifier using yolo V8. Let's start with the first way, which is using a Python script: we are going to make a very simple script in Python in order to train this model, and let me show you how to do it. Let's go to PyCharm; this is a PyCharm project I created for today's tutorial, and the first thing you should do if you want to work with yolo V8 is to install a couple of dependencies, a couple of Python packages. These are the two packages we are going to use in this tutorial: one of them is ultralytics and the other one is numpy. ultralytics is very important, because this is exactly the library you need in order to import yolo, in order to train this model using yolo V8, so you definitely need these two packages. Now, in order to install them, I'm going to show you a way which is going to work whatever your OS is: if you are a Linux user, a Windows user, or on a Mac, it doesn't matter, it's going to work anyway. You need to go to File, then Settings, and then select Python Interpreter; this is the Python interpreter we are going to use, and you can see that I'm using Python 3.8. Then you need to click on the plus sign, and this is where you're going to search for the packages you want to install. In my case I'm going to search for ultralytics, and the version I'm going to use, let me copy the version first, it's this one; so I go to File, Settings, search for ultralytics again, enter that version, and click on Install Package. In my case I have already installed this dependency, so nothing is going to happen on my computer, but please remember to do it on yours, because otherwise you will not be able to do anything of what we are going to be doing today. Now let's see numpy: the version we are going to use is 1.24.2, so again File, Settings, plus, numpy, 1.24.2, and then Install Package. Everything is okay, numpy has been installed successfully, so now we are ready to continue; once you have installed these two dependencies, these two packages, you're ready to train your own image classifier using yolo V8.
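If you'd rather install from a terminal instead of the PyCharm interface, the equivalent pip command would be something like the following; I'm leaving the ultralytics version unpinned here, since the exact version was only shown on screen:

```
pip install ultralytics numpy==1.24.2
```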
Now let's go to main; this is the file we are going to use in order to code everything we need to train this classifier, and let me show you exactly what code you need to type. I'm going over here to the GitHub repository of yolo V8, and I'm going to select the classification section, then click on classification docs; this is going to open a new page, and this is exactly all the information we need in order to train this image classifier. I'm going to scroll down to the train section, and I'm going to copy and paste the line in the middle, the one that says load a pretrained model, recommended for training; I copy it and go back to PyCharm and paste it. Obviously we need to import yolo, otherwise this is not going to work, so I'm going to say from ultralytics import YOLO, and that's pretty much all; you can see now we are creating our model, the object we are going to use as our model. Then I'm going to copy and paste the last line, which is model.train, paste it here, and make a few edits. I'm going to leave the image size at 64, but for the number of epochs I'm going to set it to 1, because the first thing we're going to do is a very dummy training, in order to make sure everything works as expected, and once we are completely, 100% sure everything is okay, we are going to move forward with a deeper, more real training. For now, let's just do the training for one epoch and see how it goes. Then for data, this is where you're going to specify the absolute path to the data you are going to train this model with; in my case it's going to be this weather dataset, so I'm just going to copy the absolute path of this dataset, which is this... I'm going to copy this path and paste it here. This is the data I am going to use; remember that you need to specify the absolute path to the root directory of your data, and remember you need to structure your data into the exact format I already mentioned, otherwise this is not going to work. And that's everything we need in order to train this image classifier, so the only thing I'm going to do is press play and run this script; let's see what happens. Remember, we are running this training for only one epoch, because we need to make sure everything works properly, and once it does, we will just edit this value and train for more epochs. You can see everything seems to be working properly, everything seems to be completed and ready, so that's it, and you can see that the results have been saved in runs/classify/train12. Let me show you exactly where that location is in my file system. If I go to the PyCharm project I created, this is the project, this is the file we are currently working in, the main.py file, this is where my data is located, and this is where the runs directory, the runs folder, will be created. You can see that within runs we have another directory which is called classify, and here is where you will have a folder for each one of your training processes; in my case I have trained this classifier many different times while I was preparing this video, so there are many directories for me, but this is exactly the one which was just created: train12. If I open this directory, you can see we have another directory and two files. I'm going to explain exactly what all these different files and folders are and what information they contain, but I'm going to do it later in this tutorial, when we are validating this training process. For now, just remember all the results will be saved within the runs folder, then within classify, and then a new folder will be created for the training process you have just executed; later in this tutorial I'm going to show you exactly how you can validate the training using the information within this directory.
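Before moving on, here is the entire script we just ran, assembled in one place; a minimal sketch, assuming the yolov8n-cls.pt checkpoint from the docs and a placeholder dataset path you would replace with your own:

```python
from ultralytics import YOLO

# Load a pretrained classification model (the line recommended in the docs)
model = YOLO('yolov8n-cls.pt')

# One-epoch dummy run first, to verify everything works end to end;
# data must be the absolute path to the root folder containing train/ and val/
model.train(data='/path/to/weather_dataset', epochs=1, imgsz=64)
```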
But for now let's continue; I'm going to show you a different way to train this image classifier using yolo V8: from the command line, using the yolo utility, which is actually a very straightforward way to do this training. Let me show you: you can see that we have three different examples; I'm just going to select this one, copy this instruction, and paste it here, and you can see that we have many different parameters. The first word is yolo; this is the utility we are going to execute. Then classify: this is the task, because we are going to train an image classifier. Then we have another keyword, which is train, because we are going to train it, and then we have these arguments: data, model, epochs, and also image size. I'm going to do the same with image size as before and just leave this value at 64, but I'm going to edit the other values, the number of epochs and data. For the number of epochs let's do something similar: I'm just going to do it for one epoch, so we make sure everything runs smoothly and properly, and then we can do a more serious, more real training for more epochs. This is exactly the model I'm going to use, so I'm not going to edit that keyword either. Then I'm going to edit the data argument, which is going to be the absolute path to my data, so exactly the same path as I have over here, something like this, okay. And that's pretty much all; the only thing I need to do now is copy this sentence, go to a terminal, and paste it, and that's all we need to do in order to train this image classifier using yolo V8.
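Assembled with my edits, the command looks something like this (the dataset path is a placeholder, and epochs=1 is just the smoke test):

```
yolo classify train data=/path/to/weather_dataset model=yolov8n-cls.pt epochs=1 imgsz=64
```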
You can see that the training process has started and everything is running super smoothly, so everything is going well; that's a very quick and very straightforward way to do this training. You can see the training has just been completed, and this is exactly where the results have been saved: runs/classify/train13. Everything is completed, everything is ready; you can see how simple and fast it is to train an image classifier just by running this command. Now I'm going to show you another way to do this training, which is using a Google Colab: we are going to use a Jupyter notebook in Google Colab in order to train this model, and this is also a very good way to do it. Let me show you how. Basically you need to go to your Google Drive, select New, then More, then Google Colaboratory, and this is going to open a new notebook in Google Colab; this is exactly what you need to do in order to use a notebook to train yolo V8. Now I'm going to show you a notebook I have already created in order to train this model, which is called train.ipynb, and obviously I'm going to give you exactly this notebook in the GitHub repository of today's tutorial, so you can just use it if you want. I'm going to walk you through all the cells, everything that's already written in this notebook, so you understand exactly how to use it, how it works, and what you are doing at each step. Let's start with the first step. Another thing you need to do if you want to train this image classifier is to upload all the data, with all the images and all your categories, into Google Drive. For example, in my case, this is where I have my weather dataset; you can see that this directory is exactly the same directory I have over here, weather dataset, and within it there are two directories, train and val; if I open this directory you can see we also have train and val, so this is exactly the same data as on my local computer. This is very important, because you need the data in your Google Drive in order to train this model using a Google Colab; please remember to upload your data into Google Drive. Now, once your data is in Google Drive, you need to be able to access it from the Google Colab, and in order to do that you need to execute this cell. If I press enter, you can see that I'm asked whether I want to connect Google Colab with Google Drive, and the only thing I need to do is accept: you can see that it's requesting my permission, I click Connect to Google Drive, select my account, scroll down to the bottom of the page, and click Allow, and that's going to allow Google Colab to access all the data you have in your Google Drive; this is a very important step.
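For reference, the cell that connects Colab to Drive is the standard google.colab one-liner; /content/gdrive is simply the mount point I use in this notebook:

```python
from google.colab import drive

# Prompts for authorization, then exposes your Drive under /content/gdrive
drive.mount('/content/gdrive')
```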
Now, something else that's very important: you need to be able to access your data, so you need to know where it is located in Google Drive, the exact path. In my case, let me show you my Google Drive: my data is located under my root directory, which is My Drive; then I have another directory called computer vision engineer, then another directory which is image classification yolo V8, then data, and this is where my weather dataset is located. In your case it's going to be different, obviously; it depends on where exactly you have uploaded your data. Something you may want to do is use ls: you can say something like ls '/content/gdrive/My Drive', and if I execute this command you're going to see a very long list of files, which are basically all the files in my root directory in Google Drive. For example, this is where I have the directory called computer vision engineer, and if I do ls on it you're going to see all these different directories; if I go into image classification yolo V8, there's data and train.ipynb, which is exactly this notebook, and then inside data is exactly where the weather dataset is located. So do something like that, because you definitely need to know the path to your data in Google Colab in order to continue to the next step. This is very important, because if your data location is not set properly, yolo V8 will not be able to train your model. In my case, this is exactly where the weather dataset is located, this is the path to it, so this is the cell I am going to execute, and this is the value I'm going to save in this variable, data_dir. Now I'm going to continue: then we need to pip install ultralytics, which is the library we need in order to use yolo V8; the only thing you need to do is execute this cell, and everything will run smoothly; you can see we have already completed this step. Now the only thing we need to do is execute the training cell, and you can see that the code in this cell is very similar to the code we had over here: basically we are running a Python script from a Google Colab, that's all we're doing. We are importing os, we are importing YOLO from ultralytics, and then we are doing exactly the same as before, and this is where we use the data_dir variable we defined over here; this is why it's very important you set this variable properly.
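The training cell is essentially the same script as before, just pointed at the Drive path; a sketch, with an example data_dir you would replace with your own location:

```python
import os
from ultralytics import YOLO

# Example path; replace with wherever your dataset lives in your Drive
data_dir = os.path.join('/content/gdrive/My Drive', 'weather_dataset')

model = YOLO('yolov8n-cls.pt')  # pretrained classification checkpoint
model.train(data=data_dir, epochs=1, imgsz=64)  # one epoch first, as a sanity check
```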
So the only thing I'm going to do is exactly the same as before: I'm going to run this training for only one epoch, to make sure everything is okay; I press enter, and that should be it. The first time you execute this training it may take a little longer, because you are downloading the weights, the models and everything, but after that everything should be much quicker. Okay, you can see that the training process is now in progress and everything is going well; from now on, the only thing we need to do is edit the number of epochs to do a deeper training, but everything is working properly. Now let's move to the other cells, so I can show you exactly what you need to do once everything is completed. Once the training is completed, the only thing you need to do is run this cell, which copies all your results, which were saved in this directory, into your Google Drive. Remember, you are working in a Google Colab environment; if you don't do something like this, it's going to be very hard to get at your results, your model, your weights, because everything is located inside the Colab environment. Long story short, it's going to be much simpler and much better if you just copy all the results saved in this directory into your Google Drive, because from there it's much easier to download the weights, the results and so on. So now I'm just going to wait a couple of minutes until everything is completed over here, and then I can show you how to copy the results into your Google Drive. Okay, the training process has been completed, and you can see that the results have been saved into runs/classify/train, a very similar output to the one we saw when training in our local environment. Now the only thing we need to do is copy everything into our Google Drive, so everything is much simpler if we want to download these results or do whatever we want with them. The only thing I'm going to do is run this cell, and everything will be copied into this directory, the same directory where I have my data and my notebook. You can see that everything has been copied already; this is the directory I have just copied, this is the current time, so this is the result of the cell I have just executed, and if I go to runs/classify/train you can see these are all the results we have generated: the CSV file containing many different results, which I'm going to show you in a few minutes, the weights, and so on. From now on, if we want to get this data or analyze it, the only thing we need to do is select runs and click download, and it's going to download this whole directory to your local computer; everything gets zipped first, and once it's zipped the directory is downloaded. You can see the directory has just been downloaded, so everything is working just fine.
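The copy cell can be as simple as a shutil call (an equivalent !cp shell command works too); the destination path here is a placeholder matching where my notebook and data live:

```python
import shutil

# Copy the whole runs directory into Drive so the results survive the Colab session;
# the destination below is an example, use the folder where your notebook lives
shutil.copytree('/content/runs', '/content/gdrive/My Drive/image classification yolo V8/runs')
```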
And that's pretty much all about the three different ways in which you can train an image classifier using yolo V8. Now let's do the deeper training: I'm just going to take this script and edit the number of epochs, so we do this training for something like 20 epochs. I have already been doing some tests, and 20 epochs is just enough for the dataset I am using in this tutorial, so 20 will be just fine. Now the only thing we need to do is click run; I'm just going to run this script as it is, and everything will be exactly the same as before, except the process will be executed for 20 epochs; we don't need to do anything from now on except wait until it's completed, and once everything is completed we are going to validate this training process: I'm going to show you how to analyze whether this process was done successfully or not, whether you have trained a good image classifier or not. So I'm just going to pause the recording here and fast forward until this is completed. Okay, the training process has been completed, and now let me show you all the results, which were saved into runs/classify/train14. Let me show you this folder on my local computer: if I go to runs/classify and then train14, this is where all the results have been saved, and this is everything we are going to analyze now; we are going to decide if the model we have trained is a good model or not, if this is a model we can use or not. You can see that there are two files, args.yaml and results.csv, and another directory called weights. Let's start with args.yaml: if I open this file, you can see it's something like a config file, and it contains the entire configuration we have just used in order to train this model. This is very important, because it's a comprehensive list of all the hyperparameters used in the training. The only parameters we actually specified were the image size, the number of epochs, and the location of the data, and you can see we have a keyword for data, then epochs, then image size, and then many other keywords as well; for all of those, the default values were used. This matters in case we want to train a new model and change some of these hyperparameters. Now let me show you the other file, results.csv; I would say this is much more important, this is the file containing all the information we need in order to decide if this is a good model or not. You can see we have many different rows, one for each of our training epochs: we have trained this model for 20 epochs, so we have 20 rows, and for each row we have all this different information. We are going to focus on three values: the training loss, the accuracy (this is the accuracy on the validation set), and the validation loss; these are the three values we are going to focus on in this tutorial in order to validate
this model. And I'm going to give you a very quick tip, a very quick way to analyze this training process: make sure the training loss and the validation loss are going down through the training process, and also make sure the accuracy goes up. I know you're thinking, hey, this is a very simple way to analyze this process, Felipe; yeah, I agree, it is very simple, but at the same time it's very robust, a very simple but very powerful way to decide if you have a good model or not. Now, we could analyze all these raw numbers, but I think it's going to be much better, and much prettier, if we make a plot with them: we have the epochs in this column, we have all these different values, and we can definitely plot these values across the epochs. So let me show you a Python file I have created, called plot_metrics, which does exactly that: if I open this file, you can see you basically need to set the path to your results.csv file, in our case runs/classify/train14 and then results.csv, and then there is some very simple logic to take all the data from the results.csv file and make some plots with it; that's all we are doing, taking the data and plotting it. This file will be available in the GitHub repository of this tutorial, so you can definitely take it and use it to plot your own results as well. All I'm going to do now is press play, and if we wait only a few seconds we get these two plots, which summarize all the information in our CSV file. This is exactly what I mean by make sure your loss is going down: this is the loss on the training set, plotted in blue, and on the validation set, in red, and you can see that in both cases the loss is going down, which is exactly what we expect, exactly what we want. It's a very simple way to analyze this process, but trust me, it's also very powerful; a training that looks like this is very healthy. Then, for the other plot, which shows how the validation accuracy evolves through the training process, you can see that the validation accuracy goes up as we increase the number of epochs; starting from the 10th epoch or so everything starts to level off, we are not really gaining much accuracy from there, but we are not losing accuracy either, we are on something like a plateau. This is exactly how a validation accuracy plot should look: we start from a very low value and keep increasing until we reach a high accuracy; this is a very healthy training process.
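For reference, here is a minimal sketch of what a script like plot_metrics can look like; I'm assuming the column names ultralytics writes for classification runs (epoch, train/loss, val/loss, metrics/accuracy_top1), which can vary slightly between versions, hence the strip():

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('./runs/classify/train14/results.csv')
df.columns = df.columns.str.strip()  # ultralytics pads some column names with spaces

# Training and validation loss: both should trend down
plt.figure()
plt.plot(df['epoch'], df['train/loss'], 'b', label='train loss')
plt.plot(df['epoch'], df['val/loss'], 'r', label='val loss')
plt.legend()

# Validation accuracy: should trend up and then plateau
plt.figure()
plt.plot(df['epoch'], df['metrics/accuracy_top1'], label='val accuracy (top-1)')
plt.legend()
plt.show()
```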
Now, obviously we could make this process even better if we tuned some of these hyperparameters and did a more customized training; remember, we are using all the default values, so, as it usually goes, if we do a more customized training and try different parameters we should be able to get a better model. But we're not going to do that in this tutorial, because I just wanted to show you the end-to-end of how to train this image classifier; just remember you could do even better than this with a more custom training. So that's pretty much all for analyzing these plots, the validation accuracy and the loss functions, in order to validate your training. Then there's this directory, the weights directory, and this is exactly where the models are saved. This is very important, because you have trained a model and now you obviously want to use it on your images, on your data, and this is exactly where you are going to find it. You can see you have two different files: one of them is called last.pt and the other one best.pt. Now let me explain exactly what these two files are and what they mean. Remember how this training process works: you have a deep learning model which comprises many different weights, and at the end of every epoch, the first epoch, the second epoch, the third epoch and so on, you are updating the weights of your model, of your architecture. So the way it works is that at the end of every epoch you have a model available, the model you have trained so far with all the process you have followed so far. last.pt means you are taking the model which was the result of the last epoch of your training process; remember, at the end of absolutely every single epoch you have a model available which you could use to produce your inferences, so last.pt simply means the model produced at the end of your training process, at the end of the last epoch; in our case, at the end of the 20th epoch. But you may wonder: hey Felipe, that's great, at the end of our training process our accuracy is something like 93%, which is a very good accuracy, but if we take the model at the end of the 16th epoch, for example, our accuracy is higher, it's 94.9%; maybe it makes more sense to take that model instead, because we get an even higher accuracy. And if you ask me something like that, I would say yeah, you're perfectly right, that's a very valid argument, and that's exactly what the best.pt model is: we are saving the weights of the best model in our entire training process. If we look at our data, the best model in our training process, if I'm not mistaken, is the one we produced at the end of the 16th epoch, where our validation accuracy was 94.9%; this is definitely higher than the accuracy we got at the end of the training process, which was 93.5%, and if we want to take the best model we have produced in the entire
training process, then we would definitely need to take this model; that's exactly what best.pt represents, the best model you have trained in your training process. If you ask me, what I usually do is take the model which was produced at the end of the training process, the last.pt file, because I consider that the model produced at the end of the training summarizes much more information: we are considering much more data, much more of everything that went on in this training process, and remember, there's a lot of randomness in training. So I personally consider that taking the model trained at the end of the process is a better option than choosing another one, even the one which got the highest accuracy but is not the last model. That's what I usually do; but if you want to take the best model, best.pt, it also makes sense, because you are taking the model which produced the highest accuracy. So you can do either one of them, and that's exactly why you have these two files. Making the best decision on which model to use depends on many different variables: on your data, your problem, your use case, your training process, and many other things. So remember you have these two models, and it's up to your specific project and your preferences which one you use: the best model produced through the entire training process, or the last model, produced at the end of it. So now let's go back to PyCharm, because now it's time to make our inferences, to predict new samples: we are going to input an image and use our image classifier to predict which category this image belongs to. Let me show you how to do it. I'm going to import, from ultralytics import YOLO, and then let's go back to this page, because now we are going to move to the predict section, and the only thing I'm going to do is copy this sentence... I'm
going to paste it here, and then I'm going to specify the path to the model we have trained; it doesn't really need to be the absolute path, we can use the relative path, so I'm going to do something like this. This is the path to the model we have just trained, the last model produced at the end of the training process, and this is the model I'm going to use to show you how this works. Now let's copy this additional sentence, which is results = model with the image path as argument; you can see that you can use an image on your local computer, in your file system, or you can also use something like a URL. For example, in this case, in this example on the yolo V8 website, the example uses a URL, and that would also work. In my case I'm going to use an image on my local computer, one of the images I used for training, because I only want to show you how this works, but obviously you can use whatever image you want. This is the image I am going to run inference on: the first image in my sunrise category, so this is going to be something like sunrise1.jpg, and that's pretty much all; these are the results. The first thing I'm going to do is run this code and see what happens; everything should run smoothly, but this is where we would see if we had an error. We may need to wait a couple of seconds... and everything seems to be working fine, because we didn't get an error. What I'm going to do now is print results, because I want to show you a couple of things. This is the entire information we get when we print results; you can see it's a lot of information. We have these probabilities, which are the inferences we are making, the result of applying our image classifier, and then we have another object, another result, which is very important: the names of the categories we have trained our image classifier on. You can see it's cloudy, rain, shine, sunrise, and each category has a different integer value, so this is something like a dictionary: we are going to get a result from applying our image classifier, and with this result, which is going to be an integer, we are going to index this dictionary, because we want to know the name of the category we have just inferred. So this is how we're going to do it: I'm going to define another variable, a names dictionary, names_dict, and this is results[0].names; results is a list, and in this case we only want the first element, because we are only predicting a single image. Then I'm going to define another variable, probs, which is results[0].probs, and this is the probability vector over all the categories we are trying to classify: we are going to have a length-4 array with the probabilities of the different classes. So let me show you how probs looks; I'm going to
print probs, and I'm going to do something else: I'm going to call tolist, so we turn this object into a plain list. We are using yolo, which is based on pytorch, so if we don't call this method we would be working with a torch object, with a tensor, and we don't really want that; that's why I'm calling tolist. Now I'm going to print probs so I can show you how it looks and how to continue from here. Okay, you can see the result we got from printing probs: it's a list with four elements, one, two, three and four, and each one of these elements is the probability of this image belonging to one of the categories. Let's print the names too, so we have all the information on screen, because I want to show you something; sorry, this wasn't names, it was names_dict. Now let's wait a couple of seconds; I want to show you not only the probabilities but also the class names, so it's a little clearer what I'm about to do. This means that this first number is the probability of this image being cloudy, this other number is the probability of it being rain, this one the probability of shine, and this last number the probability of sunrise. And you can see from the values that we are definitely classifying this image as sunrise, because this value is almost a one; it's an absolutely confident classification. That's the category we are predicting for this image, and this is how to make sense of this information. So what I'm going to do now is print names_dict indexed by np.argmax of the probability list I just showed you, and obviously I need to import numpy as np, otherwise it's not going to work. Basically what we are doing here is looking at the list containing all four probabilities, taking the maximum value, and taking the index of that maximum value: this element is index 0, this is one, this is two, and this is three, so from calling np.argmax(probs) we get three, and then we look up the third element of the names_dict object; we go here and see that 3 corresponds to the sunrise category. And if we look at this image again, we see that we are in fact looking at a sunrise; let me show you. So everything seems to be working fine, and this is going to be all for today: this is exactly how you can train an image classifier using yolo V8 on your own custom data, and this is going to be all for this tutorial.
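Putting the whole prediction snippet together, it looks something like this; the paths are examples from this tutorial's run, and note that depending on your ultralytics version probs may be a raw tensor or a Probs object (see the comment):

```python
import numpy as np
from ultralytics import YOLO

# Load the classifier from the training run (relative path to the last checkpoint)
model = YOLO('./runs/classify/train14/weights/last.pt')

results = model('./weather_dataset/train/sunrise/sunrise1.jpg')  # example image path

names_dict = results[0].names  # e.g. {0: 'cloudy', 1: 'rain', 2: 'shine', 3: 'sunrise'}
probs = results[0].probs.tolist()  # on newer ultralytics versions: results[0].probs.data.tolist()

# The predicted category is the one with the highest probability
print(names_dict[np.argmax(probs)])
```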
So, in my previous videos I showed you how to train an image classifier and an object detector using yolo V8; now it's time for semantic segmentation. I'm going to show you the entire process of how to train a semantic segmentation algorithm using yolo V8: how to annotate the data, how to train the model in your local environment and also from a Google Colab, and finally a super comprehensive guide on how to validate the model you trained. My name is Felipe, welcome to my channel, and now let's get started. Let's start with today's tutorial, and the first thing I'm going to do is show you the data we are going to be using. This is a dataset I have prepared for today's tutorial, and you can see that these are images of ducks; we are going to be using a duck dataset today, and this is exactly how the images look. Now, for each one of our images, for absolutely every single one, we are going to have a binary mask: a binary image where absolutely every single pixel is either white or black, and all the white pixels are the location of the objects we are interested in, in this case our ducks. Let me show you an example, so it's a little clearer what I mean. This is a random image in my dataset, a random image of a duck, and this is exactly its binary mask; take a look at what happens when I align these two images and apply something like a transparency: you can see that the binary mask gives us the exact location of the duck in this image. This is exactly what it means that the white pixels are the location of our objects. So this is the data I am going to be using in this tutorial, and now let me show you where I downloaded this dataset from: this is a dataset I found in the open images dataset version 7. Let me show you this dataset very quickly; it's an amazing dataset you can use for many different computer vision related tasks. For example, if I go to segmentation, you can see that we have many different categories; right now we are looking at a random category, phones, so this is for example a semantic segmentation dataset of phones. And if I go here and scroll down, you can see that one of the categories is duck; this is the exact same data I am going to be using in this tutorial, the exact same duck dataset I am going to use in order to train a semantic segmentation algorithm with yolo V8. Obviously you could download the exact same data: if you go to open images dataset version 7 you can download the same duck dataset I am going to be using today, or you can download a dataset of other categories. So that's about the data I am going to use in this project, and where you can download the exact same data if you want to. Now let me show you a website you can use in order to annotate your data. In my case I have downloaded a dataset which is already annotated, so I don't have to annotate any of my images; absolutely all of them already have their binary masks. But if you're building your dataset from scratch, chances are you will need to annotate your images, so let me show you this tool, which is going to be super useful if you need to annotate your data. It's called cvat, and you can find it at cvat.ai; it's a very popular computer vision annotation tool, I have used it I don't know how many times in my projects, and it's very useful, so I'm going to show you how to use it to annotate your images. The first thing we need to do is go to start using cvat; this is going to ask you to either register, if you don't have a user already, or to log in. I already have a user, so I'm already logged into my account, and now let me show you what I'm
going to do in order to annotate a few images. Actually, I'm going to annotate only one image, because I only want to show you how to use cvat to create a binary mask for your project; you only need to see the process once, and that's going to be all. I'm going to Projects, then the plus button, Create a new project. The name of this project will be duck sem seg, and it will contain only one label, which is duck, so I press Continue, and that's pretty much all: Submit and open. Now I'm going to create a task: Create new task; the task name will be duck task 01, it doesn't really matter, I just picked a random name. Then I'm going to add an image; I'm just going to select this one, since I'm only going to annotate one image, and that will be enough; then Submit and open. This is going to take a couple of seconds; this is where you would select all the images you want to annotate, but in my case I'm only selecting one. I'm going to click here on Job, and this is going to open the annotation job. Now I'm going to show you how to annotate this image, how to create a binary mask for it: you need to go here, to draw new polygon, then shape, and I'm going to start over here; this is pretty much all we need to do in order to create the binary mask for this image. You can see that I'm just trying to follow the contour of this object, and you may notice that the contour I am following is not perfect; obviously it's not perfect, and it doesn't have to be. If you are creating the mask of an object, it definitely doesn't need to be pixel-wise perfect; you need to make a good mask, obviously, but something like what I am doing right now will be more than enough. This is a very time-consuming process, as you can see, and this is why I selected only one image: doing many images would take me a lot of time, and it doesn't make sense, because the idea is only for you to see how to annotate. So you can see I'm following the contour... okay, and this is an interesting part, because we have reached the duck's foot, or its leg, or whatever this part of the duck's body is called, and it's beneath the water, below the surface. This is where you're going to ask yourself: do I need to annotate this part or not? Do I annotate it as part of the duck? You could say it's definitely part of the duck, but you are not really seeing much of the object, it's sort of part of the water as well. In my case I'm going to annotate it, but you could go either way; in all of those parts where you are not 100% convinced, it's a judgment call, it's up to you. Annotating a few images yourself is always a good practice, because you are going to run into many different situations like this, as I just did with this part of the duck, which now I am super curious about:
what is it called? If you know the name of this part of the duck's body, please let me know in the comments below; I think it's something like a webbed foot they have over there, but if it has another name, let me know in the comments. Now let's continue; you can see I'm almost there, I have almost completed the mask of this duck, now I only have to complete the beak, or whatever it's called; it seems I don't really know much about duck anatomy either. Anyway, I have now completed the polygon, and once I'm done I press Shift+N, and that's going to be all: this is the binary mask I have generated for this object, for this duck, and what I have to do now is click Save. You can see this is definitely not a pixel-wise perfect mask, because there are some parts of this duck which are not within the mask, but it doesn't matter; make it as good as possible, but if it's not 100% perfect it's not the end of the world, nothing happens. I have already saved this image, and what I need to do now is download this data, so I can show you how to download the annotations you have just created in order to build your dataset. I'm going to select this option over here, Export task dataset, then select the format Segmentation mask 1.1, and click OK. We only need to wait a couple of minutes, and that's pretty much all, the data has been downloaded. Now I'm going to open this file, and the images you are interested in are going to be here; in my case I only have one image, but this is where you're going to have many images. And please mind the color your masks come in: in my case I downloaded this image and the mask is in red; it doesn't really matter, just be aware it could be something different than white. Once you have all your masks, what you need to do is create a directory; let me show you how I do it. I'm going to create a temporary directory, which I'm going to call tmp, and inside it two directories: one of them is going to be called masks, and the other one labels, and you're going to see why in only a minute; this is where I'm going to put the mask. Then I'm going to PyCharm, because I have created a Python script which takes care of a very important step. We have created masks, which are binary images, and that's perfect, because that's exactly the information we need in order to train a semantic segmentation algorithm; but the way yolo V8 works, we need to convert each binary image into a different type of file. We are going to keep exactly the same information, but we are going to convert the image into another type of file. Let me show you how: this is a Python file I have created to take care of this process, and the only thing you need to do is edit two fields: the directory containing all the masks you have generated, and the output directory.
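The actual conversion script will be in the repository, but here is a minimal sketch of the idea, assuming single-class binary masks: threshold each mask, extract its contours with OpenCV, normalize the coordinates, and write them out in YOLO's segmentation label format (a class index followed by x y pairs, one polygon per line):

```python
import os
import cv2

input_dir = './tmp/masks'    # binary masks exported from cvat
output_dir = './tmp/labels'  # YOLO-format .txt label files go here

for name in os.listdir(input_dir):
    mask = cv2.imread(os.path.join(input_dir, name), cv2.IMREAD_GRAYSCALE)
    _, mask = cv2.threshold(mask, 1, 255, cv2.THRESH_BINARY)  # any non-black pixel counts
    h, w = mask.shape
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    lines = []
    for contour in contours:
        if len(contour) < 3:  # a polygon needs at least three points
            continue
        # class index 0 (duck), then the polygon as normalized x y coordinates
        coords = [f'{x / w:.6f} {y / h:.6f}' for [[x, y]] in contour]
        lines.append('0 ' + ' '.join(coords))

    txt_name = os.path.splitext(name)[0] + '.txt'
    with open(os.path.join(output_dir, txt_name), 'w') as f:
        f.write('\n'.join(lines))
```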
In my case these two variables are already set correctly: this is the tmp directory I just created, where I placed the mask I generated with CVAT, and this is my output directory. Now take a look at what happens when I press play. The script has been executed and everything is okay: this is the mask I used as input, and this is the file that was generated from it. It looks absolutely crazy — a lot of numbers, a very strange thing — but without going into the details, let's just say it contains exactly the same information we had in the mask, only in a different format; let's keep that idea: exactly the same information, in a different format, and that's exactly the format YOLOv8 needs in order to train the semantic segmentation model. So this is exactly what you need to do: once you have created all of your masks, download them to your computer and execute this script to convert your images into these files; and obviously this script will be available in the GitHub repository of today's tutorial. That's pretty much all about creating your annotations, downloading them, and formatting everything the way you should. Now let me show you how you need to structure your file system so it complies with YOLOv8 — this is something we have already done in our previous YOLOv8 tutorials. Once you have your data, you need to structure it so YOLOv8 finds exactly where everything is located: your images in one directory, your annotations — your labels — in another, so everything is just the way YOLOv8 expects it to be. Let me show you: I have a root directory called data. Within data I have three directories, but the masks directory is not really needed; it's just there because that's the way I got my data, and to make it clear that it's not needed for this part of the process, I'm going to delete it right now. It's gone. Now we only have the two directories we need: one called images and the other called labels. Within images we have two more directories, one called train and the other called val: train is where we have all of our training data, all the images YOLOv8 is going to use in order to train the semantic segmentation model, and val contains the images we're going to use in order to validate the model. So remember: you need two directories, and the names are very important — one must be called train and the other must be called val.
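Before going on to the labels directory, here is the whole layout we are building, sketched out (the name of the root directory is up to you; the names inside it are not):

data/
  images/
    train/   <- all the training images
    val/     <- all the validation images
  labels/
    train/   <- one annotation file per training image
    val/     <- one annotation file per validation image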
Now, going back up a level, we have the two directories, images and labels, and if I go within labels you can see there are two directories as well, also named train and val; opening them, these are the files I generated with the exact same script I showed you a few minutes ago. So within labels, train contains all the annotations generated from the training data, from the training masks, and val the ones for the validation data. Long story short: we have our root directory; within it, two directories, images and labels; within images, train and val, holding all of our images; and within labels, exactly the same structure, train and val, holding all of our annotations. That's exactly the structure you need for your data — please remember to set up your file system like this, otherwise you may have issues when trying to train a semantic segmentation model with YOLOv8. So that's all about structuring the data, and now let's move to the interesting part, the most fun part, which is training the semantic segmentation model. Let's go to PyCharm and I will show you how to train it from your local environment. This is a PyCharm project I created for today's tutorial; please remember to install the project's requirements, otherwise you will not be able to use YOLOv8. Now let's go to train.py, the Python script where we're going to do all the coding needed to train the model, and let's go back to the YOLOv8 official repository to see exactly how to use this model. I go to the segmentation section and click on the segmentation docs. This is going to be very straightforward: I copy the sentence that loads a pretrained model, paste it into PyCharm, and add from ultralytics import YOLO; then I also copy the model.train sentence. I'm going to change the number of epochs to one, because it's always very healthy, always a very good idea, to do a dummy training first — train the model for only one epoch to make sure everything is okay, that everything runs smoothly — and only then do a deeper training. I'm also going to change the config file: I'll use this config.yaml I have over here, and obviously you will find it in the repository of today's video. Long story short, you can see the config has many different keywords, but the only one you need to edit is this one: the absolute path to your data. In my case, if I copy and paste this path over here, you can see it's the directory that contains the images and the labels directories. So just remember to edit this path to the location of your data.
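Just so you can see it all in one place, the entire train.py amounts to something like this — a minimal sketch; yolov8n-seg.pt is the nano segmentation checkpoint from the Ultralytics docs, and I'm assuming config.yaml sits next to the script:

from ultralytics import YOLO

# load a pretrained YOLOv8 segmentation model (nano variant)
model = YOLO('yolov8n-seg.pt')

# dummy run first: a single epoch, just to verify everything works end to end
model.train(data='config.yaml', epochs=1)

And the config.yaml itself is only a few lines; the path below is a placeholder for wherever your root data directory lives:

path: /home/user/data   # absolute path to the root directory (placeholder)
train: images/train
val: images/val

names:
  0: duck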
If you have structured everything the way I showed you in this video, then everything else will be just fine: the train and val keywords are good as they are, you can leave them untouched, but please remember to edit the field with the location of your data. Now, going back to train.py, this is pretty much all we need to do in order to train the semantic segmentation model, so I'm just going to press play and see what happens. You can see everything is going well, we are training our model, but it's taking forever: even though we are only training for one epoch, everything is going to take a lot of time. So instead I'm going to press stop — I'm not stopping because I had an error or anything, everything is going well — and I'm going to repeat exactly the same process from a Jupyter notebook in Google Colab, because there I have access to a free GPU, which makes the process much, much faster. I'm going to use Google Colab to train this model, and I recommend you do the same, so let me show you how to do it from the Colab environment. Before doing anything in Google Colab, please remember to upload your data, otherwise it's not going to work: for example, here you can see I have many directories in my Google Drive, one of them is data, and within data there are labels and images — exactly the same directories I have over here locally. So I have already uploaded my data into my Google Drive; please remember to do it too, otherwise you won't be able to follow everything we're about to do. You also need to upload the config.yaml file, the same file I showed you on my local computer; the only thing you will need to edit is the path, because now you need to specify the location of your data in Google Drive, and I'm going to show you exactly how to find it. Now let's move to the Jupyter notebook — obviously I'm going to give you this notebook, it will be in the GitHub repository of today's tutorial, so you can just use it — and I'm going to show you how to execute every single cell and exactly what each of them means, what you're doing in every cell. The first thing is connecting the Colab environment with Google Drive, because we need to access data from Drive, so we definitely need to allow Colab to access it: I select my account, scroll all the way down, press allow, wait a couple of seconds, and that's all. Next I define the variable data_dir, which is the location of my data in Google Drive; please mind the way this path is structured: the first part is content, then gdrive, then MyDrive, and then the relative path to my data.
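For reference, these first notebook cells are roughly the following (the directory name under MyDrive is just my own; yours will be whatever you created):

from google.colab import drive

# connect the Colab environment to Google Drive
drive.mount('/content/gdrive')

# location of the dataset in Drive; adjust the relative part to your own folders
data_dir = '/content/gdrive/MyDrive/image-segmentation-yolov8/data'

# if you are not sure where your data ended up, list your Drive's root
# from a cell and navigate from there:
# !ls /content/gdrive/MyDrive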
If you're not completely sure where you uploaded your data, what you can do is run an ls, like I'm doing right now: it lists all the files in the root directory of your Google Drive, and from there you just navigate to the directory where you uploaded your data. In my case it's MyDrive, then computer vision engineer, image segmentation yolo V8, and then data — that's exactly where my data is located in Google Drive, and it's exactly what I have over here. So once you have located your data, the only thing you need to do is edit this cell and press enter, and everything is ready. Now I'm going to install ultralytics so I can use YOLOv8 from the notebook; this takes a few seconds but will be ready in no time. Something else you need to do in Colab is go to Runtime, then Change runtime type, and make sure it says GPU: if you are not using Colab with a GPU, everything is pretty much pointless, so remember to check it before you start this process, because otherwise you will need to run absolutely everything again. Let's continue: I have already installed ultralytics, and now I'm going to run this cell — and if you notice, it's exactly the same code I have in my local environment: I'm just defining a model and then training it, also specifying the location of my config file. So I press enter and run a training — actually, a training for 10 epochs. This is also going to take some time: even though we're using a GPU, it will take a few minutes, so I'm going to pause my recording here and fast forward this video until the process is completed. Okay, the training process is now completed, we have trained this model, everything is fine, and you can see the results have been saved under runs/segment/train2.
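The training cell itself, then, is essentially this sketch — the config path is a placeholder, and the commented line at the end is the kind of command a later cell uses to copy the results into Drive; treat the destination path as an assumption:

!pip install ultralytics

from ultralytics import YOLO

model = YOLO('yolov8n-seg.pt')
model.train(data='/content/gdrive/MyDrive/image-segmentation-yolov8/config.yaml', epochs=10)

# later on, copy the results into Drive so they can be downloaded:
# !cp -r /content/runs /content/gdrive/MyDrive/image-segmentation-yolov8/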
So the only thing we need to do now is get the results of this training: the weights, the metrics, the plots — because what we need to do now is analyze the training process, validate that everything is fine. The easiest way to get all this information is to run this command, which copies all the content of the directory where the results were saved into your Google Drive. Remember to edit this path, this location, because you want everything copied into a directory in your own Drive — just make sure the location makes sense, then execute the cell and everything will be copied over. Now let me show you my Google Drive: I have already executed the cell, so everything is there. This is the runs directory that was created when I ran that cell, and under the other directory, called segment, we have train2 — these are all of our results, the results we are now going to analyze. So what I'm going to do now is download this directory, and once it's on our local computer we'll take a look at all the plots, at all the metrics, and I'll tell you exactly what I usually do in order to validate a training process. Everything is now downloaded, everything is completed, so let's take a look at these files; I'm just going to copy everything onto my desktop — I need to do some cleaning, by the way. These are all the results we got from this training process, and you can see it's a lot of information, definitely a lot: many different files, the weights over here, many plots. So let me give you my recommendation about how to do this evaluation, this validation, from all these plots and results. I would recommend you focus on two things: one of them is this plot with all of these metrics, and then I'm also going to show you how to take a look at the predictions on these images. But for now, let's start here: these are a lot of metrics, and you can definitely knock yourself out analyzing all the information you have here — you can go crazy analyzing all of these plots — but I'm going to show you a very simple and very straightforward way to do this analysis, this validation. It's something I have already mentioned in my previous videos on YOLOv8, on how to train a model and how to validate it: take a look at what happens with the loss function, with all the plots related to the loss. And as this is a semantic segmentation type of algorithm, I would say look at the segmentation loss — take a look at what happens with the training loss and the validation loss — and, long story short, just make sure the loss function goes down. If your loss function is going down, it's likely things are going well;
it's not a guarantee — maybe things are not really going that well and the model doesn't perform that well; it may happen — but I would say that if the loss function is going down, it's a very good sign. If, on the contrary, your loss function is going up, I would say you have a very, very serious problem: there is something seriously wrong with your training process, with your data, or with your annotations — I'm talking about something amazingly, seriously wrong. If your loss function is going up, I don't know exactly what's going on, but something is going on; do you see what I mean? So a loss function that goes down is not a guarantee of success — you may have a situation where you haven't trained a good model and your loss goes down anyway — but at the very least your training loss and your validation loss should go down, and I'm talking about a trend of going down: for example, here we have a few epochs in which the loss goes up, and that's okay, that's not a problem; we are looking for a downward trend, and that's exactly what we have in this situation. So, long story short, that's my recommendation on how to do this analysis of all the metrics we have over here: for now, focus on these two losses and make sure they are going down. Then, in order to continue with this validation, we're going to take a look at what happens with our predictions — how this model performs on some data — and for that we'll look at these images. You can see these are some batches: first, some of our labels, our annotations, for these images, and then some of the predictions for the same images. For example, let me show you these results, the first image: looking at this image — and again, these are not our predictions, this is our data, these are our annotations, our labels — you can see there are many, many missing annotations. In this image we only have the mask for one of the ducks: there are one, two, three, four, five ducks, but only one of them is annotated. We have a similar behavior here — only one of the ducks is annotated — here it's something similar, and the same happens for absolutely every single one of these images: there are a lot of missing annotations in the data we are currently looking at. And if I now look at the predictions — these are the same images, but with the model's output — we can see that, despite all the missing annotations, the predictions don't really look that bad: for example, in this case we are detecting three of the five ducks, so we have an even better prediction than the annotation we had over here. I would say it's not a perfect detection, not 100% accurate, but it's very good — and I would say it's definitely better than the data we used to train this model. So that's what happens with the first image,
and if I take a look at the other images I see a similar behavior: this is the data we used for training this algorithm, these are the predictions we got for these images, and so on — exactly the same behavior, exactly the same situation for this image as well. So my conclusion from looking at these predictions is that the model is not perfect, but I would say it performs very well, especially considering that the data we used to train it seems to be far from perfect, with a lot of missing detections, a lot of missing objects. And that's another reason I don't recommend going crazy analyzing these plots: remember that when you analyze them, the only thing you're doing is comparing your data — the data you used to train the model — with your predictions. If the thing you're comparing against has many missing annotations, many missing objects, or many different errors, then that comparison becomes a little meaningless: it doesn't really make a lot of sense to compare one thing against another when the reference itself has a lot of errors. That's why the plots will give you a lot of information, but you are going to get even more information by analyzing results like these. And this is a very, very good example of what happens in real life when you are training a model in a real project, because remember that building an entire dataset that is 100% clean and absolutely perfect is very, very expensive; usually the data you use to train the model, to train the algorithm, has a few errors, and sometimes it has many, many errors. So this is a very good example of what this validation process looks like with data that is very similar to the data we have in real life, which in most cases is not perfect. My conclusion from this evaluation, from this validation, would be: take a look at what's going on with the data, and make improving the data the next step. And looking at these results, one of the ways I could improve this data is by using the predictions I'm getting instead of the annotations I used to train this model — do you see what I mean? If the predictions we are getting are even better than the annotations, maybe our next step is to use those predictions in order to train a new model. So by analyzing all of these results you are going to make decisions on how to move forward, on how to continue, and this is a very good example of how this process looks in a real project — this is pretty much how it works when you are working in a company, or
when you're a freelancer delivering a project for a client: this is pretty much what happens — there are errors, things happen, and you need to make a decision given all the information you get from this analysis. So that's going to be all for this very simple and straightforward way to validate the training process, to draw some conclusions about what's going on, and to make some decisions about how to move this project, this training process, forward. Now let me show you something else within this directory: the weights folder. This is where your weights will be located — after all, if you are training a model, it's because you want a model to make predictions, to make inferences — and this is where your models will be saved. As I've already mentioned in one of my previous YOLOv8 videos, you will have two models: one called last.pt and another called best.pt. The way it works is this: remember that when you are training a model, you are updating your weights at the end of absolutely every single epoch, so at the end of every epoch you already have a model available which you can use if you want to. last.pt means you are getting the last model, the one you got at the end of your training process — in this case I trained the network for 10 epochs, if I remember correctly, so this is the model we got at the end of the tenth epoch. And best.pt means you are getting the best model trained during the entire training process. If I show you the metrics again, you can see we have many metrics related to the loss function and other metrics related to accuracy, to how the model performs; the way YOLOv8 decides which is the best model, in this semantic segmentation type of problem, may be related to the loss function — taking the model with the minimum loss — or it may be related to one of the performance metrics, for example taking the model with the maximum precision, or the maximum recall, or something like that. I'm not 100% sure, I should look at the documentation, but the way it usually goes is: last.pt is the last model of your training process, and best.pt is your best model, decided under some criterion. What I usually do is take last.pt, because I consider that the last model summarizes more information: through the entire training process we are taking much more data, doing many different things, so by taking the last model you are summarizing way more information — that's the way I see it, so I usually take last.pt. And that's pretty much all for the validation, for what validating this model looks like; now let's move to prediction and see how we can use this model to make inferences. Let's go to PyCharm.
In the PyCharm project of today's tutorial, this is a Python script I created in order to do these predictions; the file is called predict.py, and this is what we're going to do. I start with from ultralytics import YOLO, and then I define the model path — the model we're going to use, which in our case is last.pt from these results, from this directory — so I specify something like .../last.pt. Then let's define an image path, the image we're going to use to get our inferences: I'll take one from the validation set, just a random image, something like this one, and copy and paste its path here. Now I'm going to import cv2 as well, because I'm going to open, to read, this image, and then I'm going to get its shape: this will be image, and then image.shape. Now the only thing we need to do is load our model by calling YOLO with the model path, and then get the results by calling the model on our image — and that's it, that's all we need to do to get our inferences. But now let's do something else: I'm going to iterate, for result in results, and then take a look at the masks, iterating over result.masks.data as j, mask. Each mask is mask.numpy() times 255, and then I resize it back to the size of the original image: I pass the mask to cv2.resize, and if I'm not mistaken the order of the size arguments is W and then H — that's just the way it works. Then the only thing left is to call cv2.imwrite to save it; the name will be something like output.png — this is only a test, we don't need to go crazy with the name — and this will be our mask. That's pretty much all, so let's press play and see if everything is okay or if we have some error. Okay, I did get an error, and it's because I forgot the enumerate — we're not actually using j, so I could just iterate over the masks, but let's do it like this. Now everything ran smoothly, everything is okay, we didn't get any error, and if I go back to this folder, to this directory, I can see the output we got. Now, to make absolutely, 100% sure this is a good mask, a good prediction, I'm going to make an overlay — I'm very excited, I don't know if you can tell. I take this output image over here and the original image, align the two images together, apply a transparency, and let's see what happens: we may not get a 100% perfect mask, but it's a very, very good one, especially considering the errors we detected in our data.
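Putting the whole predict.py together, it comes out roughly like this — the paths are placeholders for your own weights and image, and the commented overlay at the end reproduces in code, with cv2.addWeighted, the transparency check I just did by hand:

from ultralytics import YOLO
import cv2

model_path = './runs/segment/train2/weights/last.pt'  # path to the trained weights (placeholder)
image_path = './data/images/val/duck.jpg'             # a validation image (placeholder)

image = cv2.imread(image_path)
H, W, _ = image.shape

model = YOLO(model_path)
results = model(image)

for result in results:
    for j, mask in enumerate(result.masks.data):
        # each mask comes out as a tensor in [0, 1]; scale it to a visible image
        mask = (mask.numpy() * 255).astype('uint8')
        # resize the prediction back to the original image size (width first, then height)
        mask = cv2.resize(mask, (W, H))
        cv2.imwrite(f'./output_{j}.png', mask)

# optional sanity check, blending the original image with the first mask:
# overlay = cv2.addWeighted(image, 0.7, cv2.imread('./output_0.png'), 0.3, 0)
# cv2.imwrite('./overlay.png', overlay)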
So this is amazing — a very good detection, a very good result — and that's going to be all for this tutorial: this is exactly how you can train a semantic segmentation model using YOLOv8, the entire process, from how to annotate the data, to how to train the model, to how to validate it and then make some predictions. So that's all for today, and this is exactly what you will be able to do with today's tutorial. In this video we're going to work with pose detection using YOLOv8, and I'm going to show you the entire process: how to annotate your custom data for free using a computer vision annotation tool, how to prepare your data and your file system for training this pose detector, how to do the training on your local computer and also from Google Colab, and how to do a super comprehensive evaluation of the model you trained. This is a much more complex problem: in my previous tutorials I showed you how to train an image classifier using YOLOv8, an object detector, and an image segmentation model, and I would say that today's model, this keypoint detector, is much more complex than everything we did before. This is going to be an amazing tutorial — my name is Felipe, welcome to my channel, and now let's get started. First, let me show you the data we are going to use in this tutorial: we're going to use the AWA pose dataset, and let me show you exactly how this data looks. These are pictures of many different animals: currently we are looking at antelopes, many different pictures of antelopes, and if I scroll down in this directory you'll see other animals — for example, here is a bobcat, which is some sort of feline, some sort of cat, with many different pictures of it; scrolling down a little more, we also have buffaloes; and if I continue scrolling down, other animals — for example, here I have a Chihuahua. You get the idea: we have pictures of many, many different animals, and all of them are quadrupeds, because this is a quadruped keypoint detection dataset. Now let me show you the key points we are going to detect for each one of these animals: there are 39 key points in total, which is a lot, and we are detecting many different parts — for example the nose, the eyes, the jaw, the tail, the legs, and also the ears and the horns, or antlers, or whatever they're called. So this is exactly the data we're going to use today; I thought it was a very cool dataset for pose detection. Now let's continue: I'm going to show you the entire process of training a pose detector using YOLOv8 on your custom data. In my case, the data I'm using in this tutorial is already annotated — I already have the annotations — but if you are training this pose detector on your own custom data, most likely you will need to annotate it yourself, so I'm going to show you how to do the entire annotation process using CVAT, which is a very
popular and very good annotation tool for computer vision. Let me show you how to do it: I go to cvat.ai, CVAT's website, and click where it says start using CVAT. I'm going to show you how to create a project, how to create a task, and how to do all the annotation. I go to Projects, click the plus button, then create new project; this is going to be quadruped keypoint detection, which is exactly what we'll be doing. Then add label, and I add quadruped, continue, and that's pretty much all: submit and open. This is where you would add absolutely all the labels you have in your custom data; in my case I only have one label, quadruped. Now let's continue: I'm going to create a task — create new task — and the name will be something like quadruped keypoint detection task 001. Then I add an image — I'm going to show you how to annotate this data with only one image, so I select just the first one — and click submit and continue. We have to wait a couple of minutes until the data is uploaded to the server; once everything is completed, we go to Tasks — this is our project and this is the task we have just created — and I click open. That's pretty much all; clicking here opens the task, and now we need to start our annotation process. You need to click where it says draw new points, and you need to select the number of points you are going to annotate: in my case I'm going to annotate 39 points, but select as many points as you will annotate. Then I click shape, and we can start the annotation. Something that's very, very important: when you annotate your key points, you need to follow a given order — always the same order. If I show you this image again, you can see we have the locations of all the key points, but we don't really have any information regarding their order, and that's crucial: you cannot follow a random order, you need to follow always the same order when annotating your data. This, for example, is the order I am going to follow in this tutorial: the first key point I'll annotate is the nose, then the upper jaw, then the lower jaw, then mouth end right, and so on — you need to fix a given order for your data. So now I start the annotation: the first point is the nose, which I set over here; then the next one is the upper jaw, something like this; the lower jaw, here; mouth end right — and this is the right from the perspective of the animal — which goes over here; now mouth end left, and I don't really see the mouth end left, but I'm going to say it's around here (I'll share a few comments later in this tutorial regarding the visibility of key points, but for now let's continue). The next one is the right eye; then the right ear base, which is here; then the right ear end, which is over here; and I'm just going to continue with the rest of this list and resume the
video when I'm done. And these are the last two: body middle right, which is around here, and body middle left, which is around here — I don't see it, but it's around here — and that's all 39 key points. Now let me show you how to export this data — but before that, please remember to click save; it's always good practice. And not only do you need the key points: you also need to draw a bounding box around your object. This is very, very important, and I'm going to tell you why in a few minutes, but for now remember that besides annotating all of your key points, you must also draw a bounding box enclosing your object. That's how I did it, and I click save again. This is the only image I'm going to annotate, but please remember to follow exactly the same process for all of your images. Now I go to Tasks and show you how to export this data: you need to click here, Export task dataset, and you can see there are many, many different formats in which you can export your data. One of them is COCO Keypoints 1.0, and that's relevant because it's the exact format we need for our data — but I tried exporting to it and, for some reason, it's not working. So I'm going to show you how to do it with CVAT for images 1.1: click here, then OK, and just wait until everything is downloaded. Once everything is fully exported, you will get a zip file, and within it there will be another file called annotations.xml. Let me open it so I can show you how it looks: something like this, and at the bottom of the file you will see all the images you have annotated along with their annotations. So this is exactly the data you are going to generate using CVAT. Now let me show you something else: I created a Python project for today's tutorial, and in it a script that will be super useful, because now that you have your annotations, now that you have your data, you need to convert your annotations into the exact format YOLOv8 needs for this pose detector. Basically, you need to specify two variables: one of them is the location of your annotations.xml file, and the other is the location of the directory where you want all your converted data to be saved. This script is going to parse through the XML file, extract all of your annotations, and save them in the exact format you need in order to use YOLOv8. So once you have set these two variables, the only thing you need to do is run the script and everything will go smoothly; and remember, this script will be available in the GitHub repository of today's tutorial, so you can just go ahead and use it to convert all of your data into the format YOLOv8 needs. And now let's continue.
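To give you an idea of what that conversion involves, here is a rough sketch of the parsing logic, assuming the CVAT for images 1.1 layout with one box element and one points element per image — the attribute names match the export I got, but please double-check against your own annotations.xml, and treat the "all key points visible" flag below as a simplifying assumption:

import os
import xml.etree.ElementTree as ET

annotations_path = './annotations.xml'  # file exported from CVAT (placeholder)
output_dir = './labels'                 # where the YOLO-format .txt files go (placeholder)

os.makedirs(output_dir, exist_ok=True)

tree = ET.parse(annotations_path)
for image in tree.getroot().iter('image'):
    w, h = float(image.attrib['width']), float(image.attrib['height'])
    box, points = image.find('box'), image.find('points')
    if box is None or points is None:
        continue
    # CVAT stores the box as corners; YOLO wants a normalized center, width, height
    xtl, ytl = float(box.attrib['xtl']), float(box.attrib['ytl'])
    xbr, ybr = float(box.attrib['xbr']), float(box.attrib['ybr'])
    values = [(xtl + xbr) / 2 / w, (ytl + ybr) / 2 / h, (xbr - xtl) / w, (ybr - ytl) / h]
    # the key points come as a single "x1,y1;x2,y2;..." string, in annotation order
    for pair in points.attrib['points'].split(';'):
        x, y = map(float, pair.split(','))
        values += [x / w, y / h, 2]  # 2 = labeled and visible (assumption)
    name = os.path.splitext(image.attrib['name'])[0] + '.txt'
    with open(os.path.join(output_dir, name), 'w') as f:
        f.write('0 ' + ' '.join(f'{v:.6f}' for v in values) + '\n')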
Now I'm going to show you how you need to format, how you need to structure, all of your data and your file system so it complies with YOLOv8. You can see this directory, called data, is the root directory where my data is located; you need a root directory where your data will live. Within this root directory I have two folders, one called images and the other called labels, and it's very important that you name these two folders exactly like this. If I open one of them, you can see two more folders, one called train and the other called val — and again, it's very important that you name these directories exactly like this. Within train is where we have all of our training data, all the images we're going to use as training data, and within val it's exactly the same: all the images we're going to use as validation data, as our validation set. But we also have additional data, which are the labels: within labels we also have two directories, also called train and val (again, exactly these names), and if I open the train directory, you can see many .txt files — these are all of our labels for the training data, for all of our training images. If I go back to images/train, you'll see that for absolutely every single one of these images we have an annotation file, a .txt file, in that labels folder; and for val it's exactly the same, but for the validation data, the validation images. So, recapping: we have the root directory, then images and labels; within images, two directories, train and val, with all of our images; and within labels, also train and val, with all of our labels. That's exactly how you need to structure your file system. Now let me show you one of these annotation files, one of these label files, from the inside — this is a random .txt file — and this is exactly how you need to lay out the data inside these files: the annotations are specified in the COCO keypoint style, which is a very popular convention for pose detection. Now let me do something — I'm obviously not going to save the changes — that will make it much easier to show you how this annotation format works,
how it looks. Basically, you can see the first number is a zero, and this is our class ID: in my case I only have one class, quadruped, so this number will always be zero; but if you are making this project with many different classes, remember that this number is the class ID, so you will see different numbers here. The next four numbers are the bounding box of your object: remember that in CVAT we annotated not only the key points but also the bounding box, and these four elements — the four that come right after the class ID — are that bounding box. It's specified in the YOLO format: the x and y position of the center of the bounding box, then its width, then its height. Then come all the other numbers: you can see we have two floats, then the number 2, then exactly the same pattern — two numbers and another 2, then two numbers and another 2 — and then we have three zeros, which looks very, very strange. So now let's go back to my browser, because I want to show you cocodataset.org, where we can see exactly how this convention works. Going to the keypoint detection section, there's an explanation of the format, and it reads that absolutely every single key point is specified as x, y, and a visibility flag v. This means that for every single key point we have three values: the x and y position of that key point, plus another value, v, which is the visibility — remember we said we'd talk about visibility later in this tutorial; this is that moment. You can see that v has three possible values: v = 0 means the key point is not labeled (and in this case x and y are 0 too); v = 1 means the key point is labeled but not visible; and v = 2 means the key point is labeled and also visible. Going back to the annotations file: if we start over here, we have two numbers and then a 2, which means this key point is labeled and also visible; continuing, two more numbers and another 2, so this other key point is also visible; and continuing again, exactly the same pattern — two numbers and then a 2.
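So one line of one of these files, pulled apart for readability, looks like this (the numbers are made up for illustration; in the real file everything sits on a single line):

0                          <- class ID (quadruped)
0.512 0.487 0.621 0.735    <- bounding box: x_center y_center width height, all normalized
0.311 0.224 2              <- key point 1: x y v (labeled and visible)
0.330 0.219 2              <- key point 2: labeled and visible
0.0 0.0 0                  <- a key point that is not labeled in this image
...                        <- and so on, 39 key points in total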
But if we keep going, at some point we find three zeros — and that's the v = 0 situation: x and y are zero too, and it means the key point is not labeled in this image. So, long story short: after the bounding box, all the remaining numbers are the key points, and for each one you have two values for the x and y position, and a third value for the visibility of that key point. Now, this is one possible format, and it works just fine, but YOLOv8 also supports key point annotations with only two values: if you don't have the visibility information for your key points, it doesn't matter, because YOLOv8 also lets you input your key points with only the x and y coordinates. So: the first number is the class ID, then four numbers for the bounding box, and then all the other numbers are the key points, specified either with three values each (x, y, and visibility) or with only two values each (just the x and y coordinates). This is the way you need to label your data, the way you need to structure your annotations; please remember to do it this way, otherwise it's not going to work. Now I'm just going to press Ctrl+Z, because obviously I'm not going to save all of those changes, and that's pretty much all about how to format your data, your file system, and your annotations for training this pose detector with YOLOv8. Now let's go back to PyCharm, to the project I created for today's tutorial. The first thing you need to do if you want to train this pose detector using YOLOv8 is to install the project's requirements, which is basically ultralytics — please remember to install this package before starting the training, because otherwise you will not be able to train anything. Once you have installed ultralytics, let's go to the file I created, train.py, and I'll show you exactly what you need to write in this file to do the training. Let's go back to the Ultralytics website, to the pose page, and scroll down to this section: the only thing I'm going to do is copy and paste this line, then this other line, and obviously add the import: from ultralytics import YOLO. That's basically all we need to train this model. The first sentence we can leave at its default value, but in the second one I'm going to make a couple of changes: I'll change the number of epochs — I'm going to train for only one epoch for now — and I'll change the location of the configuration file, pointing it at this config.yaml. Now let me show you how this config.yaml looks. You can see that the configuration file has three sections: one for the data, then the key points, and then the classes. Let's go to the data section first: this is where you're going to specify all the locations of your data, of your
images and your labels. Basically, you need to specify the root directory, the directory containing your data — in my case this one, the directory that contains the images and labels folders — and then the locations of the training images and the validation images. If you have set everything up as I showed you a few minutes ago, you can just leave those two lines at their current values and everything will work just fine; the only thing you need to edit is the location of your root directory. Now let's go to the key points section, where we have two keywords, keypoint shape and flip index, and these two keywords are completely new for us — we haven't seen them in any of my previous tutorials about YOLOv8. In my case, keypoint shape says 39 3: that's because I have 39 key points and I'm using the x, y, v format — three values for every single key point — which is why there's a 3 over here. So the first number is how many key points you have in your data, and the second is the format you're using: if you use the x, y format, you need to specify a 2; if you use the x, y, and v format, you need to set a 3, as I'm doing here. That's keypoint shape; now let me explain what flip index is, and to illustrate it I made a drawing over here. This is a random image from my dataset — actually, the same image I used to show you the annotation process — and it's a quadruped with all of its key points drawn on top. Now look what happens if I flip this image horizontally: this is what I get — exactly the same image, but everything that used to be on one side is now on the other side; everything that used to be the right side is now the left side, and the other way around. That's all that happens when you flip an image horizontally. But remember we have many key points that are tied to one of the sides: we have a key point for the right eye, key points for the right ear and the right legs, and the same for the left eye, the left ear, and the left legs. So if we flip the image horizontally, we should be doing something with all the key points that are tied to one of the sides. When we train this type of model with YOLOv8, one of the stages of the training process is something called data augmentation: we take the data and apply different transformations to it, and one of those transformations is flipping the image. So some of our images will be flipped at random, and every time there is a
horizontal flip, we end up in a situation like the one I just drew. So now let's go back to the list of all the different key points we have in this dataset — remember, I already showed you this list when I was annotating the image: we start with the nose, then the upper jaw, then the lower jaw, and so on. You can see that some of these key points are tied to the right side, some to the left side, and then there are others that are not tied to any side — neck base, neck end, throat, back — these are generic key points. We need to do something with all the key points that are tied to one of the sides, and that's exactly what the flip index keyword does; that's the intuition behind it. Let's go through the list: the first element is the nose, and if we think about a nose, it sits right in the middle, so nothing happens to it when we flip the image — the nose remains the nose. The next element is the upper jaw — exactly the same, it remains the upper jaw after a horizontal flip — and the same goes for the lower jaw. But when we get to mouth end right, we have an issue: when we flip the image horizontally, the mouth end right becomes the mouth end left, and the next element, mouth end left, becomes the mouth end right. You get the idea: these two key points must be swapped when the image is flipped. Now look at the value of flip index: the first elements are 0, then 1, then 2, then 4, 3 — so instead of 3, 4, which would be the natural order, we have 4, 3: we are swapping those two values, and those are exactly the indexes of these two key points in the key point order. Long story short, the only thing we need to do to fix the issue horizontal flips would create is to go through all of our key points and swap every key point that is tied to the right side with its left-side counterpart — we only need to flip the right and the left sides — and that mapping is exactly what we specify in this list: it defines how your indexes will be swapped when an image is flipped horizontally, so please be super, super careful with it. Now let's move to the last section, which contains the names of all of your objects: in my case I only have one object, quadruped, so it's very simple, but please remember to specify all the names and class IDs for absolutely all of your classes; in my case there's only class ID 0, which means quadruped. So that's pretty much all for the config.yaml.
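Putting the three sections together, the config.yaml looks roughly like this — the path is a placeholder, and I'm only writing out the first few entries of flip_idx, since listing all 39 here would be noise (kpt_shape and flip_idx are the actual Ultralytics keyword names):

path: /home/user/data          # root directory containing images/ and labels/ (placeholder)
train: images/train
val: images/val

kpt_shape: [39, 3]             # 39 key points, each given as x, y, visibility
flip_idx: [0, 1, 2, 4, 3, ...] # nose, upper jaw, lower jaw stay put; mouth end right/left swap (list truncated here)

names:
  0: quadruped

And the pose train.py follows the same pattern as before — something like model = YOLO('yolov8n-pose.pt') followed by model.train(data='config.yaml', epochs=1), with yolov8n-pose.pt being the nano pose checkpoint from the docs.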
Now let's go back to train.py and continue. Once you have specified this configuration file and everything we have over here, the only thing left is to execute this script — that's how easy it is to train this model. But I'm going to stop this training, because otherwise it would take a lot of time; I have already run some tests, and training locally would take forever. Still, this is exactly the process to follow if you want to train the model in your local environment.

I mentioned I'm also going to show you how to train it in a Google Colab, so now let's go to the browser and see how to run this training from Colab. The first thing you need to do is go to your Google Drive and upload absolutely all of your data — otherwise you won't be able to train the model from Drive — and you also need to upload your config.yaml file. Everything will be exactly the same as the file I showed you on my local computer, except that you need to edit the path field with the path to your data in Google Drive. This is very important; otherwise nothing is going to work. I'll show you in a moment how to find the location of your data in Drive.

Now let's go to the Google Colab I created for this training — you will be able to find this notebook in the GitHub repository of today's tutorial. You can see we have only a few cells, and I'm going to execute them one by one. The first one connects the Colab environment with Google Drive: I select my account, scroll all the way down, and click Allow. That's all we need to do so that Colab can access the data in your Google Drive. After a few seconds, everything is mounted under /content/gdrive.

The next step is to install Ultralytics, the Python package we need in order to use YOLOv8; we just execute that cell and it completes. Then, to continue with the next two cells, you need to know where your data is located in your Google Drive. I execute this cell, which lists all the files in my Drive's root directory — there are many, many files — and from there I just need to find where my data is. In my case it's in this folder; if I do an ls again, this is the content of that folder, so I locate this directory, and that's the content of my data directory, which contains the two folders, images and labels.
Now that you know where your data is located in your Google Drive, you can copy and paste this path into your config file: come here, edit the path field, and set it to wherever your data is located. Then you need to specify the location of your config.yaml file, and once you have set this location you are all set — the only thing left is to execute the training cell. This is going to take some time: first it downloads the weights into the Colab environment, then it gets all the data, and then it runs the training.

Okay, the training process has completed, and you can see the results have been saved under runs/pose/train4. Now I'll execute this cell, which copies the entire content of the runs directory into your Google Drive. Remember, the point of this training process is to get the weights and results back out, and the easiest way to do that is to copy everything into Google Drive and then download it from there. Please remember to edit this path to wherever you want everything to be copied. If I go back to my Google Drive, you can see the runs directory that was just generated, and inside it the train4 folder with the results of the training we just executed. Everything looks fine, so what we need to do now is download this directory to our local computer.

Remember, this was a very dummy training — we ran it for only one epoch. Obviously you will need a deeper training if you really want to train a pose detector on your data; one epoch is very unlikely to be sufficient. In my case, I had already trained the model on my data for 100 epochs before starting this tutorial, so everything is already trained and we can just analyze the results.
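Before we analyze anything, here is a condensed sketch of the Colab cells we just walked through. The Drive paths and the choice of the yolov8n-pose.pt checkpoint are placeholders — use your own locations and whichever pose model size you prefer:

```python
# cell 1: connect Colab to Google Drive
from google.colab import drive
drive.mount('/content/gdrive')

# cell 2: install the ultralytics package
!pip install ultralytics

# cell 3: train a pose model; data points at the config.yaml uploaded to Drive
from ultralytics import YOLO

model = YOLO('yolov8n-pose.pt')  # pretrained pose checkpoint (placeholder size)
model.train(data='/content/gdrive/MyDrive/pose-data/config.yaml',
            epochs=100, imgsz=640)

# cell 4: copy the results (weights, plots, etc.) back into Drive for download
!cp -r /content/runs /content/gdrive/MyDrive/pose-data/
```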
So now let's move to my local computer, where I can show you how to validate the model I trained with YOLOv8. You can see there are many different plots — we are plotting a lot of information — but we are going to focus on the loss functions, and specifically on the loss related to the pose: the pose loss on the training set and the pose loss on the validation set.

If we look at the training set, the loss is going down — and not only is it going down, the trend suggests it will continue going down for even more epochs. We haven't reached a plateau; I would say we are very far away from any plateau. That's a very good sign: it means the training process is going very well and the model has extra capacity, so we could continue training it and it would keep learning more about this data. That's what I take from the pose loss on the training set.

If we look at exactly the same function on the validation set, it's also going down, which is a good thing, but I have the impression that it's starting to look like a plateau. It's not entirely clear, because it's happening right at the end of the training process; at the very least, the loss is clearly still going down, but it's unclear what would have happened if I had trained this model for more epochs — it seems we may have reached a plateau. That's something we need to keep in mind in this validation process.
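If you want to reproduce these curves yourself rather than reading the generated plot, the training run also writes a results.csv you can chart. A minimal sketch, assuming the run directory shown earlier; note that some ultralytics versions pad the CSV column names with spaces, which the snippet strips as a precaution:

```python
import pandas as pd
import matplotlib.pyplot as plt

# placeholder path: point this at your own training run
df = pd.read_csv('runs/pose/train4/results.csv')
df.columns = df.columns.str.strip()  # some versions pad the column names

plt.plot(df['epoch'], df['train/pose_loss'], label='train pose loss')
plt.plot(df['epoch'], df['val/pose_loss'], label='val pose loss')
plt.xlabel('epoch')
plt.ylabel('pose loss')
plt.legend()
plt.show()
```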
But now let's take a look at how the model is performing on some actual images. The way I'm going to do it is this: I'll open one of the batches from the validation set with our labels — these are not predictions, these are the annotations — keep it open, and then open exactly the same batch with our predictions. This is a very good way to analyze the results, because we have a lot of images and we need to look at more samples before drawing conclusions. In my previous tutorials I showed you how to train an image classifier, an object detector, and an image segmentation model with YOLOv8, and I would say today's keypoint detector is much more complex than all of those, so this validation process will be more complex as well.

Looking at these images, I'm going to focus on one of them: this Dalmatian. On one side we have the ground truth — all of our annotations — and on the other side all of our predictions, and it looks very good. The keypoints around the face I would say are perfect; these keypoints over here are very good as well, these three keypoints too, and these two also. But if we look at the legs, something is going on: I don't really see these two keypoints, so we are not detecting the legs, and the keypoint at the end of the tail is not being detected either. So we have an issue in the tail and an issue around the legs, but everything else I would say is pretty good.

Now let me show you another example, from another batch — again, annotations on one side and predictions on the other. Look at this rhinoceros: we have a similar situation. Around the face everything is fine — a very good detection — and over here the detections are very good too; these three points are well detected, and everything over here is okay. But again we have an issue around the legs, where we are not detecting all the keypoints properly — the same happens with this other leg — and the same in the tail, with the tail-end keypoint. Everything else is detected properly.

Here's one more example: in this one the animal is in a different posture, so it's a little more challenging for the model. The face is detected very well — actually, we are missing this eye, but all the other face keypoints are well detected — these three keypoints over here are okay, this one and this one too, and the other keypoints are also well detected, but once again we have an issue around the legs.

That's pretty much what I noticed by looking at many of these examples. There are different animals in different postures, so we see different situations, but after inspecting a few of these images my impression is that the model is performing very well, with issues around the legs and around the tail. Combining all the information we got by analyzing these images — all the keypoints and how they were detected — with the loss curves for the training and validation sets, my conclusion is to do a deeper training and train this model for even more epochs. I'm curious to see what would happen in that situation, because if I look at the training loss I really like what I see: the model has more capacity, and I think if we trained for, say, 50 or 100 more epochs, the training loss would continue to go down the same way — we seem very far from a plateau there. But looking at the validation loss, I'm not completely sure what happens from that point on. So what I would do now, in order to try to improve these results, is to continue training for more epochs.
Then I would look at what happens with this loss, and I would analyze these images again — that would be my next step. This is a very good example of how to analyze your model, your data, and your plots in a more complex case, because remember that now we are trying to learn something much more complex than in our previous tutorials: not a bounding box or a mask, but the entire structure of a quadruped. Trust me, it's much more complex than everything we have done so far. So this is a good template for validating a model on a harder problem: look at the loss functions, look at the training set and the validation set, look at some examples, and then draw your conclusions. In my case, the conclusion is to train for more epochs.

Also, please remember that we have been using the default values for this training: the only values we are specifying are the image size and the number of epochs. If I show you the entire configuration file we are using, you can see there are many, many hyperparameters. So another next step, in case a deeper training is not enough, would be to play around with the different hyperparameters and find a combination that works better for our use case — I'll show a quick sketch of what that looks like right after this section. This is important: if you are approaching a more complex problem like this one, it's not very realistic to expect everything to go well on the first attempt using all the default values. If the problem you are trying to solve is much more complex, most likely you will need to experiment with the hyperparameters and find a combination that suits your project.

Now let me show you something else: where the weights are located. Within this folder you will see another folder called weights, and within weights you will see two files, best.pt and last.pt. These are the models you generated with this training process. I have mentioned this in previous tutorials, but I'll say it again: last.pt is the model as it was at the end of your training process, and best.pt is the best-performing model from the entire training process. You have these two models and you can choose the one you like the most; what I usually do is take last.pt, because I consider it the more robust model, so that's the one I usually use when making predictions. And that's pretty much all I can say about validating this model.
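As a sketch of that hyperparameter tuning, here is how you might override a few of the training settings instead of relying on the defaults. The specific values below are arbitrary examples for illustration, not tuned recommendations:

```python
from ultralytics import YOLO

model = YOLO('yolov8n-pose.pt')  # placeholder checkpoint

# same training call as before, but overriding a few defaults;
# the values here are arbitrary examples, not recommendations
model.train(
    data='config.yaml',
    epochs=200,        # deeper training, as discussed above
    imgsz=640,
    lr0=0.005,         # initial learning rate
    fliplr=0.5,        # probability of the horizontal-flip augmentation
    degrees=10.0,      # random rotation augmentation
)
```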
Now it's time to make our predictions, so let's get back to PyCharm. Let me show you this file, called inference.py — this is the file we are going to use to make predictions with the model we just trained, so let me show you how to do it.

I'm going to start with the import: from ultralytics import YOLO. Then I define my model path — the location of the model we just trained; in my case I select last.pt — and I also set the path to an image I'm going to use to demonstrate the predictions; my image is located at samples/wolf.jpg. Let me quickly show you that image: it's a picture of a wolf, which is obviously a quadruped, so it's a great image for demonstrating how to use this model.

Back in PyCharm, I define my model as YOLO(model_path), and then results = model(image_path), taking the first element — since we are predicting only one image, the first element is all we need. Then it's just a matter of iterating: for each result, and for each keypoint in the result's keypoints converted to a list. For now, the only thing I do is print the keypoints, so we make sure everything is okay. Let's see what happens... it seems I have an error, and I think I know what it is: tolist goes without the underscore. Now everything is okay.

Next I import cv2, because I'm going to read the image and plot all the keypoints on top of it — that's a very good way to show you these predictions. cv2.imread(image_path) gives me the image, and now I'm going to call cv2.putText: it's a very good idea to draw each keypoint's number on top of the image.
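Here is a consolidated sketch of the inference.py script we are assembling, with the fixes already applied. The model and image paths are placeholders for your own files, and depending on your ultralytics version the keypoints may be exposed slightly differently — the sketch uses results.keypoints.data[0] to get the (39, 3) array for the first detection:

```python
import cv2
from ultralytics import YOLO

# placeholder paths -- point these at your own trained model and test image
model_path = 'runs/pose/train4/weights/last.pt'
image_path = 'samples/wolf.jpg'

model = YOLO(model_path)
results = model(image_path)[0]  # one image in, so take the first result

image = cv2.imread(image_path)

# keypoints of the first detected animal, as a list of (x, y, v) triplets
for keypoint_index, keypoint in enumerate(results.keypoints.data[0].tolist()):
    # draw the keypoint's index at its (x, y) location
    cv2.putText(image, str(keypoint_index),
                (int(keypoint[0]), int(keypoint[1])),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

cv2.imshow('image', image)
cv2.waitKey(0)
```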
Let's fill in the details. I enumerate the keypoints so I get each keypoint's index, and the text I draw is str(keypoint_index). For the location, remember how my keypoints look: each one has three values, and the ones we care about right now are the first two — the x, y position of the keypoint — so I use int(keypoint[0]) and int(keypoint[1]). Then I choose the font, cv2.FONT_HERSHEY_SIMPLEX, and set the font size to 1 for now... something is not right — I think I'm not closing these brackets. There, now everything is okay. Then the color, which I set to green, and the text thickness, which I set to 2, and that's all for now.

Let's see how it looks: I call cv2.imshow with the image, then cv2.waitKey(0). That's pretty much it — we are plotting the image and drawing all the keypoints on top of it, with the keypoint number on each keypoint, so it's easier to see exactly what we are detecting in the entire image. The result looks pretty good, but to improve the visualization I'm going to make the font size 0.5 and press play again. Now the visualization is a little better, and you can see that everything looks pretty good: we are plotting all the keypoints on top of our image, and this is exactly how you make predictions using YOLOv8.

The last thing I'm going to show you is this file with the class names, so we can check exactly what we are detecting. You can see that 0 is the nose, then upper jaw, lower jaw, mouth end right, mouth end left, and so on. For example, keypoint 21 is somewhere around here, which is back middle — that makes sense — then 37 is around here, which is body middle, and 36 is belly bottom. Everything looks pretty good, and you can see we are still getting the issues we noticed in the other pictures: the legs are not very well detected, and the tail end isn't either, but everything else looks very good.

So this is going to be all for this tutorial on YOLOv8. My name is Felipe, I'm a computer vision engineer, and this is exactly the type of video and tutorial I make on this channel. If you enjoyed this video, I invite you to click the like button, and I also invite you to take a look at these two videos over there. This is going to be all for today — see you on my next video. [Music]
Info
Channel: Computer vision engineer
Views: 27,759
Id: Z-65nqxUdl4
Length: 208min 10sec (12490 seconds)
Published: Wed Jun 14 2023