Train Yolov8 object detection custom data in the cloud GPU | AWS project | Computer vision tutorial

Captions

Hey, my name is Felipe and welcome to my channel. In this video I'm going to show you how to train an object detector using YOLOv8, and this is exactly the project we will be working on today: I'm going to show you how to do the entire training process on an EC2 instance in AWS, and once the training process is completed we are going to send a notification to our email. This is going to be an amazing tutorial, so let's get started. Let me give you more details on exactly what we are going to be doing today. We are going to train an object detector using YOLOv8, and we're going to do the entire training process on an EC2 instance, but we are also going to create an S3 bucket and an SNS topic. The way this project works is that we download all the data we are going to use to train this model from an S3 bucket, then we train the model on the EC2 instance, then we upload the results of the training process back to the S3 bucket, and once the training process is completed we send a notification to the SNS topic and receive that notification in our email. Once everything is completed we are going to automatically shut down the instance we launched to train this model. So this is a very, very cool project, ideal for beginners. Before we continue, let me show you the data we are going to use today. You can see that all of these images are about cars. This dataset comes from one of my previous courses, where I show you the entire process of how to create a real-time automatic number plate recognition project; in that course I created this dataset in order to train an object detector to detect cars and also license plates, and you can see that
those are exactly the objects we have in all of these images: we have cars, and in some of these images we also have license plates. I invite you to check out this course so you get more familiar with the data we are going to use in this tutorial; it is also an amazing way to get more familiar with how to train an object detector and how to work with AWS, so definitely check it out. Now let's go back here. This is the data directory, which is called data out, and I have already created this file over here, which is data out.zip; this is a very important file we're going to use later on. Now let's go to my AWS console. I'm going to show you how to create all the services we're going to use today and how to do absolutely everything we need in order to train this object detector. Let's get started by creating an S3 bucket: I'm going to click over here on S3 and create a new bucket. The name of this bucket will be something like train object detector EC2 tutorial bucket; I'm going to scroll all the way down and click on create bucket. I'm going to select the bucket we have just created, which is this one, and now I'm going to create two directories: one of them will be called in... and the other one will be out... Within the in directory we are going to put the data we are going to use to train this object detector, so the only thing I'm going to do is select this file, data out.zip, and drag and drop it over here...
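
The video uploads the archive by dragging it into the console. As a hedged alternative, the same upload could be scripted with boto3. This is only a sketch: the file name `data_out.zip` and the key `in/data_out.zip` are assumed spellings of the names spoken in the video, and `upload_dataset` is a hypothetical helper, not something shown on screen.

```python
import os


def upload_dataset(s3_client, zip_path, bucket, key="in/data_out.zip"):
    """Upload the zipped dataset under the bucket's in/ prefix.

    s3_client is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    The default key is an assumed spelling of the name used in the video.
    Returns the size of the local file in bytes.
    """
    size = os.path.getsize(zip_path)  # fails early if the file is missing
    s3_client.upload_file(zip_path, bucket, key)  # (Filename, Bucket, Key)
    return size


# Example call (needs AWS credentials; replace the bucket placeholder):
#   import boto3
#   upload_dataset(boto3.client("s3"), "data_out.zip", "<your-bucket-name>")
```

Passing the client in as an argument keeps the helper easy to try out without touching AWS, and a large archive such as the 11 GB one mentioned here will still take a while to transfer.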
Please mind that this process may take some time depending on your file size; in my case the file is 11 GB, so this is going to take some time for sure, and the only thing you need to do is wait until the upload is completed. Okay, we have uploaded the file into the S3 bucket, and now let's continue by creating the SNS topic we are going to use to send a notification once the training process is completed. I'm going to type SNS and click here on Simple Notification Service, then I'm going to click on topics, create topic, and I'm going to select a standard topic. This is very important, because we are going to send this notification by email and standard is the only type we can use to send an email. The name of this topic will be something like train object detector EC2 tutorial SNS; then I'm going to scroll all the way down and click on create topic. The topic has been created, and now let's create a subscription in order to use it. Here on protocol I'm going to select email, and then I'm going to type hello at computer vision school, which is this YouTube channel's email; this is where we're going to send the notification once the training process is completed. Create subscription... Okay, now the subscription has been created. Let's get back to the SNS topic and let me show you something: you can see it says pending confirmation, because we need to confirm the new subscription we have created. So I'm going to my email, and you can see that I have received this email from AWS, which is a subscription confirmation. The only thing I need to do now is click confirm subscription, and that is pretty much all. Now this subscription has been confirmed, and if I go back here and I refresh...
you can see now it says confirmed, so everything is okay. When the training process is completed we're going to send a message to this SNS topic, and then we are going to receive the notification in this email. Now let's continue. We have created the S3 bucket, we have uploaded the data into it, and we have also created the SNS topic we are going to use to send the notification once the training process is completed. The only thing left is to create the EC2 instance and work on it. But before we create the EC2 instance we are going to create an IAM role, because remember, we need to communicate with S3 and SNS from the EC2 instance, so we definitely need to give this instance the necessary permissions. I'm going to IAM and I'm going to click here on roles, then create role; for the service I'm going to select EC2, then next, and I'm going to select these two policies: SNS full access... and S3 full access... Okay, I'm going to click on next, and the name will be something like train object detector EC2 tutorial role, and I'm going to click on create role... Okay, the role has been created, and now we can go to EC2 and launch a new instance. I'm going to type EC2 and click here on EC2... I'm going to click on launch instance, and the name will be train object detector EC2 tutorial instance, something like that... The operating system will be Ubuntu, and then I am going to select this instance type, which is p2 large, something over here...
okay, p2.xlarge. Then I'm going to scroll down and select my key pair; remember, if you do not have a key pair this is the time to create a new one, but in my case I already have one, so I'm just going to select it. Now I'm going to scroll down, and this is very important: I'm going to create the instance with 60 GB of storage. Something else we need to do, which is also very important, is to click here on advanced details and change this value over here, which is shutdown behavior: you can see it says stop, and I'm going to change it to terminate. This is very important because at the end of our training process we are going to shut down this instance, and we want that shutdown to terminate it. Once the training process is completed, once we have uploaded all the results into the S3 bucket, once we have sent a notification to the SNS topic we just created, we want to terminate this EC2 instance so we know for sure that we are not going to be charged for it anymore. So I'm going to click on launch instance... The EC2 instance has been created, and now the first thing we need to do is attach the role we created to this instance. I need to select the instance first, and then actions, security, modify IAM role, and I'm going to select the IAM role we have just created, which is this one, train object detector EC2 tutorial role. Update IAM role... Okay, and now it's time to SSH into this EC2 instance. I'm going to copy this value over here, which is the public IPv4 DNS, and then the only thing I'm going to do is type something like this: ssh -i, then the location of my key pair, and then ubuntu@ followed by the value we have just copied. I'm going to press enter... then yes...
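
The launch settings chosen in the console above (p2.xlarge, 60 GB of storage, shutdown behavior set to terminate) can also be written down as the keyword arguments of boto3's `run_instances`. This is a sketch under stated assumptions, not what the video does: `build_launch_params` is a hypothetical helper, the hyphenated instance name is an assumed spelling, `/dev/sda1` is assumed to be the root device of the chosen Ubuntu image, and the AMI ID and key pair name must come from your own account.

```python
def build_launch_params(image_id, key_name,
                        name="train-object-detector-ec2-tutorial-instance"):
    """Return keyword arguments for ec2_client.run_instances(**params).

    image_id and key_name are account-specific; the default name is an
    assumed spelling of the instance name spoken in the video.
    """
    return {
        "ImageId": image_id,              # an Ubuntu AMI in your region
        "InstanceType": "p2.xlarge",      # GPU instance type used in the video
        "KeyName": key_name,
        "MinCount": 1,
        "MaxCount": 1,
        # 60 GiB root volume; the device name is an assumption for Ubuntu AMIs.
        "BlockDeviceMappings": [
            {"DeviceName": "/dev/sda1", "Ebs": {"VolumeSize": 60}}
        ],
        # A shutdown issued from inside the instance terminates it instead of
        # stopping it, which is what ends the billing at the end of the run.
        "InstanceInitiatedShutdownBehavior": "terminate",
        "TagSpecifications": [
            {"ResourceType": "instance",
             "Tags": [{"Key": "Name", "Value": name}]}
        ],
    }


# Example call (needs AWS credentials and real values for the placeholders):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   ec2.run_instances(**build_launch_params("<ami-id>", "<key-pair-name>"))
```

The setting that matters most for this project is `InstanceInitiatedShutdownBehavior`: with the default value, stop, the instance would survive the final shutdown command and its storage would keep being billed.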
Okay, now we are logged into this EC2 instance, and the first thing we need to do is install the CUDA drivers. Remember, we selected a p2.xlarge instance because we want to run this training process on a GPU, but in order to use the GPU of this EC2 instance we first need to install the CUDA drivers. I'm going to follow these steps I have over here; the only thing we need to do is execute each one of these instructions, one at a time, and this is going to take care of installing all the drivers we need in order to use CUDA. So I'm going to copy and paste the first one, then I'm going to continue with this one... then the next one... this is going to take some time. Okay, I'm going to press enter, and that's pretty much all. The next step in this process is to execute sudo reboot; this is going to reboot the EC2 instance, and we need to wait a couple of minutes before we can get back into it... Okay, now I'm going to SSH into the EC2 instance again, and everything should be ready to continue. The next step will be installing virtualenv. We are going to install all the Python packages we need for this tutorial, but we are going to install them in a virtual environment; remember, it's always a very good practice to create a new virtual environment when you are working on a Python project. We are going to use virtualenv, and if I'm not mistaken we will need to install it first. I'm going to type virtualenv, enter, and yes, we need to install it first, so I'm going to copy and paste this instruction over here, and this is going to take care of installing virtualenv. Okay, now I'm going to create a new virtual environment by calling virtualenv venv, python, python 3 point...
only python 3, okay, and this should be enough to create a new virtual environment. The only thing I'm going to do now is activate it: source venv/bin/activate. Okay, and now we're going to install the Python packages we need for this project, which are two: ultralytics and boto3. ultralytics is the package we need in order to work with YOLOv8, which is the object detector we are going to be using today, and boto3 is the Python package we are going to use to communicate with S3 and SNS, so we definitely need to install it. I'm going to type pip install boto3 ultralytics, and this is going to take care of installing these two packages... Okay, we have installed boto3 and ultralytics, and now we need to make sure that ultralytics is using the GPU. Actually, we need to make sure that PyTorch is using the GPU, because ultralytics is only a very high-level framework which uses PyTorch under the hood. Remember, the whole idea of using this instance type is that we want to run this training process on a GPU, so let's verify that PyTorch is going to use it. I'm going to follow these instructions: I'm going to run python, which opens a Python interpreter, and now I'm going to execute these two instructions. The first one is import torch, which imports PyTorch, and then we check whether PyTorch is using the GPU. You can see we got an error, because PyTorch, the way it has been installed, is not using the GPU. This is something we definitely need to solve, because otherwise it doesn't make any sense whatsoever to be using an instance type with a GPU. So this is what we need to do next: I'm going to run an additional
command, an additional instruction, which is going to install PyTorch, but a very specific version of PyTorch, the one we need in order to use the GPU. Long story short, the only thing we need to do now is execute this instruction, and this is going to solve the issue we currently have with PyTorch not using the GPU. So we wait a couple of minutes, and this is going to be all... Okay, and now I'm going to do exactly the same: I'm going to type python, I'm going to import torch, and then I'm going to run this instruction over here... and you can see that now we get zero, which means that everything is working just fine and now torch is using the GPU, so we are safe to continue. I'm going to exit this interpreter, and that's pretty much all. Okay, I'm going to clear this window, and now we are ready to start working on the main.py file we are going to use to take care of the entire process. Remember, there are many things we need to do, so let's write down the entire process we are going to follow in this tutorial. There are five steps we need to take. The first one will be downloading the data from the S3 bucket; remember, we have uploaded the data into this S3 bucket, so the first step in order to train this object detector is to download the data from S3. The next step will be to train the model: once the data is on this EC2 instance, the only thing we need to do is train the model using YOLOv8. The next step will be taking the results we get from this training process and uploading them back to the same S3 bucket from which we got the data, so: upload results... to S3. Okay, then the next step in this process will be sending the notification to... send notification to...
SNS, okay, and then the next step will be shutting down this EC2 instance... something like this. So you can see this is a process which comprises only five steps, and in only five steps we will have trained this object detector, uploaded all the results into the S3 bucket, sent a notification to SNS, and shut down this EC2 instance. This is going to be a very long process, and the best way to do it is one step at a time, so let's get started with the first step, which is downloading the data from S3. In order to do so we need to import boto3. The first step will be defining the S3 client we are going to use to communicate with S3, and this is going to be boto3.client("s3"). The next step will be getting the data we have uploaded to S3, so s3_client.download_file... and we need to input three parameters. The first one will be the bucket name, and remember the bucket name is something like train object detector EC2 tutorial bucket; we can change it later on if I'm mistaken, but I think it was something like this. Then we need to write the object key we are going to download from this S3 bucket, and the object key was in and then data out.zip. And then we need to write the file name under which we are going to save this file on the EC2 instance, and this is going to be data out.zip. Once we have downloaded this file we need to extract all its content, because remember, this is a zip file, so in order to use the data inside it we need to extract it. I'm going to make another import, I'm going to import zipfile... and then I'm going to say something like this: with zipfile.zipfile, then the name of this file, which is data out.zip, this is r...
and the only thing we need to do is something like this: zip_ref.extractall... and I'm going to extract all the content of this file into the current directory, something like this... Okay, this is going to be just fine, and in order to make sure everything works as expected let's just execute this part of the file as it is. I'm going to close the file, I'm going to clear... and then python main.py. Okay, I have a mistake: it seems this is an uppercase, this is ZipFile with an uppercase. Let's see now. I am going to remove the file we have just downloaded and I'm going to try again... Okay, the execution is completed, and now if I do ls you can see I have the zip file we downloaded from the S3 bucket and also this directory over here, which is the directory we extracted from the zip file. If I do ls data out, you can see that this is all the data we are going to use to train the object detector. Let's continue with the next step in this process, which is training the model. In order to train the model we are going to use ultralytics; remember, this was the other Python package we installed in this virtual environment. So I'm going to import... from ultralytics... import YOLO... Okay, and this is how we're going to do it: the first step will be creating a new variable called model, and model will be equal to YOLO, and then we are going to say something like yolov8n.pt. We are going to fine-tune a pre-trained YOLOv8 model which was trained on the COCO dataset. The next step will be just calling model.train; we are going to input a config file...
config.yaml. This is a file we have not created yet, but we're going to create it in a couple of minutes. We are also going to input the number of epochs, which I'm going to set to one; one epoch is going to be enough for now in order to test that everything works fine. So I'm going to get back here, and now it's time to create this file, config.yaml. In order to create it I'm going to show you a GitHub repository I created for another project, where I show you the entire process of how to train an object detector with YOLOv8; this is a super comprehensive tutorial where I show you every single detail of how to train a model using YOLOv8. In this repository I am going to local env and I'm going to open this file, config.yaml. I'm going to take this file as a baseline and make some edits to it. So I am going to copy it over here and then I'm just going to paste it over here... copy and then paste. Okay, I pasted it twice, so I'm going to open config.yaml again and paste it over there. Okay, now I'm going to make some edits. The first one will be the path, the absolute path to the data, and this is going to be something like home, ubuntu, and then data out...
If I get back here and I print the working directory, you can see that this is home ubuntu, and the data is here in data out, so this is exactly the absolute path to the data directory. Then this is the relative path to the training images, so this is perfect as it is. The relative path to the validation images is going to be different; I'm going to write images/val. Then come all the classes we are going to detect in this dataset. We are going to detect exactly four classes: the first one will be license plate, then car, then buses, and then trucks. Remember, this is a dataset I created for one of my previous courses, so if you want more details on exactly how this dataset was created, all the classes, and all the other details, I definitely invite you to take a look at that course. But for now let's continue. This is going to be just fine for the config.yaml file, so now let's get back to main.py, and this is pretty much all we need in order to train this model. But before we execute this file again I'm going to comment out this section over here, because we have already downloaded the data; if we execute this Python file again we are just going to download the data again and uncompress it again, and that doesn't make any sense because the data is already on this EC2 instance. So in order to move one step at a time and do things incrementally, we are going to comment out this section about downloading the data, and we are only going to execute the part that trains the model; this is the only part we are going to test now. I'm going to save the changes, and the only thing I'm going to do is execute python main.py. Okay, we have a mistake, and I think I know what the problem is...
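
For reference, the config.yaml described above could be generated with a few lines of Python. This is a sketch only: the layout follows the dataset YAML format that ultralytics reads (`path`, `train`, `val`, `names`), but the directory spelling `data_out`, the training path `images/train` (the video only says it stays "as it is"), and the exact class-name strings and their ids are assumptions; `build_config` and `write_config` are hypothetical helpers.

```python
from pathlib import Path

# Class order as listed in the video; the exact spellings are assumptions.
CLASS_NAMES = ["license plate", "car", "bus", "truck"]


def build_config(data_root="/home/ubuntu/data_out",
                 train_rel="images/train",   # assumed, not stated in the video
                 val_rel="images/val",
                 class_names=CLASS_NAMES):
    """Return the text of a dataset config in the ultralytics YAML layout."""
    lines = [
        f"path: {data_root}",   # absolute path to the dataset root
        f"train: {train_rel}",  # training images, relative to path
        f"val: {val_rel}",      # validation images, relative to path
        "names:",
    ]
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


def write_config(path="config.yaml", **kwargs):
    """Write the config next to main.py and return the path written."""
    Path(path).write_text(build_config(**kwargs))
    return path
```

Calling `write_config()` once on the instance would produce the file that `model.train` is later pointed at; editing the YAML by hand, as in the video, works just as well.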
This should be data equal to config.yaml. Let's see now... Okay, now everything seems to be working just fine, and the only thing we need to do is wait until this training process is completed. Remember, we are training for only one epoch because we just want to make sure everything works... Okay, the training process is now completed, everything is just fine, and you can see that we have saved all the results here in runs/detect/train. So this is what I'm going to do now: I'm going to clear this window, and let me do an ls; you can see that this is the directory we have just created, which is runs. Now let's continue with the next step in this process, which is... uploading the results back to the S3 bucket. This is how we're going to do it. We have created this new directory, runs, which contains all the results of our training process, and now we're going to do something similar to what we did over here, where we uncompressed the data, but we are going to do the opposite: we're going to take the runs directory and compress it, we're going to create a zip file with all the content of the runs directory. So I'm going to say with zipfile.ZipFile, with a capital F, and the name of the file we are going to create will be runs.zip; this is a w because we're going to create this file, and I'm going to name it zip. Then we are going to iterate over all the directories and all the files... doing something like this... os.walk and the directory... and then... right, we are walking through all the files in this directory because we want to create a zip file with all the content within it. So we are going to do something like this, and then for file in files...
file_name will be os.path.join(path, file), and then the only thing we're going to do is call zip.write(file_name). Okay, and this is pretty much all, so the only thing I'm going to do now is import os, otherwise this is not going to work. Okay, if I'm not mistaken this is going to work just fine, and I'm going to do exactly the same as we did over here: I'm going to comment out the section about training the model, because otherwise we would train the model again, and we don't want that; we have already trained the model and we already know everything works well. Now let's execute the file again to make sure this part over here works as expected. I'm going to save the changes and I'm going to say python main.py... Okay, if I do ls you can see we have created a file which is runs.zip. Now it's time to upload this file into the S3 bucket, so we are going to get back here... and the only thing we need to do is... call s3_client... upload_file... The first argument is the file we are going to upload, which is this one, runs.zip; then the bucket name, which is exactly the same bucket as before... something like this... and then we need to specify the object key under which we are going to save this file in the S3 bucket. Remember we created another directory, which is out, and this is where we are going to upload the results, so this is going to be something like out/runs.zip. These are the results we are going to save from this training process, and that's pretty much all. Now let's execute the file again, we are going to..
we're going to create the zip again, but it doesn't matter; we're not going to comment out this section because otherwise it's going to be very messy. Let's just execute it as it is: we're going to create the zip file again and then upload it into the S3 bucket. Let's see what happens... s3_client is not defined, obviously, because... I'm going to comment out everything but starting here, so we can use the same s3_client... let's see again... Okay, now I didn't get any error, and if I get back to my browser, to my AWS console and to the S3 bucket we created for this tutorial, and I open the out directory, you can see the file we have just uploaded, which is runs.zip. So everything seems to be working just fine, and we can continue with the next step in this process, which is... sending the notification to SNS. We are almost there. You can see that we're moving one step at a time, super slowly, testing absolutely everything we do, and now the only thing we need to do is continue with sending the notification to SNS. I'm going to do something similar: I'm going to comment out this section over here so we can focus on creating and testing the notification, and this is how we're going to do it. The first thing we're going to do is create a new SNS client, and this is boto3.client("sns"), and that's pretty much all. Then we are going to say something like sns_client.publish, because we're going to publish a message, and we need to input two arguments. One of them is the ARN of the SNS topic we are going to use to publish this message, and this is TargetArn...
we are going to set it to None for now and I'm going to change it in a couple of minutes. The other one is the message we are going to send to this SNS topic, and I'm going to send a very generic message, which is training completed, because remember, the whole idea of sending a notification to this SNS topic is to let the user know we have completed the training process. This is pretty much all we need in order to send this notification, so the only thing I'm going to do now is get the ARN of our SNS topic... Let's get back to my browser and go to SNS; this is the topic we created, and let's just copy and paste this ARN over here... Okay, now I'm going to execute the file again and let's see if we are able to send a notification to SNS... Okay, we didn't: you must specify a region... I'm going to specify the region where I created this SNS topic, which is... us-east-1. Make sure you select the region in which you created your own SNS topic; in my case this is us-east-1, but just make sure you select the right region in your case. I'm going to close it and execute it again... Okay, it seems the argument is region_name, not only region...
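
The notification step described above, including the `region_name` fix, can be sketched as follows. `notify_training_completed` is a hypothetical helper name and the ARN in the example call is a placeholder; `publish` with `TargetArn` and `Message` is the boto3 call named in the video.

```python
def notify_training_completed(sns_client, topic_arn,
                              message="training completed"):
    """Publish a short message to the SNS topic and return its message id.

    sns_client is expected to be a boto3 SNS client created with an explicit
    region, e.g. boto3.client("sns", region_name="us-east-1").
    """
    response = sns_client.publish(TargetArn=topic_arn, Message=message)
    return response.get("MessageId")


# Example call (needs AWS credentials; paste the ARN of your own topic):
#   import boto3
#   sns_client = boto3.client("sns", region_name="us-east-1")
#   notify_training_completed(sns_client, "<your-sns-topic-arn>")
```

The region has to match the one in which the topic was created, and the keyword is `region_name`, which is exactly the error hit in the video.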
Okay, this should be very quick, and now everything is completed. If I get back to my email, remember we created a new subscription with this email over here, I'm going to check if I received a new email, and this is the email I received just now: training completed. So everything seems to be working just fine, and you can see that we have almost completed this process, because sending the notification to SNS was the fourth step in this five-step process. The only thing we need to do now is shut down this instance, and this is going to be super useful, because once the training process is completed, we have uploaded all the results to the S3 bucket and we have sent the notification to SNS, we do not need this EC2 instance anymore. So we are going to shut down this instance, we are going to terminate it, because that way we are not going to get charged by AWS anymore. The only thing we need to do is call os.system... sudo shutdown... -h now. Okay, this is going to shut down this EC2 instance, and remember, when we were launching this instance we changed a setting so that once we shut down the instance it is going to be terminated, so this is exactly the command we need to execute. So the entire process of this tutorial is completed, and the only thing we need to do now is test the overall process. What I'm going to do... is uncomment all these sections over here. This is so exciting, because this is the last part in this process, and this is going to test the overall system. Let's see what happens. We have uncommented absolutely everything, and the only thing I'm going to do now...
I'm going to change the number of epochs, because now we're going to do a much more comprehensive training: we're going to train for 20 epochs, which is going to give us a much better object detector. The only other thing I'm going to do is remove... the data we downloaded... and also the data out directory... and also the runs directory; let's just remove everything so we make sure the entire process works as expected. Okay, now I'm going to remove runs.zip as well... Okay, now everything seems to be working just fine... This is going to train the object detector for 20 epochs, then we are going to upload the results into the S3 bucket, then we are going to send the notification, and then we're going to terminate the EC2 instance. Everything seems to be working just fine, but let's do it like this: I'm going to run it in another screen. So I'm going to type screen, and the only thing I'm going to do is source venv/bin/activate, and I'm going to call python main.py. Okay, this is going to train an object detector for 20 epochs. I'm going to detach from this screen, and I'm going to exit, and that's pretty much all. Now the only thing we need to do is wait until this process is completed, and once it's completed I'm going to receive a notification in my email... Okay, so this is the new notification I got when the training process was completed, so everything seems to be working just fine. If I go to my S3 bucket, you can see that this is the new file we have created, runs.zip, and also if I go to my EC2 dashboard you can see I have...
I no longer have the EC2 instance we used in this project, so everything is working just fine. This is going to be all for this tutorial. My name is Felipe, I'm a computer vision engineer, and this is exactly how you can train an object detector using YOLOv8 on an EC2 instance and how you can create a system like this. So this is going to be all for today, and see you in my next video.
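
Putting the five steps together, main.py might look roughly like the sketch below. It is a reconstruction from the spoken walkthrough, not the author's actual file: the hyphenated bucket name, the `data_out.zip` spelling, the topic ARN placeholder and the function names are all assumptions, while the boto3, zipfile and ultralytics calls are the ones named in the video.

```python
import os
import zipfile

# Assumed spellings of the names spoken in the video; replace with your own.
BUCKET = "train-object-detector-ec2-tutorial-bucket"
TOPIC_ARN = "<your-sns-topic-arn>"
REGION = "us-east-1"


def download_and_extract(s3_client, bucket, key="in/data_out.zip",
                         zip_path="data_out.zip", dest="."):
    """Step 1: download the dataset archive from S3 and unpack it."""
    s3_client.download_file(bucket, key, zip_path)  # (Bucket, Key, Filename)
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        zip_ref.extractall(dest)


def train(config_path="config.yaml", epochs=20):
    """Step 2: fine-tune the pre-trained YOLOv8 nano model."""
    from ultralytics import YOLO  # imported here so the helpers load without it
    model = YOLO("yolov8n.pt")
    model.train(data=config_path, epochs=epochs)


def zip_directory(directory="runs", zip_path="runs.zip"):
    """Step 3a: compress every file under the results directory."""
    with zipfile.ZipFile(zip_path, "w") as zip_file:
        for path, _dirs, files in os.walk(directory):
            for file in files:
                zip_file.write(os.path.join(path, file))
    return zip_path


def main():
    import boto3  # needs the IAM role attached to the instance

    s3_client = boto3.client("s3")
    download_and_extract(s3_client, BUCKET)
    train()
    # Step 3b: upload the zipped results to the out/ prefix.
    s3_client.upload_file(zip_directory(), BUCKET, "out/runs.zip")
    # Step 4: notify the SNS topic, which forwards the message by email.
    sns_client = boto3.client("sns", region_name=REGION)
    sns_client.publish(TargetArn=TOPIC_ARN, Message="training completed")
    # Step 5: shut down; with shutdown behavior set to terminate, this
    # terminates the instance.
    os.system("sudo shutdown -h now")
```

To use it as the script from the video, call `main()` at the bottom of the file and start it inside `screen` so the run survives the SSH session. One design caveat of this linear layout: if any step raises an exception, the shutdown is never reached and the instance keeps running, so it is worth checking the EC2 dashboard when no email arrives.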

Info

Channel: Computer vision engineer

Views: 1,213

Id: 2xmpCchQrHQ

Length: 39min 17sec (2357 seconds)

Published: Sun Dec 17 2023