Mastering YOLO NAS: Object Detection

Captions
Are you interested in YOLO-NAS, a model that surpassed YOLOv8 and YOLOv5 in object detection? Do you want to learn how to train it on your own custom data? If so, you have come to the right place. Hi, my name is Ham, or you can simply call me Sam. I am on a journey of learning and exploring machine learning here on my channel.

By the end of this video you will learn what YOLO-NAS is, why it is better than YOLOv8 in some areas, who built it, and how it works. Then we dive into creating the environment, installing the requirements we need, and collecting a data set, covering the different cases you might encounter. Case one: you don't have a data set and you don't know where to start, so I'm going to show you how to collect a data set, how to clean and annotate it, and what data structure you need for YOLO. Case two: you already have a data set but you don't know how to process it. Case three: you want a data set that you can simply grab and use. Then we're going to create the data structure for YOLO, understand the key concepts of training like epochs, silent mode, and batches, prepare YOLO-NAS for training, train our first model on our custom data set, validate the model, and then test it on unseen data. Finally, there is a small gift at the end of this video that will help you in creating a better model.

You don't need to know Python or have a PhD in computer vision. You just need to follow along and try your best, as I start from scratch until we have a fully functional object detection model. This process can be repeated to build any custom YOLO-NAS model for yourself, a client, or a company.

The main advantage of YOLO-NAS is its ability to detect smaller objects better than other models. The NAS in YOLO-NAS stands for neural architecture search: using an optimization algorithm like AutoNAC to discover the best
architecture for the model and automate the process of creating a neural network. We're going to see this in action when we write our training code.

Deci AI is the company that created YOLO-NAS. It describes it as a next-generation object detection foundation model, and it used different techniques. Here is the "Efficient Frontier of Object Detection on COCO" chart, which measures the efficiency of YOLO-NAS. As you can see, it is beating YOLOv6, which is yellow, YOLOv8, which is red, and even v7 and v5; YOLO-NAS here is beating them all. They use something called AutoNAC: they use deep learning to find a new deep learning architecture. As seen here, when you start using AutoNAC it tries to adapt toward the best network for our model so we get better results, and you can see it in action here, trying to get better accuracy every time it trains.

One of the use cases I found for YOLO-NAS is real-time object detection. It's basically the best when it comes to object detection in real time because it has low latency: it can detect faster and better than YOLOv8. Here is the average mAP on Roboflow 100 versus other models. As you can see, the M and S variants are beating v7 and v5, and against v8 there is a slight advantage. So is it better or not? Yes, it's much better than v5 and v7, but when it comes to v8 there is only a slight advantage.

Let's stop talking about the architecture and who created it, and dive into our next part, in case you are starting from scratch and don't know what to install. Before we start, the first thing to check is the requirements for setting up the super-gradients library. The super-gradients library is what has the YOLO-NAS algorithm we're going to run, and they recommend that you have Python 3.7,
3.8, or 3.9 and torch 1.9. Here is what I am installing; I'm going to leave all the links so you don't have to Google them yourself, just go to the link and download. I'm going to use CUDA 12, and I'll show you how to install it easily. We also need Visual Studio Code, the editor we are going to use, and finally we're going to install Python 3.10.11. I needed this version because I noticed that newer versions like 3.11 and 3.12 have some issues with the super-gradients library; this is the best version I could find that works with super-gradients.

Install Python by coming here. If you are using Windows, you need the 64-bit installer, which is recommended, and on macOS it depends on what you're using. I think probably 90% of you are using Windows, so we are going with this version. For Visual Studio Code you can just hit this button and it will handle the installation. For torch, we select Stable, Windows, and pip, which is already selected for us, then Python, and instead of CUDA 11 we select 12. Then we copy this command, go to any empty folder (after installing Python, of course), and open CMD there. We paste the entire command line we copied, but I'm going to remove pip3, and after install I'm going to add --upgrade, like this, and hit enter. This handles the installation. As for CUDA, it's already installed on my PC, so I don't need it. Honestly, you do this step after you have installed everything else: you install Python, then Visual Studio Code, and finally CUDA.

After installing what we need, we go inside any empty folder, possibly the one we're going to create the project in. After you install Visual Studio Code, we open this folder with it: click in any empty space inside the folder, click Open with Code, and it will open for
us. I need you to install a certain extension called Jupyter, and you also need to install Jupyter Notebook Renderers. This lets us create a Jupyter notebook inside Visual Studio Code, and it's just to avoid using Google Colab. I don't hate Google Colab at all; it's actually one of the best free notebooks you can use, but I'm avoiding it because my current machine is way better than the notebook available to us. After you install this, press Ctrl+Shift+P, or I think Cmd+Shift+P on Mac, and select the option to create a new Jupyter notebook. That's it, we have created our new Jupyter notebook. Save it and name it whatever you want; in my case I'll call it yolo-nas, it doesn't matter what you call it. Then click here to select the kernel, and this is where we need to create an environment.

How are we going to create the environment? In any terminal (I open the one in Visual Studio Code by pressing Ctrl and the backtick key), run pip install virtualenv. This is one of the ways you can create an environment in Python without using Anaconda. Then we create the new environment: write python -m venv followed by whatever name you want, and it will create a folder with that name. I'm going to call it yolo env, and voilà, we have our environment created over here.

Why am I creating an environment? If you use Python on a daily basis, you might already have libraries installed that you want to keep and don't want to mess with. This way we can install any library at any version without messing with the current ones. It will not touch the other versions of libraries we have, like NumPy, pandas, or Matplotlib. We're going to install
the library inside this environment. Before we install anything, we have to activate it, which is super simple: write the name of the folder, then backslash Scripts, then backslash activate.bat if you are using Windows, and hit enter. It's activated. How do you know? Because the prompt now shows the name of the folder before the path of the current folder. So that's it, the environment is set up and we can install what we want inside it. The first thing we install is, of course, super-gradients, and we also need to install supervision. Supervision is one of the coolest libraries; I have made a video about it, and I'm going to leave a link to it down below. Hit enter and it will start installing, which takes some time.

Now let's talk about our data set. There are different cases. The easiest one is following along with me and getting the data set from Roboflow. There are also two other cases: one where you already have a data set but don't know how to clean and annotate it, and I'm going to explain how to do that and what annotation is; and one where you don't know what a data set is at all, or how to collect and process it. So we're going to start from zero.

What is a data set? In machine learning, a data set refers to a collection of data that we use for training and evaluating the model we're creating. It's basically a structured, or sometimes semi-structured, set of examples. Imagine you want to teach a robot how to organize different types of animals, logos, or Lego sets. To do that, before you ask it anything, you need to show it what type of animal is in a picture: you show it a picture of an elephant until it knows this is an elephant, and you keep doing this with other animals until it learns. So in this
data set, the pictures are like clues and the names of the animals are the answers. The robot looks at each picture and tries to figure out how to organize each animal correctly. It learns the pattern, the shape, the color, and the features of the animal that let it identify the right answer.

A data set is usually divided into three parts. The first is the training set, the most important part, which the model looks at and which we use to train the model. The second is like a mini test; we call it the validation set. It's a smaller number of images, and we basically tell the model to validate how good it is with this set. The last one is like the final exam, with images the model has never seen before; it's called the test set, or the unseen set. We're going to use all of them.

When it comes to collecting a data set, we create a small folder over here and call it data set. Inside, we create a new file; you can call it whatever you want, and we'll call it images.py. We also need to install something for it. Remember to check where you're working, like I'm doing here. I'm going to install a library called simple_image_download, version 0.4. Why am I installing this? It's not practical to go to Google, search for example for airplanes, and copy this image, then this image, then this one. We need something like 100 images, and from those 100 we might use about half. So we automate downloading our images using this code, which I will provide for you, of course, if you want to collect your own images. We import simple_image_download as simp (I mean, we could make a joke about that name, or keep going). We create the class for it, then here we can put as many keywords as we want to download, like airplanes, and it will do that for us. For every keyword here we download at least 200 images.

How do you run this file? It's simple: you can run it with the button over here, which starts the download, or you can type python and the file name in the terminal. It starts downloading the images and collects them inside the simple_images folder, creating a new folder for each keyword. Right now it's downloading, so it's going to take a while. In my case I'm not using these images, but I'm showing you how this is basically done in the real world when you collect images. So I'm going to stop this and show you how to annotate an image.

There is a tool called CVAT.ai that you can use to annotate images, or you can use a platform like Roboflow, but in our case I'm going to do this locally. What we install is a library called labelImg, with a capital I. Hit enter and it will start downloading. Now we are ready for annotation, but before we annotate anything, there are usually some messed-up images that we don't need, and removing them is what I think is called the cleaning part. This one, this one, and this one we don't need; this one is a poster for a movie, so we don't need it either; and we don't need one that shows only the wing of the airplane. You do that with all the images you don't need. I'm actually going to keep the rest, just as an example to show you how to annotate an image.

We go back here and launch the tool we installed to open the image annotation tool. Click Open Dir and select the directory that contains the images, which in our case is simple_images, airplane. Then click Change Save Dir: inside the data set folder we created, create a new folder called labels and select it. After that we are ready to go. First, make sure you select the YOLO format here, because this is the format we're going to use for our model. If it shows anything else, like CreateML or Pascal VOC, no, we need YOLO. Click Create RectBox, draw a rectangle around the airplane as tightly as you can, and name it; we call it airplane. This is the name of the class, and we're going to need it later. Hit OK, then click Next Image; it may ask whether to save, yes or no. You do the same thing here: create a rectangle, select the airplane class, hit OK, and so on until you are done. After you are done, we have the images here, but we go to the data
set folder that we created, where we find our annotation files. Each one has the name of the image we annotated. Let's open one and check what we have. It always starts with the label of the class, then the center x and y of our rectangle box, then the width and height of the bounding box. The file has the same name as the image; if it doesn't, we are screwed, honestly: it will result in an error and might not work at all. So that's it for annotating the images.

After you are done, go to the data set folder and create three new folders: one for train, one for validation, and one for test; I'm going to call it test. Inside labels, we take these files; we actually need to cut them and paste them here. Inside this classes.txt are the names of our classes, in order, which is very important. We paste the labels into our train folder, and we create a new folder there called images. Our goal is to put the images we have labels for, here 10 and 9, inside the images folder, so we do that now under data set, train, images. The end goal is to have two folders inside each of these, one for labels and one for images. So you end up with three folders, each containing two folders: one for images, holding all the images, and one for labels, holding all the annotation files.

Finally, we have this classes file, which you're going to need when you create the YAML file. Inside our data set we create a new file, and we need four things in it. We need to know where the train images and the validation images are, so we put their paths here as relative paths. Then we put the number of classes, nc; we have only one. And we give the class name, which is airplane, and
that's it, that's how you create the YAML file. This is how you can create a data set from zero, and now you know how to clean it, how to annotate it, and where to get it from.

In the first case, if you are following along and just want the same data set I'm using, I'm using the Roboflow chess pieces computer vision data set. I'm going to leave the link in the description, of course. It has 289 images. It's old, I think it's outdated, but honestly it doesn't matter; we can use it anyway for the sake of this tutorial. It has 13 classes, all listed here. You can download it by clicking here. Of course, you have to create an account, which is very simple: sign in with a Google email and it will let you in. It shows this popup; I selected YOLO v5 from here, asked it to download a zip to my computer, and it downloads automatically. I already downloaded it. When you download the zip file it looks like this. Extract it to the same folder and you get this name, which you can change. Inside you will find three folders and one YAML file, and each folder has the same structure: one folder for images and one for labels. We're going to use them all. To keep this tutorial shorter, I cut the number of images here down to 180, so I removed a couple of images from the training images and removed the same ones from the labels. Here is the annotation file for each image we have. Inside the YAML file you will find simply two paths, for the images in train and validation, plus the number of classes and the names of the classes. We're going to copy this, since we need it for our training. So we need this file, and that's it for case number one.

In case you don't want to use the same data set I'm using (mostly it doesn't matter whether you use the same one or not), you can come here to Roboflow Universe;
you can find a lot of data sets here. Some of them are actually funny, like "they never going to give you up", dogs, and other detection sets. There are a lot here: you can pick whatever you want, download it, and use it. They come in different formats; mostly what we aim for is the YOLO format, because we are using YOLO.

Now we are ready to dive into our code. The first thing we need to know is whether our GPU is ready and what type it is. I imported torch, which is the first thing you do, and we check the torch version to confirm which one we have. We're not using the CPU version, because with the CPU version training the model would take forever. My current device is an RTX 3070. It's not the best, but it's better than the T4 on Google Colab.

I'm starting by defining the main variables we'll need later, and I'm going to explain them right now, so focus. The first is the model: which type of model we use. YOLO-NAS has different models: large, small, and medium. I'm using L, the large one, which gives you better results. The bigger the model, the more time it takes and the better the quality. Next is the batch size. Imagine you're looking at 12 images at once; that's basically what is happening for the model. The bigger the number, the faster the model can be trained, but this really depends on your machine. I can increase it to 15 or 16, I think, and it still works for me, but in some cases it will not work, for example on a smaller GPU. My current GPU, I think, has 12 GB, so it can handle it.

Then there is max epochs. The number of epochs is how many times our model looks at the data set we have. The bigger the number, the better and higher quality the model will be, which is recommended. In our case I'm using a very low number; it's a shallow training. I would not advise anyone to put the number below 10, but for the sake of this training
time and the video I put in a shallow number; anything below 10 is a shallow number. The second thing: max epochs and epochs are the same thing, I'm just using two names to explain it. We're going to use it down below, and you'll understand the difference there. The checkpoints directory is whatever relative path you want, I mean the path of the folder where you want our models to be output. And chess pieces is basically the name of our project model, so you can call it whatever you want. You can put here anything you need.

Let's get up and running with our current model and see how it works. We import Trainer from super-gradients. The experiment name is the project name, and the checkpoint directory is the checkpoints folder. Then the data parameters are basically an object where you put all the data details: the data directory, meaning where the data folder is, where the train images and labels are, where the validation images and labels are, and so on for test. Finally, here we have the classes. You might ask where I got all these names. They're just inside the data set I got from Roboflow: inside data.yaml it tells you the number of classes and their names, and you can just copy and paste this array into the object.

Then we import the data loaders, one for training and one for validation. These are already provided by super-gradients: the first is the training loader for the YOLO format, and the second is the validation one. We create three data loaders: one for train, one for validation, and one for testing. Validation and testing use the same loader; the only thing that differs is the data directory. Here we access the object we defined and give it the data directory, the
validation folder we're going to use, and the same for training, except that we use the detection loader for training here, while here we use the validation one, and here the validation one as well. We give it the train images folder and the labels folder here. So far so good. If you feel like you're missing something, please rewatch this part.

Now we're going to look inside the data set we prepared and see what's in it. Here, inside the train data, we can play with the transforms. These transforms are provided by super-gradients, and you can experiment with them. One thing I did was play with the degrees of the images, so as you can see, I tilted them a little bit. I won't say it's clean, but it's better for us that the same image appears at different angles. This is mostly useful when you have a very small data set, like 15 images, 40 images, or fewer than 100 images; then you can start playing with the angle and the scale. We have degrees here and scale here. You can play with the size too, but I recommend keeping the size as it is and learning how to modify the rest a little at a time. I'm going to leave a link in the description to the documentation on what else you can do with it.

Now we set up the model we want to use. As we set above, it's the YOLO-NAS large one, L. We need to give it the number of classes, and you can just pass it the classes from the data parameters; you don't have to insert it yourself. The pretrained weights are from the COCO data set.

Next, the training parameters. This is the longest part of the training code, and not all of it is written by us; it comes from super-gradients. This is the part I talked about in the beginning, the part that adapts the neural network of the
model to get the best result possible, and I'm going to explain it as simply as I can: the key concepts, the most important parts you should look at before you start. The first is silent mode. This simply controls whether you see the progress bar or not. In my case I have a very low number of epochs, so it doesn't matter, but if you put in 100 epochs I don't think you need to see it, so set it to true and you won't see the bar. Of course, we want to get the best average model. Here we have something called warm-up mode, which is linear, and there is also something called warm-up epochs. I don't want you to get confused by this: when I started the training for the first time, I saw it doing more epochs than I asked for, which is normal, because the model can do some warm-up before the actual training starts. Here it's only three, so if you have 10 epochs plus three, it's 13, and so on. All this other stuff is about the network; it comes from the super-gradients documentation, which I'm going to leave in the description. You can leave it as it is; it's the best result you can get from here.

Then we pass the max epochs, which I already talked about: how many times our model goes through all our training images. Here we put the loss function. I also got this function from the documentation: we give it the static assigner setting, true or false, the number of classes we have, because the loss depends on the classes and I have 13 classes here, and here the reg max. We also give it the validation metrics list, which is provided by super-gradients too. Don't worry about it, but you have to know that these two, the validation metrics and the metric to watch, have to use the same number: if it's 50, it's 50.

Now the hardest part of the training code is done. Next you run the trainer we created and pass it all the stuff we
created above. The first thing the trainer needs is the model. Then we give it the training parameters we just talked about, then the train data and the validation data, which are the loaders we created above, and we pass them to the trainer. Finally, you run this cell. How long it takes depends on the amount of RAM you have, the epochs, and the batches. In my case I have the RTX 3070 and it took about 5 to 10 minutes per epoch. I have a batch size of 12 and five epochs plus three warm-up, so fewer than 10 epochs; it doesn't matter, honestly, but it took about one hour to finish the training, and I didn't touch my machine until it was done. If you have a better machine, you will probably get a better time, less than an hour. If you're using the Colab notebook, it might take much longer or less time than mine. Also, I have 180-plus images, so this depends on the number of images you have.

After it's done, we go to the checkpoints folder, which is the folder I created; I put it on my main partition. Open the checkpoints folder, then the project, then the last training run. The best model always has "best" in its name. We call models.get with the YOLO-NAS large model, the number of classes, and the checkpoint path, and we start using the best model. The first thing we do is test it to see the losses and all the other numbers here: the IoU loss and the normal losses. It's not a perfect model, and I'm going to show you why, because we're going to create a confusion matrix to test it, but it's all right. I explained here in short what each value does.

Then we get to the neat part we always wanted to reach: running a prediction with the model we created. You can grab any two images from online, put them in an array, and
pass these images to predict, give it the confidence level and the IoU threshold you want, and tell it to show the results. It's not super perfect, but remember this is a shallow training with fewer than 10 epochs. It's a decent result if you think about it; it's missing a few things here, but yeah, it worked. You can also get the prediction results for each image, like the label ID, the label name, the confidence level, and all that, using this code. It will not show up in an environment like the one I set up, because for some reason Visual Studio Code doesn't work well with it. If you switch to a plain Python environment it will work, but you also have to rerun the entire thing.

Let's do the same thing for a video. I got a couple of videos from the Pexels website. You just tell it which video you want to run and give it the path where you want the output saved, then tell the best model to move to the device, predict on the input video path, and save it here. It did give me this detection video. It detected only this one, and I understand why: the shallow training, and I also set the confidence level here a little high. But you can run it on different videos easily.

The last thing we're going to do is a confusion matrix evaluation for this model. We're going to use supervision and onemetric. I talked about supervision in a video, if you want to know about it and how to use it; it's a super simple library that does a lot for us. I'm creating the detections from YOLO-NAS and setting a threshold of 50%, and here I'm running prediction on the images we have. Then I pass it all to the confusion matrix, and here are the results for every class. If I were doing this for a real model, I would train for about 100 epochs. That is how you can train your own custom model and how you can collect your data set. If you want to know more about YOLO, or supervision,
or how to auto-annotate your images, check the links to the videos that I will leave in the description. I have created a very good video about YOLOv8 and how to train it from scratch, I also created a video about supervision and how to use it, where we created two projects with it, and I also did an auto-annotation video using Autodistill. Basically it goes through all the images, annotates them, and labels them for you without you having to do anything at all.

Thank you for watching. I hope you enjoyed it and learned a lot. I'm going to leave you a small gift in the description: three ideas for unique computer vision projects to which you can apply YOLOv8, YOLOv5, or YOLO-NAS. You already have the information on how to train your v8. I hope I earned a subscription and a like for this video. Thank you for watching, and see you in the next video.
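
Editor's sketch (not from the video): the captions describe each YOLO annotation line as the class label, then the box center x and y, then the box width and height, and stress that every label file must share its image's name. The snippet below is a minimal illustration of both points in plain Python. It assumes the usual YOLO convention that the four box values are normalized to the 0 to 1 range, and the function names, the `IMAGE_SUFFIXES` set, and the example numbers are hypothetical, chosen only for this example.

```python
from pathlib import Path
from typing import List, NamedTuple, Tuple

# Assumed image file types; adjust to whatever your data set contains.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


class YoloBox(NamedTuple):
    class_id: int    # index into the ordered class list (classes.txt)
    x_center: float  # box center x, as a fraction of image width
    y_center: float  # box center y, as a fraction of image height
    width: float     # box width, as a fraction of image width
    height: float    # box height, as a fraction of image height


def parse_yolo_line(line: str) -> YoloBox:
    """Parse one annotation line: 'class x_center y_center width height'."""
    class_id, x_center, y_center, width, height = line.split()
    return YoloBox(int(class_id), float(x_center), float(y_center),
                   float(width), float(height))


def to_pixel_corners(box: YoloBox, image_width: int,
                     image_height: int) -> Tuple[int, int, int, int]:
    """Convert a normalized center/size box to pixel (x_min, y_min, x_max, y_max)."""
    x_min = (box.x_center - box.width / 2) * image_width
    y_min = (box.y_center - box.height / 2) * image_height
    x_max = (box.x_center + box.width / 2) * image_width
    y_max = (box.y_center + box.height / 2) * image_height
    return round(x_min), round(y_min), round(x_max), round(y_max)


def images_without_labels(images_dir: Path, labels_dir: Path) -> List[str]:
    """List image files that have no label file with the same base name."""
    label_stems = {path.stem for path in labels_dir.glob("*.txt")}
    return sorted(
        path.name
        for path in images_dir.iterdir()
        if path.suffix.lower() in IMAGE_SUFFIXES and path.stem not in label_stems
    )


if __name__ == "__main__":
    # A box centered in a 640x480 image, a quarter as wide and half as tall.
    box = parse_yolo_line("0 0.5 0.5 0.25 0.5")
    print(to_pixel_corners(box, 640, 480))  # (240, 120, 400, 360)
```

Running a check like `images_without_labels` on the train, validation, and test folders before training is one way to catch the mismatched file names the captions warn about.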
Info
Channel: Codewello
Views: 2,825
Id: ZsQUXZvso4E
Length: 38min 15sec (2295 seconds)
Published: Fri Dec 29 2023