YOLO Object Detection Using OpenCV And Python || Python Project

Captions

Hi, you are in the course that will teach you to develop and deploy a custom object detection model with YOLO. You are watching the final output of the course, where we train our model to detect 20 different objects like persons, cars, buses, bicycles, airplanes, etc., all from scratch. With the help of this course you can detect almost any object in real time on your own. This course is a one-stop solution for anyone who would like to learn, develop, and deploy complete end-to-end custom object detection with their own hands. In this course you will acquire practical knowledge that will turn into a valuable skill for your future career. I am very excited that you are in the right place at the right time, so let me walk through the content of the course. First you will learn the theoretical concepts of YOLO and the metrics related to object detection. Then we will put all the concepts into practice: first, by collecting data from different sources; second, with the collected data, I will show you how to label the images. Then we will do the data preparation. With the prepared data and images we will train the YOLOv5 model. Finally, I will teach you to extract the detections and draw the bounding boxes on the images, and that too in real time. This course is well designed, and each section is a continuation of the previous one. Once you complete all the modules, you will be ready and comfortable to start doing projects on your own. Sounds great? Then what are you waiting for? Let's get into the course together.

We are going to see how to install Python. The first step is to open your browser, go to the search engine google.com, and type "python". Click on the first link, www.python.org, which is the official website of Python. Now you have entered the official Python website. In order to
download Python, click on Downloads and then on the latest available Python for Windows; in my case it is 3.9.6. Click on this button to download it. All right, you can see that I have downloaded the Python installer to the desktop. Now, to install it, right-click and click on Run as administrator. Click Yes. Make sure you check "Add Python 3.9 to PATH"; 3.9 is the version I am installing now. Now click on Install to run the setup, and wait a couple of minutes until the process completes. All right, setup was successful and you have installed Python; close it. To make sure Python is installed, we will test it in the command prompt. Go to search, type cmd, and click on Command Prompt. Now type python. As soon as you execute this command you can see Python 3.9.6 and this message, which means that you have installed Python and opened the Python shell successfully. So this is how you install Python. Now type exit() to close the Python shell.

Let's install the virtual environment. First you need to go to your current working directory. I am in this folder, which is YOLO object detection and notes, and this is my current working directory, where I am going to create all my folders and code. In this location I am going to create a virtual environment. For that you need to open your terminal, or the command prompt on Windows: you can click on this path and type cmd, which opens the command prompt, and you can notice that the path shown is the path of your current working directory. To create the virtual environment, type the following command: python -m venv, where venv
stands for virtual environment. Now we need to provide the name of the virtual environment; let me name it yolo_venv. Now press Enter. It will take a few seconds to create the virtual environment, so please be patient while the process completes. All right, we have successfully created the virtual environment. Let's see how to activate it. To activate the virtual environment on Windows, type a dot and a backslash, then the name of the folder, which is yolo_venv, a backslash, Scripts, a backslash, and activate, that is, .\yolo_venv\Scripts\activate. Press Enter, which will activate your virtual environment. You can notice (yolo_venv) in the prompt, which indicates that we have successfully activated our virtual environment. If you are using Mac or Linux, use the command given here: source, followed by the name of the virtual environment, followed by bin, followed by activate, that is, source yolo_venv/bin/activate.

Let us also install the required packages for this course. First minimize this, and in this folder let me create a new text document and name it requirements.txt. Open it and write the libraries that you want to install. In my case I want to install numpy, pandas, matplotlib, and opencv-python. The next library is Jupyter Notebook; Jupyter is the IDE that we are going to use throughout the course. The last library is labelImg. labelImg is a powerful annotation tool for labeling images, and more importantly it is open source. That's it; these are all the libraries that we are going to use in this course. Let's save and close requirements.txt. Now open the terminal or command prompt and type the following command, which will install all the requirements in the
requirements.txt, which is pip install -r requirements.txt. Press Enter, which will install all the packages listed in requirements.txt.

We will now discuss the precautions to take during data preparation. Data preparation is a very important task that directly affects the performance of the model, so we always need to double-check while doing the data collection and data labeling. Here are the dos and don'ts of data collection. There are three major rules to follow while collecting data. First, it is always recommended to use high-definition images; if you are not able to get HD images, make sure the height and width of the image are at least 500 pixels. Second, always avoid blurred images. Third, avoid images where the background behind the object is cluttered. Let us discuss them one by one. First, high-resolution images. What I mean is this: the picture on the left-hand side is a good-resolution image, whereas the picture on the right-hand side is a bad image, because its width and height are small, while the good image has a large height and width. It is always good to choose high-resolution or high-definition images. The second point is that we should avoid blurred images. As you can see in this picture, the objects, that is, the cars or vehicles, are blurred; it is not recommended to select such images. Next is cluttered background images. This is a bad image: you can see that the background is completely disturbed and cluttered, so avoid such images. Suppose you have many objects in the image; this is a good image, where you can clearly see
the objects, for example the person and also the cars, and the background is also pretty clear. So we always need to select images like this. Here I have given you a list of websites where you can download high-quality images. The first one, obviously, is Google, and the next ones are Flickr, Unsplash, and Pexels. These are the four websites where you can download high-quality images for free.

Okay, let me also look into the dos and don'ts of labeling. Labeling is one of the most important steps in data preparation, and while labeling it is always good to follow certain rules. Good labeling is like the picture on the left-hand side; the other two pictures shown, this and this, are bad examples of labeling. For example, in this image you can see that our bounding box completely covers the object, whereas here the box does not cover the whole object. Whenever you have an object, it is always recommended that the bounding box cover it completely. Another example of bad labeling is something like this: the object is actually in this region, but if you label it like this, it will mislead our model. So it is always good to do the labeling as shown on the left-hand side.

In this lesson let's start the project, that is, building the custom object detection model with YOLOv5. Step number one is to collect the relevant data for object detection. As I said in the previous lesson, we are detecting 20 objects, that is, person, car, chair, bottle, sofa, bicycle, etc. If you have your own objects, make sure you collect images of those only. All right, let's begin by collecting the data. Let me open a folder; in the previous lesson we created the virtual environment in this folder. So here, what I'll do:
let me create a new folder and name it "1 data preparation", and inside that create one more folder and name it data_images. Whatever images I collect, I am going to put into this folder. All right. To collect the images, as I said in the previous lesson, I have given you the list of websites where you can download free images, like Flickr, Google, Pexels, etc. Let's open the browser and download the images. First go to Google and type whatever website you want; my personal favorite is Flickr. Click on Flickr, and here you can download whatever images you want for free. Let's say I want to download some images of cars: type "car", and you can find high-definition images of different cars. You can simply download them. For example, this one looks really nice and has all sorts of cars. You can download this image by clicking on Download, and make sure you download a high-definition size, at least Large, which is 1024 by 657. Then put it into your current working directory: cut it, paste it here, and name it, let's say, 01.
Okay, all right. You can also see some suggestions like this, and you can download this one too. This is too large, so let me download a medium size. Make sure you cut it, paste it into your respective directory, and rename it 02. Like this you need to repeat the whole process and download as many images as possible. It is recommended to download at least 500 to 600 images per object. For example, if you want to train for cars, you need to download at least 500 similar images of cars. This is a manual process, and you need to spend as much time as possible here on data collection. Make sure the images you collect are high definition. There are also a few websites where you can download entire datasets for free, such as Pascal VOC. Go to Google, type "Pascal VOC dataset", and click on the first link, which is the Pascal Visual Object Classes homepage, where you can download the different Pascal VOC challenges from 2005 to 2012.
If I click on the challenge of the year 2012, you can find the data, the detections, etc. This is the original image, and they have also labeled it for you. If you click on this, you can see that this is the original image and this is the label; as you can see, this is the head, hand, foot, etc. Everything is labeled, and you can use these images too. All right, this is how we get as many images as possible. You can also download from the resources the images that I am working with. I have downloaded images for all the objects; here are all the downloaded images, and you can find them in the resources.

Now the next step is labeling, which means we need to locate exactly where our object is in the image. For example, in this image we need to locate the object, which in this case is a car. That's exactly what we're going to do now. Remember that this is a completely manual process with no automation involved, so you have to pay attention while doing it. If you remember, we already installed the virtual environment and all the required packages, so let's first activate the virtual environment. Click on this and type cmd, which opens the command prompt in the respective directory. Next, activate the virtual environment by typing the following command: a dot and a backslash, then yolo_venv, Scripts, and activate, that is, .\yolo_venv\Scripts\activate. This will activate our virtual environment. Now type labelImg, which is the annotation tool we are using in this course: type labelImg, with a capital I, and press Enter. All right, this opens the annotation tool, whose UI looks something like this. Let me minimize this a bit. Cool. Here you have all the options. So the first step, what we have to
do is open the directory. Click on open directory and navigate to the folder where the images are; in my case it is the data_images folder, and this is the list of images I have. Select this folder, data_images, and click on Select Folder. Cool, here we go: on the right-hand side we can see the list of images that we have. Now let's do the labeling. One thing to remember is the format setting: by default it is Pascal VOC, so make sure it is set to Pascal VOC only. Now click on the create rectangle box and locate the object wherever it is. For example, this is my car: draw the box by dragging and dropping. Now you need to type the label, car, and press OK. Similarly, do the rest: click on the rectangle box and mark the yellow car, which is this one; this is also car, so click OK. Again click on the rectangle box and select this car; this is car, click OK. This is also car; OK. If you cannot see the complete car, like this one, it is not required. That's it. Now we need to save: click on this, and it will save in the same directory as 001.xml. One thing to remember: do not change the name of the file; whatever file name we got, keep it as it is. Now click Save. That's it. Next, click on the next image and do the same: this one is a car, this is a car, and this one is also a car. Now click Save. Click on the next image: car, and click Save. Click on the next image, and here we have something else. The first one is this, a person; another person, one more person, one more person, and this is the horse. That's it; now click Save. So like this, what you need to do is keep
on repeating the process: wherever you find an object, mark the rectangle box and tag the name of the object to it. That's what you need to do for all the images you have. This is how we do the labeling. Remember that it is a completely manual process, and even a small error will cost you precision in the model, so make sure you draw the bounding boxes of your objects accurately. I have labeled all the images I had; you can download the images along with their XML files from the resources.

Okay, what exactly is in the XML file? The XML file stores the information about the objects along with the file name. Let me open it and look. You can open the XML file with your default text editor or simply with Notepad; I'll open it with Notepad. Here you go: you can see that there are some tags: annotation, and then the folder tag, filename tag, path, source, etc. The filename tag stores the name of the file, and the size tag stores the width and height of the image. Next there is an object tag, and this is what we are looking for. The object tag stores the bounding box information, namely xmin, ymin, xmax, and ymax, which represent the bounding box, and above this you can see the name tag, which indicates that the name of the object is car. Since there are multiple cars, or you can say multiple objects, in our image, we have multiple object tags. So this object tag represents a car, and here are its xmin, xmax, ymin, and ymax positions. All right, that's exactly what we are storing in the XML file. Now what
we will do in this lesson is first load all the XML files, and from each of them extract the size, the name of the file, and the bounding box information, that is, xmin, xmax, ymin, and ymax; finally we will also save the name of the object. That is the task for this lesson. Obviously we need to do this in Python, so let's open Jupyter Notebook and write some code to extract all the bounding box information of the images. First, let me open Jupyter Notebook: type cmd here, and it will open the command prompt with the path already set. Now type jupyter notebook, which will open Jupyter Notebook. Here we go, I am in Jupyter Notebook. Let me create a new notebook: click on New, Python 3. Now name the notebook "01 extract object info from XML". All right. To extract all the object information we need to import a few necessary libraries. Let's import os, and next from glob import glob, and import pandas as pd, which is used for the data analysis. I am also going to use the reduce function, which is essential here: from functools import reduce. And to read the XML files I am going to use xml: from xml.etree import ElementTree as et.
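The imports dictated above can be written as one notebook cell. This is only a sketch of that cell, assuming pandas has been installed from requirements.txt; the alias et follows the narration.

```python
import os                                  # path handling
from glob import glob                      # collect files matching a wildcard pattern
from functools import reduce               # used later to flatten nested lists
from xml.etree import ElementTree as et    # parse the Pascal VOC XML files

import pandas as pd                        # third-party: tabular data analysis
```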
Okay, now execute it. It is functools... done. We have successfully imported all the necessary libraries. The next step is to load all the XML files that are in the data_images folder: load all XML files and store them in a list. So xml_list is equal to glob, and here I need to provide the path in which all the XML files are present, which is data_images, and then a wildcard pattern so that I can pick up all the XML files: first a star, which indicates everything, ending with .xml. With this pattern I get all the XML files in a list. Let me print xml_list, and here we go: all the XML files are stored in a list. Cool, right? Next I will do some basic cleaning. Here you can see that there is a double backslash; this occurs when you are working on the Windows operating system. The remedy is to replace the double backslash with a forward slash. So, simple data cleaning: replace the double backslash with a forward slash. For that I can use a simple lambda function. The lambda function takes x, and once I have this x, I replace the double backslash with the forward slash. That's my lambda function. Now let me use map, followed by the function and the list of values, which is xml_list. Now I can convert every double backslash to a forward slash. Let me save the result into the same variable again. Execute this, and here you go: that's how we convert a path with double backslashes to one with single forward slashes. Cool, right?

Welcome back. The next step is to read the XML file, and from that XML file we need to
extract the file name, size, object, etc. So this is my step number two: read the XML files, and from each XML file extract, first, the filename; from size, the width and height of the image; and from object, the name, xmin, xmax, ymin, and ymax. This is the information we need to extract from the XML file. Let's say this is my XML file; using these elements we can simply extract that information. Before that, let's see how to load an XML file and how to parse it. First, let me define tree equal to et, which is the ElementTree I already imported in this step: et.parse, and we need to provide the file. For the sake of simplicity I will show you how to read one XML file; this is my XML file, and I need to parse it. The next step is root equal to tree.getroot(). Once you do this, we are able to extract all the information. Let's execute this and print root, and what you can see is that we are in the element named annotation. Let me open my XML file, and here we go: we are in the annotation element. From this we can simply call an element, and using that we can extract the text from it. Let's first extract the file name. Let me delete this. We can extract the file name by taking root and, from this root, using find with a path; the path is nothing but the tag, or element. Let's say this is my filename; it should match exactly. This is the filename I want to extract: copy it and paste it here. Now, to get the text inside it, you can simply add .text. Press Enter, and we can
see that we are able to extract the text. Let me save this information as image_name. Similarly, I want to extract the width and height of the image. Let me open the XML file: the width and height of the image are inside size. So we need to find size first, and then, inside size, find the width and height separately. Let's see that: width is equal to, from the root, find size first, and once I find size, put a dot and find again, this time the width of the image. Now, to extract the text, add .text. Let's print width. Execute this: it is 1024. Let's check; yes, exactly, we got 1024. Similarly let's extract height, which is also inside size, so I need to find height there. Now let's print width, height. Oh, we got an error; this should be width. It is 1024, 657, and that's correct. So this is how we extract the size. The next step is to extract the name, xmin, xmax, ymin, and ymax from the object tag. This is the tricky part: you can see that there are multiple objects; this is object one, this is object two, this is object three, and so on. So for this I need some kind of for loop. If we have a single element we can use a simple find; if you have multiple elements with the same name, we use findall. So, root.findall, and the name of the element is object. Let me put it in a variable, objects. Now execute this and print objects, and here we go: one, two, three, four; there are four objects. Okay.
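The parsing steps narrated so far can be sketched as follows. This is a minimal illustration, not the notebook's exact code: the XML string is a hypothetical, shortened Pascal VOC annotation with made-up box coordinates and only two objects, and it is parsed with et.fromstring so the example runs without a file on disk (the lesson itself uses et.parse on a file and then tree.getroot()).

```python
from xml.etree import ElementTree as et

# Hypothetical, shortened annotation; real files are read with et.parse(path).
SAMPLE_XML = """
<annotation>
    <filename>001.jpg</filename>
    <size><width>1024</width><height>657</height><depth>3</depth></size>
    <object>
        <name>car</name>
        <bndbox><xmin>50</xmin><ymin>100</ymin><xmax>220</xmax><ymax>200</ymax></bndbox>
    </object>
    <object>
        <name>car</name>
        <bndbox><xmin>300</xmin><ymin>120</ymin><xmax>480</xmax><ymax>260</ymax></bndbox>
    </object>
</annotation>
"""

# fromstring returns the root element directly (here: <annotation>).
root = et.fromstring(SAMPLE_XML.strip())

# find() returns the first child with the given tag; .text is its string content.
image_name = root.find("filename").text
width = root.find("size").find("width").text
height = root.find("size").find("height").text

# findall() returns every direct child with the given tag.
objects = root.findall("object")

print(root.tag, image_name, width, height, len(objects))
# prints: annotation 001.jpg 1024 657 2
```

Note that .text always returns strings, which is why the width and height come back as "1024" and "657" rather than numbers.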
From each and every object I need to extract the name, xmin, xmax, ymin, and ymax. Let's look at one object. To get the first object, let me take objects[0], and this is my obj. Now we are in the first object, which is this one. From this first object I need to extract the name, and inside it we have the bounding box, bndbox, from which I have to extract xmin, xmax, ymin, and ymax respectively. Let's see how to do that. The name can be defined as name equal to obj.find of name, and .text, which extracts the text from it. Next I want the bounding box information. Let's say bndbox, which I can extract with obj.find of bndbox. From this bounding box I need to find xmin, and to get the text I simply use .text; I name this xmin. Similarly for xmax: I can copy this and paste it here; this is my xmax. And for ymin and ymax: this is my ymin and this is my ymax. We can put this information in a list: name, xmin, xmax, ymin, ymax. Let me print it, and here you go: we have car and its xmin, xmax, ymin, and ymax. So this is how we do it for one object. We can do the same for multiple objects with a for loop: for obj in objects, and put everything inside the loop. Print this, and we get the bounding box information for all of them. Now we need to place this here. With this I can get all the objects, the image name, etc. In this list itself I will also put the image name and the width and height of the image; you can simply write image_name, width, and height. Let me name this parser, which is what I am actually extracting. Instead of that, let me do one thing: let me take parser as an empty list and
append all this information to it; I can simply do parser.append, and with this I place all the information in my parser. Let me finally print parser. Okay, height... done. So we have put all the information in the parser. Let me print it like this, and this is clean: it indicates that for this image, the width and height of the image are these, the object name is this, and the bounding box information for this object is this. So this is how we extract all the information about the bounding boxes. Now let me put everything in a function. This will take my filename, and from this filename I get the root, and from that I extract all the information. Let me name this function extract_text; it takes the filename and returns parser. Now execute this, and with this my function extract_text is ready. Let me apply it to all the XML files I have. That is going to be parser_all, and since I just want to apply this function to all the XML files, I can simply use map: the function to apply is extract_text, and the iterable is my list of XML files. Let me convert the whole result into a list and execute it. This will take a while, so please be patient. We have completed this process. Let me print parser_all. If you print it, we get the information in a nested, multi-dimensional format. To flatten it I can use the reduce function. For that I will use a lambda function that does simple addition: a lambda with two inputs, x and y, whose output is x + y. So here we use reduce with this function, the iterable is parser_all, and the result becomes my data. Let's execute this and print data, and here you go: we have the data in two-dimensional form, with rows and columns. The first column is the name of the image, then the width and height, the name of the object, and then the bounding box information. Now we can convert this into a DataFrame: df equal to pd.DataFrame, where the object is my data and the columns are filename, width, height, name of the object, xmin, xmax, ymin, and ymax. Execute it. With this we have converted the data into a DataFrame, and here you go: filename, width, height, name of the object, xmin, xmax, ymin, and ymax respectively. Now let me see the shape: we have about 15,000 rows and eight columns. Cool. Let's see how many objects we have, which is df of name, .value_counts(), and here you go: we have 5,447 objects labeled person, cars are 1,650, and so on. These are all the objects that we are going to train on in this project.

In this lesson we are going to understand how to prepare the labels for the YOLO model. In the previous lesson we created a DataFrame that contains the filename, the width and height of the image, the name of the object, and the xmin, xmax, ymin, and ymax positions, which is basically all the information for a bounding box. For YOLO, as we discussed in the previous lessons, we need center_x, center_y, w, and h, where center_x and center_y are the center position of the object, normalized to the width and height of the image. Similarly, w is the width of the bounding box normalized to the width of the image, and h is the height of the bounding box normalized to the height of the image. Let me explain this more clearly by considering an image. Consider the
image like this, where we have already drawn the bounding box and we have xmin, ymin and xmax, ymax. These are the two diagonal points, and based on them we can draw the bounding box; for example, in this case this is the bounding box. Usually the notation is the object name, then xmin, ymin, xmax, and ymax. Assume that we have an image with 500 rows and 300 columns, and the bounding box is represented as: car, which indicates that the name of the object inside the bounding box is car; xmin and ymin are 50, 100; and xmax and ymax are 220, 200. That is the information we actually have. We need to convert it into something like this: we require the center position of the object, named center_x and center_y, normalized to the width and height of the image respectively, and w and h, where w is the width of the bounding box normalized to the width of the image and h is the height of the bounding box normalized to the height of the image. The conversion formulas from xmin, ymin, xmax, ymax are the following. Take xmin plus xmax divided by 2, and if you normalize that by the width of the image, you get center_x. Similarly, center_y is ymin plus ymax divided by 2, normalized by the height of the image. The width of the bounding box, w, is xmax minus xmin divided by the width of the image, and h is ymax minus ymin divided by the height of the image. These are all the conversion formulas that we use. Once we calculate all this information, we need to store it in text format. The folder structure is something like this: we have the data_images folder, and inside it we create two folders, one for train and one for test. The train folder is used to train our YOLO model and
Test is used to validate our results. For example, we have images like 001.jpg, 002.jpg and so on, and for each and every image the respective information should be stored in a .txt file. For example, for the image 001.jpg, the respective bounding box information, the YOLO label, should be stored in 001.txt, and it should look something like this: we need to store the object name, center_x, center_y, w and h. All right, so like this we need to store all the information in the text file, and similarly we need to do it for the test images too.

All right, now let's go back to our code and do the conversion from the bounding boxes to the YOLO labels, and then we will split the data into training and testing. First of all, let's look at our data frame. This is the data frame, and let's also look at the info of it. If you look at the info, all the columns are of type object, meaning string. Actually, we require width, height, xmin, xmax, ymin and ymax to be either integer or float values, so let's first do the type conversion and then apply the formulas. All right, let's get started. The first step is the type conversion. Let's define the columns I want to convert: first width, next height, then xmin, xmax, ymin and ymax. Now we can do the type conversion simply by taking the data frame df with those columns, setting it equal to the same thing and calling astype with int. Let's look at the info again, and here we go: you can notice that we have successfully done the type conversion. For example, width was previously an object and now we have converted it into integer values. All right, now the next step is to apply the formulas and get the
YOLO labels. First let's define center_x and center_y. For center_x, you can look at the formula: we take xmin plus xmax divided by 2, and again divide by the width of the image. So first take df['xmax'] plus df['xmin'], divide this whole thing by two, and next do the normalization with respect to the width of the image, which is df['width']. We'll copy the same thing for center_y, which is simply ymax plus ymin divided by 2, divided by the height of the image. Now let's define the width, which is the width of the bounding box normalized to the width of the image: xmax minus xmin divided by the width of the image. So df['w'] is going to be equal to df['xmax'] minus df['xmin'], and then we simply divide by the width of the image to get the normalized width of the bounding box. Similarly, let's take the height, which is exactly the same except that we replace xmax and xmin by ymax and ymin and divide by the height of the image. Let's copy this and paste it here: this is my h, which is ymax minus ymin divided by the height of the image. Done. Now let's execute this and print the top rows of the data frame. Here you go: you can notice that we have successfully converted our bounding box information into the YOLO label information.

Now what I want to do is split this data, particularly the images, into train and test sets. Let's see how exactly we'll do that. Again, let me take df and take the file name column. Here you can notice that we have 15,663 rows, but we do not have that many images. Let's look at the number of images: let's name that images, which is df['filename'].unique(). Execute it. Okay, let's print the images, and let's
also look at the length of the images. Oh, sorry. Now, these 5,012 images I want to split into train and test: let's say 80% of the images for train and 20% for test. Now let's see how we can do that. All right, the first thing I want to do is shuffle the images. For that, let me take this images array and convert it into a data frame: let me name it img_df, which is pd.DataFrame of my images, and the column name I want to specify is filename. Okay, now let me display the top five rows, and we have all the images there. Now I want to split these images into train and test. Let me name it img_train, and here I will take a sample of 80 percent of the data. I can use sample and specify frac equal to 0.8, which will randomly select 80 percent of the images. So I'll write down here that it will shuffle and pick 80 percent of the images. Okay, now let me convert img_train into a tuple. In order to do that, let me take the filename column, and from this series I can get all the images in a tuple. Let me display this, and here you go: you get 80 percent of the images, randomly picked, in a tuple. Cool. Now the next step is to get the remaining 20 percent of the images as test. Remember that we have already picked 80% of the images from this data frame, and we need to take the rest of the images apart from those. For that, let me take img_df and write a query saying filename not in img_train; what it returns is my img_test. Okay, and in order to get all the images in a tuple, let me do the same thing we did for train.
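The image-level split described above could look roughly like the sketch below. The toy data frame stands in for the lesson's df. The random_state argument is my addition so the example is repeatable, and I reference the tuples inside query with pandas' @ syntax rather than building the query text as an f-string, which is what the lesson appears to do.

```python
import pandas as pd

# Toy stand-in for the lesson's df: one row per object, so file names repeat.
df = pd.DataFrame({
    "filename": ["a.jpg", "a.jpg", "b.jpg", "c.jpg", "c.jpg", "d.jpg", "e.jpg"],
    "name": ["person", "car", "person", "cat", "person", "horse", "car"],
})

# Split on unique images, not on rows, so that all objects of one image
# end up on the same side of the split.
images = df["filename"].unique()
img_df = pd.DataFrame(images, columns=["filename"])

# Shuffle and pick 80% of the images (random_state is an assumption).
img_train = tuple(img_df.sample(frac=0.8, random_state=0)["filename"])
# Whatever was not picked for training becomes the test set.
img_test = tuple(img_df.query("filename not in @img_train")["filename"])

# Use the two tuples to split the full per-object data frame.
train_df = df.query("filename in @img_train")
test_df = df.query("filename in @img_test")

print(len(img_train), len(img_test))
```

Splitting on file names rather than rows matters here because one image can contain several objects, and its label file has to hold all of them.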
This will take the remaining 20 percent of the images. Oh, we need to make the query string an f-string. Now execute this. Done. Now we have successfully got img_test, which is 20% of the images, and img_train, which is 80 percent of the images. Let's look at the length of this, and also the length of img_train. Here you go: this is my 80% of the images and this is my 20% of the images. Cool. Now let's split the data into training and test: train_df is going to be equal to the data frame with a query asking for filename in img_train, written as an f-string, and similarly test_df comes from the data frame with filename in img_test. Now execute this, and here we go: we have train_df.head(), which shows the rows used for training, and test_df.head(), which shows the rows used for testing. Cool, so this is how we can split the data into training and testing.

We are going to convert our object names into specific IDs. The reason we need to convert these object names into IDs is that in any machine learning or deep learning model, and YOLO is also a deep learning model, we cannot train on text. So we need to convert the names into numerical information; that process is called label encoding. What I will do is convert each label into a number, that is, assign numbers to the objects. Here I have written one function named label_encoding. I have 20 labels, so to each and every object name I assign an ID: for example, for person I assign 0, for car I assign 1, for chair it is 2, and so on. Like this I
assign all the object names to their specific IDs, which I created in a dictionary. Finally, I return the ID of that particular object name. So that's what we're going to do here. Once we have this, the next thing is to apply it to both our train and test sets. Let me take train_df and create the id column: train_df['id'] is going to be equal to train_df['name'], the column I need to apply this to, and then I apply the function, so .apply(label_encoding). Similarly, let's apply it to our test_df: test_df['id'] equals test_df['name'] with label_encoding applied. Now let's execute this. Oh, this should be my test_df. Execute it. All right, now we have successfully done that. Let's look at a sample of our data with head(10). Here we go: you can see the file name, and for this file name we have the name, which is horse, and the ID is 9; similarly, for person the ID is 0. Okay, so we have successfully assigned labels, and going forward we're going to use this ID for the training process. This is how we can assign numbers to the object names.

In this lesson we're going to see how to create the folder structure. In data_images we have all the images, and what we're going to do is create two folders named train and test. In the train folder we are going to save the training images and the respective label information in .txt files. Each .txt file will contain the object ID and the YOLO label information, that is center_x, center_y, w and h. Similarly, in the test folder we will save all the test images and the respective labels in .txt files. That's what we're going to do in this lesson. Now let's go back to our Jupyter notebook. All right, so we are in the Jupyter notebook now, and here we're going to see how
to save the images from train_df and test_df, along with the center_x, center_y, w, h and ID information. First, let's import a few important libraries. The first one is import os. Next, from shutil import move. Okay, now let's execute this. The next step is to define the train folder. What I mean is that in our data_images folder we need to create two folders, named train and test, and I need to move the images into the train and test folders respectively, as per my data frames. Okay, that's what I'm going to do here. So first let's define train_folder: I will create the folder inside data_images, and the name of the folder is train. Similarly, test_folder is equal to data_images and test. Now let's create a directory using os.mkdir with train_folder, and let me also create the other folder, which is os.mkdir of test_folder. Let's execute this, and with this we have successfully created the two folders named train and test. Let's have a look: you can see that we have the two folders, train and test. Let's go back to my code. All right, now the next step is to create an object so that we can easily save our center_x, center_y, w, h and ID information. For that, let me take train_df, and from train_df I am going to take the columns that are useful for me: the file name, next the ID, and then center_x, center_y, w and h. Let's copy this and paste it here: okay, this is center_y, this is w and this is h. Let me put these in a separate list, let's say cols, and create this list. Now let me take these columns, and what I want to do is a group by on filename. All right, so this will be
my group-by object for train. Similarly, let's do it for test: this is the group-by object for my test data, and I need to take the same columns and group by filename. Let's execute it. What exactly am I going to do with this group-by operation? Let me show it on sample data. Let's say this is my group-by object, and for example let's take the file name 009.jpg. If we look at this file name, we have all of this information, which means I need to save all of it in one text document and name that text document 009.txt. That's what I want to do here. So let me show you how it works on the sample. Let me take get_group, and if you pass the group key, that is 009.jpg, we get all of this information, which I need to save in the .txt file. We can use to_csv for that, but let's see how. If you want to remove the file name, you can call set_index with filename. Now you can simply use to_csv: let's say the output is sample.txt, and set index equal to False and header equal to False. If you execute this and go back to the home folder, you can see that sample.txt is created. If you open it, here you go: this is the information we have, which is the ID, center_x, center_y, and the w and h information. Like this, we have to save the information for all the images and put it in the respective folder. For example, this one is train, so it will be saved in the train folder. That is exactly what we're going to see in this lesson. Okay, so let me comment this out. Now let me write down the idea: I need to save each image in the train or test folder, and the respective labels in a .txt file. That's what I want to do here.
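Here is a small sketch of the sample step just described: group by file name, pull out one image's rows with get_group, and write them to a text file with to_csv. The data values are made up for illustration, and the file is written to a temporary folder rather than the notebook's home folder. I also pass sep=" " so the values are separated by spaces, which is the layout YOLO label files use; the lesson adds that separator when it writes the final function.

```python
import os
import tempfile

import pandas as pd

# Made-up rows standing in for train_df: two objects in 009.jpg, one in 0012.jpg.
train_df = pd.DataFrame({
    "filename": ["009.jpg", "009.jpg", "0012.jpg"],
    "id": [9, 0, 1],
    "center_x": [0.5, 0.4, 0.6],
    "center_y": [0.5, 0.6, 0.5],
    "w": [0.2, 0.1, 0.4],
    "h": [0.3, 0.5, 0.3],
})

cols = ["filename", "id", "center_x", "center_y", "w", "h"]
groupby_obj_train = train_df[cols].groupby("filename")

# All rows that belong to one image; filename moves into the index so
# that index=False drops it from the output.
sample = groupby_obj_train.get_group("009.jpg").set_index("filename")

out_path = os.path.join(tempfile.mkdtemp(), "sample.txt")
sample.to_csv(out_path, sep=" ", index=False, header=False)

with open(out_path) as f:
    label_text = f.read()
print(label_text)
```

Each line of the resulting file describes one object: the ID first, then center_x, center_y, w and h.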
Let me create one function and name it save_data. The arguments are filename, folder_path and group_obj. All right, first I'm going to show you how to move an image. For example, I have this image, which is in my data_images folder; let's say it is 009.jpg. This is a train image, and I need to move it into the train folder. For that, I'm using the move function, but let me first look at the syntax of move: we need to provide the source file and the destination file. All right, so let's define the source file, which is in the data_images folder, followed by the file name. Let me define it as os.path.join, where I join the folder name, which is data_images, with the file name that comes from the argument. Similarly, let's define the destination, which is based on the folder path: if I set this to train, the image needs to land in the train folder. So let me define os.path.join with folder_path and then filename. Okay, now we can provide the source and destination, and with this it will move the image to the destination folder. That's what is going to happen here. At the same time, I'm going to save the label information. How exactly am I going to save it? You have already seen the group-by object here; let's take this group object and use it. But before that, I need to define the name of the file to save. Basically, you need to save it with the same name: for example, if the name of the file is 009.jpg, it will create a text document with the
same name, that is 009, but the extension should be .txt. All right, that's what I'm going to do here. For that, let me create text_filename, which is the file name of the text document. I'm going to take filename, and remember that this filename contains the extension, something like .jpg, which I need to remove. For that I'll again use os.path.splitext, which separates the file name and the extension. You can look into this: it splits the extension from the path name, and index zero will be the name of the file. Now I'm going to add the extension, plus '.txt', with the simple concatenation operator. With this I get the text file name. Now let me take the group-by operation, something like this; let's copy this and paste it here. Here I'm going to take the group-by object, call get_group and pass the file name, and obviously I will set the index as filename. Next I need to save that with the respective file name, which is my text_filename, and then set index to False and header to False. All right, one more important thing I forgot to mention: in sample.txt we need to save the values without commas, meaning I need to provide one space between each and every value. To achieve this, there is the sep argument: by setting the separator to a space we can achieve this, so this is the space. Okay, one more thing: since I need to save my text file into the respective folder path, let me also add os.path.join, where I join folder_path with os.path.splitext of filename at index
0, plus '.txt'. That's what I need to do here. I think everything is good now, so let's execute it. Done. The next step is to apply this to all the file names we have. First, let me take the group-by object, and from this group-by object let me get the groups, which are all the groups I have. This is a dictionary, and if I take its keys, these are all the file names we need to apply the function to. Okay, so first let me convert that into a series, which is convenient for me: pd.Series of this, and this is my filename_series. Execute it, and we get the file name series. Stay with me. Now I need to take this file name series and apply the function, and this will do the operation. For the sake of simplicity, let me apply it to head(5), only five images; or let me apply it to only two images, that is 009.jpg and 0012.jpg, and if these two work fine then I'll apply it to all the images. So let's do head(2).apply, and the function we need to apply is save_data. Remember that save_data has three arguments: filename, folder_path and group_obj. The file name comes from the series, so the remaining two arguments are what we provide through args: the first is the folder path, which is my train_folder, and the next is the group object, which is my group-by object for train. Okay, now let's execute this. Done, we have successfully executed it. Let me go back to my folder and look in data_images, and here you go: you can see that 009.jpg and 0012.jpg, and also the .txt files, have been saved
here successfully. Okay, that's really cool, right? So let me cut these again and put them back here, and also delete these two text files. Okay, now go back and apply it to all. Let's minimize this, and instead of applying to head(2), let's apply to all. This process will take a while, so please be patient until the whole process completes. Now let's execute it. All right, the process has started, and you can see that my train images are being moved and simultaneously we're getting the .txt files. Please be patient until the process finishes. All right, we have successfully finished our process, so let's look at our folder. Here we go: in the train folder we have all the images and also the respective .txt files, which store the label information. For example, consider this image, 006725.jpg: this is my image, and the respective information is something like this, where the first value is the ID and the rest are the box values. ID 12 should be cat, so let's check whether our information is correct. 12 is cat, correct, which means we have followed the right process. Okay, now let's do the same for the test images: copy this and paste it here. Take filename_series_test, where the group-by object is the one for test, and get its groups' keys. Next we can save the information using the same function: filename_series_test.apply with save_data, where we need to provide the test folder and the group-by object for test. Okay, now let's execute this, and with that our test images will also get saved into the test folder. Let's see: this is my train folder, and you can see that the test images have also been saved into the test folder.
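Putting the pieces of this lesson together, the save_data function and the way it is applied to every file name might look like the sketch below. It follows the steps as described in the captions, but it is not the author's exact code: the src_folder parameter is my addition (the lesson hard-codes data_images as the source), the rows are made up, and empty files in a temporary directory stand in for real images.

```python
import os
import tempfile
from shutil import move

import pandas as pd


def save_data(filename, folder_path, group_obj, src_folder="data_images"):
    # Move the image from the source folder into the train or test folder.
    src = os.path.join(src_folder, filename)
    dst = os.path.join(folder_path, filename)
    move(src, dst)

    # Save the labels under the same base name with a .txt extension,
    # one object per line, values separated by spaces.
    text_filename = os.path.join(folder_path, os.path.splitext(filename)[0] + ".txt")
    group_obj.get_group(filename).set_index("filename").to_csv(
        text_filename, sep=" ", index=False, header=False
    )


# Toy setup in a temporary directory (assumption: empty files act as images).
root = tempfile.mkdtemp()
src_folder = os.path.join(root, "data_images")
train_folder = os.path.join(src_folder, "train")
os.mkdir(src_folder)
os.mkdir(train_folder)
for fname in ["009.jpg", "0012.jpg"]:
    open(os.path.join(src_folder, fname), "w").close()

train_df = pd.DataFrame({
    "filename": ["009.jpg", "009.jpg", "0012.jpg"],
    "id": [9, 0, 1],
    "center_x": [0.5, 0.4, 0.6],
    "center_y": [0.5, 0.6, 0.5],
    "w": [0.2, 0.1, 0.4],
    "h": [0.3, 0.5, 0.3],
})
groupby_obj_train = train_df.groupby("filename")

# The group keys are the file names; apply save_data to each of them.
filename_series = pd.Series(list(groupby_obj_train.groups.keys()))
filename_series.apply(save_data, args=(train_folder, groupby_obj_train, src_folder))

print(sorted(os.listdir(train_folder)))
```

Because move relocates the image instead of copying it, running the function twice on the same file name fails the second time. That is consistent with the lesson moving the two trial images back before applying the function to the whole series.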
Something like this. Let's wait a few seconds until the process finishes. Done. With that we have successfully split the data into training and testing images and saved our information in the respective folders. So this is my train folder and this is my test folder. All right, and if you want, you can delete the .xml files, or you can keep them in some kind of archive folder for further reference. It's better to keep them in a separate folder, so let me create a new folder and name it annotations. Okay, now select all the XML files and simply move them into the annotations folder. All right, we have put everything into this folder, and with that we have successfully finished the data preparation.

In this section we are going to train our model. Before training the YOLO model, we first need to create the YAML file. In that YAML file we need to specify the train and validation set locations; we will also specify the number of classes, and then the object names. So let's see how to do that. In the Jupyter notebook, in the same folder, click new, create a text file and name it data.yaml. Okay, now specify the train location: train followed by a colon and the location, which is the train folder in my data_images, so data_images, forward slash, train. Similarly, the validation set is in data_images/test. You can see that the test folder holds all the validation set information. The next thing is the number of classes: nc stands for number of classes, and in this project we are dealing with 20 classes. Then let me define the names: names, and create a list where you write down the objects that you have. I already have the list of objects with me, so let me copy and paste it here. Okay, which is
person, car, chair, bottle, etc. Okay, all right, so these are all the object names that we are going to use to train our YOLO model. If you are using custom objects, make sure you mention the names in a list, something like this. That's it. Now let's save this, and with this we have successfully created the data.yaml file.

We have seen the challenges that we generally face while training the YOLO model. The problem with the YOLO model is that we require fast GPUs, but fast GPUs are not everybody's cup of tea. That's the reason we are going to use Google Colab, which offers a free GPU. The point is that they offer the free GPU only for 12 hours, so we will make use of it effectively. Training our model on the GPU will take only about two to three hours, so we require the GPU for just two to three hours, which is sufficient for our problem. If you have a GPU and want to use your local environment, you can follow the same procedure that I explain here. All right, let's get started. The first thing we will see in this lesson is setting up Google Colab. For that, you have to log in to your Gmail account, then go to www.google.com, which is the home page of Google. Click on the apps and go to Google Drive. Here you go, I'm in Google Drive. What I want to do is install Google Colaboratory. For that, click on New, click on More, and click on Connect more apps. This will connect to the Google Workspace Marketplace. In the search, type Colaboratory, and you can find the Colaboratory app. Click on it and click on Install, then click Continue. It will ask you for the email account to which you want to install; click on it, and done, click OK. All right, now we have successfully installed Google Colaboratory. Let's open Google Colaboratory and set up our GPU. For that, first let me
create a folder by right-clicking and choosing New folder; name the folder YOLO training and click Create. Go to the folder, and inside it let me create the Colab notebook: right click, click on More, and click on Google Colaboratory. Wait a couple of seconds until the process finishes. We have successfully created the notebook, which is Untitled0. Let me rename it YOLO training and press Enter, then go back to my Google Drive and refresh the page; you can notice the change here. Now, this is Google Colaboratory, and it is very similar to the Jupyter notebook we worked with previously. Let's see how to connect Google Colaboratory to a GPU. Here you can see Runtime, and there is Change runtime type. By default the hardware accelerator is None, which indicates that you are connected to the CPU. Google offers unlimited time on the CPU, but for a GPU or TPU Google only offers 12 hours, so you can use these 12 hours of GPU time for this work. If you want to use the GPU for longer than that, you need to upgrade to Colab Pro; you can look at the plans that Colab offers. Anyway, I am using the free plan, which is my current plan, and I'm going to use it with the GPU for YOLO. Let me click on GPU and click on Save. Now, on the right-hand side you can see Connect; click on Connect. All right, we have successfully connected to the GPU. To confirm that, there is a RAM and disk indicator: if you hover the mouse over it, you can see it is connected to a Python 3 Google Compute Engine backend (GPU). Okay, so this is how we can connect to the GPU. When you are not using this notebook, go to Runtime, click on Manage sessions, and terminate the session. Whenever you are not using a notebook, it is always good to terminate it, because every
time you're using it, even during idle time, the GPU time is being used up: Google counts the idle time as well. So if you're not using the notebook, just click on Terminate so that you can save your GPU time for next time. All right, this is how we can set up Google Colaboratory and the GPU.

Let's see how to train our YOLO model. First and foremost, I'm going to use the Ultralytics YOLOv5 code on GitHub. They provide a whole bunch of functionality, and everything is ready and set up; we simply need to use it. First, let me clone this Ultralytics YOLOv5 repository into my Google Colab. For that, let me go back to my Google Drive and open YOLO training. Here, I first want to mount my Google Drive. Make sure your GPU is on, then click on Files, and here you can notice that there is a Mount Drive option. Click on Mount Drive and Connect to Google Drive. It will ask you for the Gmail account; click on it and wait a couple of seconds until the process finishes. All right, once you mount the drive successfully, you can notice a folder named drive. This drive is our Google Drive: you can expand it and see the folders in My Drive, including the folder named YOLO training. Cool, right? Now we have successfully connected. The next step is that I want to go to this particular path. To do that, let's first import os, and then os.chdir, which changes the directory; I will change the directory to this YOLO training folder. To get the path, I right click, choose Copy path and paste it here. Now execute this. Done. Now let's type ls, and you can see what we have in the folder: the YOLO training.ipynb notebook is
there, which is correct. Once you've done this, next we have to clone the Ultralytics YOLOv5 repository. For that, you can click on this Code button and look at the Git URL. Copy it, go to the Colab notebook and type the following: exclamation mark, git clone, and paste the URL. Now execute this and wait a couple of minutes until the process finishes. Done, we have successfully cloned the repository into Google Drive. You can go back to Google Drive and refresh the page, and you will notice a folder named yolov5, and inside it you can see all the code. Cool. Now the next step is to upload our images and also the YAML file that we created in the previous lesson. Let's open our current working directory in the local environment: we have data_images with train and test, and we also have the data.yaml file here. Okay, so first let's upload our data_images folder. To do that, right click inside the yolov5 folder, and you can see there is a Folder upload option. Click on Folder upload and go to that particular directory. Here you can see the data_images folder, and inside it we have both the train and test folders. Now go back, select this folder and click on Upload. The upload has started, so wait at least 15 to 20 minutes; please be patient during this process. I have successfully uploaded all the images, and it took around one hour for me to do this. You can notice that we have all the images in the train folder and the test folder, and we have the XML files in annotations. Anyway, we are not using these annotations; we are only interested in train and test. All right, so now let me go back
and open my YOLO training.ipynb notebook. Since I have not used this notebook in the past hour, I terminated it, so now let me rerun it. First, I need to connect to my GPU. As soon as you connect to the GPU, you can notice that your drive is mounted automatically. If you cannot see that the drive is mounted, then you need to mount your drive again. Okay, let me change my directory to YOLO training, which is under drive, My Drive, YOLO training. Make sure that you are in YOLO training, and now you can see that we have the yolov5 folder. Do not run the clone step again; it is not required because we have already cloned the entire repository, so let me comment it out. The next step is to install the requirements. In the yolov5 folder there is a requirements.txt file, and it lists the additional packages required to run the YOLO model. So now let me install the required packages. Since my requirements file is in the yolov5 folder, let me change the directory, again using os.chdir, to yolov5. Execute this, then run ls to make sure that you are in the yolov5 folder, and you can notice that the requirements.txt file is there. Now let's install the requirements using pip: exclamation mark, pip install -r requirements.txt. Okay, now Shift+Enter to execute this. It will take a while, so please be patient while the requirements are installed. All right, I have finished all the installations from requirements.txt. Now we move on to the training process, that is, training the YOLOv5 model. Okay, so in order to train the
YOLOv5 model, first and foremost make sure you have already uploaded the data images folder that we created; inside it we have the test and train folders. The next thing we need is the data.yaml file, which is the file we created earlier; you can see it here. This is the YAML file that training requires, so let me put it into the drive as well. You can simply drag and drop the file into the Google Drive folder and it will upload for you. We have done that successfully, so go back to the Colab notebook and refresh the file browser to see the change. Click on drive, MyDrive, the YOLO training folder, and yolov5, and you can see that the data.yaml file is there. Now we are ready to train our YOLO model with the following command. It starts with an exclamation mark and python train.py, so I am calling the train.py script. Next I provide the YAML file with --data data.yaml; in data.yaml we give the location of our train folder and test folder. Then I provide the model configuration with --cfg yolov5s.yaml. Finally, we provide the batch size with a double hyphen, that is --batch-size, and set it to 8.
This value is up to us. If you want to train more quickly, increase the batch size to 10 or 12, but do not oversize it, because that will hamper the whole Google Colab session. If you get a kernel died error, reduce the batch size and try again. I found that a batch size of 8 works well for training the model in Google Colab. Next, we need to name the folder where the results are saved, using --name. I am providing a folder called Model, and all my results will be saved there. Finally, provide the number of epochs with --epochs; let's set it to 50. With 50 epochs, training will take around two to three hours. If you increase the number of epochs, the training time increases as well, so keep the epochs low enough that the run can finish. Be careful when increasing the epochs, because Google Colab is pretty smart: if it sees the notebook sitting idle, it automatically shuts the notebook down. That is the reason I am setting the epochs to 50.
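To put the pieces of the command side by side, here is a small Python sketch that only assembles the training command as text; it does not start any training. The helper name build_train_command is hypothetical and exists only for this illustration. The flags are the ones named in the lesson, and the command is assumed to be run from inside the yolov5 folder.

```python
import shlex


def build_train_command(data="data.yaml", cfg="yolov5s.yaml",
                        batch_size=8, name="Model", epochs=50):
    """Assemble the argument list for YOLOv5's train.py (hypothetical helper)."""
    return [
        "python", "train.py",
        "--data", data,                    # YAML file with the train/test locations and class names
        "--cfg", cfg,                      # model configuration file
        "--batch-size", str(batch_size),   # lower this if the Colab kernel dies
        "--name", name,                    # results folder created under runs/train
        "--epochs", str(epochs),           # more epochs means a longer run
    ]


command = build_train_command()
# In a Colab cell the same command is typed with a leading exclamation mark.
print("!" + shlex.join(command))
```

Changing batch_size or epochs in the call is enough to see how the command line changes before committing to a multi-hour run.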
Press Shift+Enter to execute this, and the training process will start. Please be patient and wait at least two to three hours for it to complete, and stay with your notebook so that you can follow the whole run. I have now completed the training; it took nearly 2.12 hours to finish the 50 epochs, so 100 epochs might take around four hours. You can look at the summary of the results here. Averaged over all the classes, the precision is 0.755 and the recall is 0.648. The mean average precision (mAP) at an IoU threshold of 0.5 is 0.70, and the mAP averaged over IoU thresholds from 0.5 to 0.95 is 0.46. With that, I can say it is a reasonably good model. The precision, recall, and mAP for the individual objects are also printed here. For detecting a person, the mAP at 0.5 is 0.851, which is really good, whereas for a few objects like chair it is 0.44, which is poor. But overall the mAP at 0.5 is 0.7, which is a good result. Our results are saved under runs, train, and then the Model folder: inside the yolov5 folder there is a folder called runs, inside runs we have train, and inside train we have Model, where all the results are saved. Inside the weights folder you can see the models, which are saved in PyTorch format. Since I want to load the model in OpenCV, I need to convert it to ONNX format, so now let us export our model to ONNX. Let me go back to Colab, and here let's type the following to convert that
into ONNX. We call python with export.py. If you open export.py, you can see the different formats in which we can save our model: ONNX is the one used for OpenCV, and we can also save in TensorFlow formats and so on. Since I want to load my model in OpenCV, let me save it in ONNX format. The syntax is this: we type python and the path of the export file, provide the weights of the model we saved, and then add --include with torchscript and onnx. Let me copy this and paste it here. The first character should be the exclamation mark, and since export.py is in the folder I am working in, I will just give export.py as the path. The weights are in runs, train, Model, weights. There are two weights files there, best and last, and I am going to use the best weights. So the weights path is runs/train/Model/weights/best.pt. Then --include torchscript onnx gives the formats I want to save. Now let's execute this. Done: we have successfully saved our model in ONNX format. Let's look in the weights folder, and you can see the model there as best.onnx. Cool, so this is how we convert our model to ONNX. In the next lesson we are going to use it, but in this lesson let's first save it to our local directory, because the later lessons work from there. I am going to save the Model folder, where I have all the results. Let's go back; what you have to do is right click
and click on download. Navigate to our current working directory, where we have the data preparation folder. Let me create one more folder and name it 2_predictions, and save the download there. All right, we have saved it, and you can see the predictions folder with the file we downloaded inside. Let me extract it: right click and choose extract here. That's it, everything is extracted. Let me check, and here we go: we have all the results, and inside weights we have the best.onnx file, which we are going to use in the next lessons. Since we have finished training our model in the YOLO training notebook, it is good practice to terminate the notebook once you are done, so that you can use the remaining runtime for your next models. To terminate, click on Runtime, Manage sessions, and then Terminate. You can then safely close this window, and this one too. Back in the YOLO training folder you have the model, no kernels are running, and you can use the time whenever you run other models. In the previous lesson we saw how to save our Model folder. Inside it we have all the validation results, and in the weights folder we have the saved model. Let's look at the validation results. First, let's look at these images. They show how the labeling was done: for example, this is the bounding box for class ID 16, and so on. This one is the same kind of picture for batch 1, and this one for batch 2. We are mainly interested in these results, which are the validation results from the YOLO model. This picture shows the labels for batch 0, and this one shows the predictions for batch 0. Let's
look into that. This one is the actual result that we labeled manually, whereas this one is the result predicted by the model. Let me put them side by side: this is the actual and this is the predicted. If you zoom in, the first object is a train, and our model also predicted a train with a confidence score of 0.9. In the next picture there is a bunch of persons, and the model detected those persons, and even the bottle. This is my original and this is what the model predicted. Here is the cat picture: there are many cats labeled, but the model predicted only two of them. Let's look at the sofa as well; the model was unable to detect it. In the next one, the original has labels for a car and a few persons, and in the prediction the car is there and a few cars are detected as well. Cool, right? So that is the information we have here: this is the label and this is the prediction. These are the batch 1 results: this is the batch 1 true values, and this is what the model predicted. Our YOLO model seems pretty accurate. There are still slight mismatches, but overall the predictions are very good. So that is about these pictures. There is one more image, the confusion matrix, which is very important. It shows how well each label is predicted. We have 20 classes, and it gives a score for each label: for person it is 0.84, for car it is 0.84, for chair it is 0.46, for bottle it is 0.5, and so on,
so that's what it represents. From this confusion matrix we can say that our model is good overall, whereas it is slightly weak in these regions: for chair, bottle, potted plant, bird, and so on, the predictions are a bit low, and for the rest they are nice. That is the information in the confusion matrix. These three curves represent precision and recall. This one is the precision curve: the x-axis is the confidence score and the y-axis is the precision, and as the confidence score increases, the precision also increases. This one is the recall curve: the x-axis is the confidence and the y-axis is the recall. The most important one for us is the precision-recall curve, which shows how the two balance. The red line represents the precision-recall curve, and from it you can see that on average we have more than 70 percent precision, while the recall comes in under 70 percent. That is how to read the curves and the evaluation. One more important piece of information is results.png, which shows the loss across each epoch. In the first picture, over the 50 epochs the training loss keeps falling, which indicates that if you ran for more epochs, say 100, the training loss would probably come down further. The validation loss is also decreasing as the epochs increase. So the plot suggests running the model for more epochs to get better accuracy. The other plots say the same thing; for the object loss it is the same, but
the validation curve seems fine, I think, at least for this picture. For classification, the training loss decreases as the epochs increase, and the validation classification loss decreases as well. As for precision, as you train for more and more epochs, both precision and recall seem to increase. One thing you can take from these results is that there is still scope to improve the model's precision by training for more epochs. Anyway, I will stop at 50 epochs and test with this model, because I am satisfied with the results. If you want more precision, you can run for more epochs, say 100. All right, let's close this; that is the information we have in the model evaluation. We have successfully trained our YOLO model in Google Colab and saved it to our desktop. Here I have the Model folder with everything saved in it, and the weights folder is where the models are. I am going to use only the best.onnx model, because that is the one we can load in OpenCV. I am going to test my predictions with two inputs: one is the street_image, and the second is this video. Let me open the street image. In it we can see a bunch of objects like persons, cars, buses, and so on. So let's first test with this image and see how our YOLO model detects the objects in it. Next, we will also test our model on this video, which contains lots of objects like persons, bicycles, cars, and so on, and we will see how accurately the model works on it. That is what we are going to see in this section. Let me close this. Now let me open
the Jupyter notebook. First, let's activate our virtual environment: open the command prompt by typing cmd, then type the command that activates the environment, which is the name of the virtual environment, yolo_venv, followed by Scripts and activate. Next, I need to navigate to my predictions folder, so cd 2_predictions. Now we are in the predictions folder, and here let's open Jupyter Notebook. All right, Jupyter Notebook is open. Let's create a new Python 3 notebook and name it YOLO predictions. Now the notebook is ready, so let's import the necessary libraries. The first thing we require is OpenCV (cv2); let's also import numpy as np and import os. We also need yaml, because we are going to read a YAML file: import yaml, and from yaml.loader import SafeLoader. Execute it. Oh, I think we have not installed yaml yet. Let me install it with an exclamation mark followed by pip install pyyaml. In all cases, use pip install pyyaml. Now execute it. All right, we have installed yaml, so let me comment this line out. Let me do one thing: open View, Toggle Toolbar, and move this cell above the imports. Now execute the imports, and yaml is imported successfully. Cool. In predictions we do four important things. The first step is to load the YAML file and the YOLO model. The next step is to load the image and get the YOLO predictions from it: we take one image, pass it to the YOLO model, and get some detections or predictions. All
right, and after that we do non-maximum suppression, which is a very important step. Since we are going to get multiple bounding boxes for the same object, we use the non-maximum suppression filter to make sure we keep the correct bounding box. Our final step is to draw the bounding boxes. So these are the four important steps in this section. Without further ado, let's get started. The first step is to load the YAML file. Remember, all my information is in data.yaml, and from it I want to take the names. To load it, use with open, provide the name of the file, data.yaml, set the mode to read, and bind it as f. Then data_yaml is equal to yaml.load, passing f and setting the Loader to SafeLoader. This gives us the information that was in data.yaml, and from it we take the names, which is exactly what we require: labels is data_yaml with the key names. That's it; let's print our labels. Execute this, and here you go: these are all the labels we trained our YOLO model on. In the next lesson we will see how to load our YOLO model using OpenCV. We have successfully loaded the YAML file; now let me also load the YOLO model. Let me insert one more cell for that. To load the YOLO model I am going to use the OpenCV DNN module. Let me define yolo, the variable that will hold the model, as cv2.dnn.readNetFromONNX. We have to use the ONNX reader because we saved our model in ONNX format.
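Going back to the YAML step for a moment, here is a minimal sketch of reading the class names with PyYAML. The file contents below are a made-up stand-in with three classes, written to a temporary folder so the example runs on its own; the course's real data.yaml has 20 names and its own paths.

```python
import os
import tempfile

import yaml
from yaml.loader import SafeLoader

# Hypothetical stand-in for data.yaml; paths and names are for illustration only.
sample = (
    "train: data_images/train\n"
    "val: data_images/test\n"
    "nc: 3\n"
    "names: ['person', 'car', 'chair']\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.yaml")
    with open(path, mode="w") as f:
        f.write(sample)

    # Same pattern as in the lesson: open in read mode, load with SafeLoader.
    with open(path, mode="r") as f:
        data_yaml = yaml.load(f, Loader=SafeLoader)

labels = data_yaml["names"]
print(labels)
```

The order of labels matters later on, because a class ID taken from the model output is used as an index into this list.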
That's the reason we use it. Now give the path to the model: our model is in the Model folder, then the weights folder, and the file is best.onnx. This is the model we are going to use. Next, we need to set the backend and the target. Even though we trained the model in a GPU environment, here we are running on a CPU, so we need to say explicitly that the target is the CPU. First, yolo.setPreferableBackend, and the backend should be OpenCV, which is cv2.dnn.DNN_BACKEND_OPENCV. Then set the compute engine with yolo.setPreferableTarget, and here I set cv2.dnn.DNN_TARGET_CPU. If you have a GPU environment, you can select CUDA, but since I only have a CPU, I am selecting the CPU. Now execute this. Done. Now let's load an image and get the predictions from YOLO. The image I am going to use is street_image.jpg. First, load the image: img is equal to cv2.imread, with the path of the image, street_image.jpg. Next, I copy it into a variable: image is equal to img.copy(). Now let's display the image with cv2.imshow, giving the window name image and the variable image, followed by cv2.waitKey(0) and cv2.destroyAllWindows(). Execute this and it opens my image, so this is the image we are working with. Let me close it. I do not need the display code right now, so let me comment it out, or you can remove it. All right, so once we have the image and once we
understand which image we are using, let's get the rows and columns of the image. The rows define the height of the image and the columns define its width, and we can also get the depth, all from image.shape. So row, col, d is equal to image.shape: row is the height, col is the width, and d will be 3, since it is a three-channel color image. That's the information we need. Before doing the YOLO predictions, we first need to convert the image into a square matrix. Your image may be rectangular, but we need a square array. One way to do that is to create a dummy matrix and overlay our image on it. For that we define a square matrix whose side is the larger of the rows and columns. So first take whichever is larger: max_rc is equal to max of row and col. Once I have max_rc, let me create the new input image. It is a simple image where all the values are zeros, which you can create with np.zeros. The shape of the array is max_rc by max_rc, which gives the square matrix, by 3, so it is a three-dimensional array. We also need to specify the data type as unsigned 8-bit integer (uint8), which is the usual image type, with values between 0 and 255.
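The square-canvas idea can be tried with NumPy alone. This is a minimal sketch that uses a synthetic 360 by 640 gray array as a stand-in for the image read from disk (the real lesson uses street_image.jpg), and it also performs the overlay onto the black canvas mentioned above.

```python
import numpy as np

# Stand-in for the array returned by cv2.imread: 360 rows, 640 columns, 3 channels.
image = np.full((360, 640, 3), 127, dtype=np.uint8)

row, col, d = image.shape      # height, width, number of channels
max_rc = max(row, col)         # side length of the square canvas

# Black square canvas with the same dtype as an ordinary 8-bit image.
input_image = np.zeros((max_rc, max_rc, 3), dtype=np.uint8)

# Overlay the original image in the top-left corner; the rest stays black.
input_image[0:row, 0:col] = image

print(input_image.shape)
```

Placing the image in the top-left corner keeps pixel coordinates unchanged, so a box found in the square image has the same left and top values in the original.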
You can look at it with cv2.imshow, with the window name input_image and the array input_image, then cv2.waitKey(0) and cv2.destroyAllWindows(). Let's see it: it is a completely black image, and it is a square matrix. Now on this black image we need to overlay our original image. We can do that with simple indexing: take input_image, select from 0 to row along the rows and from 0 to col along the columns, and fill that region with image. Execute this, and you can see the rectangular image placed in the square one: the first rows are filled with the image and the rest stay empty, which makes it the square matrix we need for the YOLO model. So this is how we convert our image into a square matrix. Let me write down the steps. Step 1 is to convert the image into a square image. Step 2 is to pass this square array to the YOLO model and get predictions from it. To get predictions, we first need to compute the blob from the image: blob is equal to cv2.dnn.blobFromImage. You provide the input image, then the scale factor, which normalizes the pixel values by 255, so it is 1/255. Next we set the size. Remember that we trained our YOLO model with an input width and height of 640 by 640, so let me define INPUT_WH_YOLO, the input width and height of the YOLO model, as 640.
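To get a feel for what the blob contains, here is a NumPy-only sketch of roughly what the blob step produces when the square image is already 640 by 640, so no resizing is involved. It is an illustration under that assumption, not a replacement for the OpenCV call; the random array stands in for the padded image, and the channel swap corresponds to the swapRB flag that the lesson sets right after this.

```python
import numpy as np

INPUT_WH_YOLO = 640

# Stand-in for the padded square BGR image, already at the network input size.
rng = np.random.default_rng(0)
input_image = rng.integers(0, 256, size=(INPUT_WH_YOLO, INPUT_WH_YOLO, 3), dtype=np.uint8)

# With OpenCV installed, the call built up in the lesson is along the lines of:
#   blob = cv2.dnn.blobFromImage(input_image, 1/255, (INPUT_WH_YOLO, INPUT_WH_YOLO),
#                                swapRB=True, crop=False)

rgb = input_image[:, :, ::-1]                       # BGR -> RGB (what swapRB=True does)
scaled = rgb.astype(np.float32) / 255.0             # scale factor 1/255 puts values in [0, 1]
blob = scaled.transpose(2, 0, 1)[np.newaxis, ...]   # height, width, channel -> batch, channel, height, width

print(blob.shape, blob.dtype)
```

The leading 1 in the shape is the batch dimension: the network receives a batch containing a single image.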
Let me pass this as the size: INPUT_WH_YOLO for the width and INPUT_WH_YOLO for the height. Next, since OpenCV reads images in BGR format, we need to swap the red and blue channels, so set the swapRB flag to True, and set crop to False. Then take the YOLO model and set the input to the blob with yolo.setInput(blob). Now we pass the blob through the neural network to get the output: preds is equal to yolo.forward(). That gives us the detections, or predictions, from the model. Let's execute this and print preds. It is a very big array, so let me look at its shape: it is (1, 25200, 25), that is, 25,200 rows and 25 columns. The 25,200 rows are the candidate bounding boxes produced by the YOLO model for the image, and the 25 columns hold the information for each bounding box. These 25 columns can be split into two parts. The first part is the first five columns. The first four are center x, center y, w, and h, the same kind of values as the labels we created in the previous lesson: center x and center y are the center point of the bounding box, and w and h are its width and height. In the model output these are on the scale of the 640 by 640 input, which is why we rescale them to the image size later. The next one
is the confidence score of detecting the bounding box, which is also very important. So the first five columns give the basic information about the bounding box. The next 20 columns are the classification scores for each class: for a given class, the score is the probability that the object belongs to that class. Why 20? Because we trained our model with 20 objects. If you train your model with 100 different objects, you will get 100 additional columns along with the five. That is the reason we have 25 columns. So this is what we got from the YOLO model: 25,200 detections with 25 columns each, where the columns hold the bounding box information (center x, center y, w, and h), the confidence score of detecting the bounding box, and the 20 class scores. Remember one thing: among these 25,200 detections there are duplicates, which is why our next step is non-maximum suppression. We will apply non-maximum suppression to remove the duplicate detections, and while doing so we will keep only the bounding boxes that have a good confidence score and a good probability score. The prime goal of non-maximum suppression is to remove duplicate bounding boxes. So we will filter the data first by confidence score and then by probability score, and after that we will apply non-maximum suppression, which is available directly in OpenCV. All right, that's what we will
do now. Step 1 is to filter the detections based on the confidence score and the probability score. Let me set the confidence threshold to 0.4 and the probability score threshold to 0.25. It is up to you: if you want to get more bounding boxes, reduce the confidence or probability threshold, and if you want more accurate detections, increase the confidence threshold, maybe to 0.6 or 0.7. While filtering, we will take out the values center x, center y, w, and h and convert them back to coordinates in the original image, like x_min, x_max, y_min, and y_max; we will see that here as well. I am going to create three lists to hold the extracted information: first the bounding box information, next the confidence scores, and then the probability scores. Here you can see that preds has shape (1, 25200, 25), so let me flatten it. One way is to take the first element: detections is equal to preds[0]. That just converts it into rows and columns; if you look at the shape, we have 25,200 rows and 25 columns, which is a convenient form for these operations. Now I want to extract the boxes, meaning center x, center y, w, and h, so boxes is an empty list. Let me also take confidences as another empty list, and classes as another empty list. Those are the three pieces of information I am going to collect. In the next step we take this input image, and from this input
image I calculate the x factor, meaning the factor by which I multiply the bounding box values to get the bounding box for this image. For that, let me get the width and height of the input image: image_w, image_h is equal to input_image.shape, taking only the first two values. Next we calculate the factors: x_factor is equal to image_w divided by the input width of the YOLO model we trained, INPUT_WH_YOLO. Similarly, y_factor is equal to image_h divided by INPUT_WH_YOLO. Since the input image is a square matrix, x_factor and y_factor will usually be the same. Now we filter the detections, first by confidence score and second by probability. We have 25,200 detections, so let me write a loop: for i in range of the length of detections. For each detection we take the row: row is equal to detections[i], which gives us the entire row. From this row we take the confidence. We can count the positions: 0, 1, 2, 3, and 4.
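The loop started above, carried through the class-score filter and the box rescaling that the lesson walks through next, might look like the sketch below. The three-row preds array is made-up test data standing in for the real (1, 25200, 25) output, and the 1280 by 1280 blank array stands in for the padded image; the thresholds 0.4 and 0.25 are the ones chosen in the lesson.

```python
import numpy as np

INPUT_WH_YOLO = 640
input_image = np.zeros((1280, 1280, 3), dtype=np.uint8)   # stand-in padded square image

# Made-up detections: columns are cx, cy, w, h, confidence, then 20 class scores.
preds = np.zeros((1, 3, 25), dtype=np.float32)
preds[0, 0, :5] = [320, 320, 100, 50, 0.9]
preds[0, 0, 5 + 2] = 0.8      # kept: confident, class 2 scores 0.8
preds[0, 1, :5] = [100, 100, 40, 40, 0.2]
preds[0, 1, 5 + 7] = 0.9      # dropped: confidence below 0.4
preds[0, 2, :5] = [500, 200, 60, 80, 0.6]
preds[0, 2, 5 + 1] = 0.1      # dropped: best class score below 0.25

detections = preds[0]                        # shape (rows, 25)
image_w, image_h = input_image.shape[:2]     # shape gives (height, width); equal for a square image
x_factor = image_w / INPUT_WH_YOLO
y_factor = image_h / INPUT_WH_YOLO

boxes, confidences, classes = [], [], []
for i in range(len(detections)):
    row = detections[i]
    confidence = row[4]                      # index 4 holds the box confidence
    if confidence > 0.4:
        class_score = row[5:].max()          # highest class probability
        class_id = int(row[5:].argmax())     # index of that class
        if class_score > 0.25:
            cx, cy, w, h = row[0:4]
            left = int((cx - 0.5 * w) * x_factor)
            top = int((cy - 0.5 * h) * y_factor)
            width = int(w * x_factor)
            height = int(h * y_factor)
            boxes.append([left, top, width, height])
            confidences.append(float(confidence))
            classes.append(class_id)

print(boxes, classes)
```

Only the first row survives both filters here, which makes it easy to check the arithmetic by hand before running the loop on all 25,200 rows.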
So the value at index 4 is my confidence, row[4]. This is the confidence of detecting an object. Now let me apply the threshold as the first filter: only if confidence is greater than 0.4 do we select the bounding box.

Once the confidence passes, we take all the probability scores, or class scores. Positions 0, 1, 2, 3 and 4 are already used, so everything from index 5 to the end, row[5:], is a probability score. We have 20 objects, so from these 20 scores I take the maximum, class_score = row[5:].max(), and at the same time the position at which that maximum occurs: class_id = row[5:].argmax() gives the index position of the maximum probability, and that is my class. Then I make sure the class score is greater than some threshold, say 0.25. So one more filter: only if class_score is greater than 0.25 do we enter this block.

Inside it, we take the cx, cy, width and height of the object: cx, cy, w, h = row[0:4], since the first four values of the row hold this information. I hope you can follow how we are filtering step by step and taking out the information.

The next step is to construct the bounding box from these four values. The easiest way is to get the left and top position; once you also have the width and height, the box is easy to construct. So left = int((cx - 0.5 * w) * x_factor), and similarly top = int((cy - 0.5 * h) * y_factor). The width is straightforward, width = int(w * x_factor), and similarly height = int(h * y_factor). With this we have the left, top, width and height of the bounding box. Let's save them in a variable box as an array: box = np.array([left, top, width, height]).

Now let's append everything to our lists. First confidences.append(confidence), next boxes.append(box), and then classes.append(class_id). On each iteration we append one class ID. With this we have extracted all the information. Let's execute it. Oh, we got an error. Let me fix this: I wrote rows, but the variable is row. Execute again, and done.

Let's look at the values one by one. We have the confidences of the objects, all greater than 0.4, their respective bounding boxes, and the class IDs. The problem is that there are duplicate detections, so we have to apply non-maximum suppression.

First let's clean up what we have by converting everything to the same data type. boxes_np is np.array(boxes) converted back to a list, and confidences_np is np.array(confidences), again converted back to a list. Those are the two things we need. Now let's apply non-maximum suppression from OpenCV, which is cv2.dnn.NMSBoxes. It needs the bounding boxes and the scores, which are my confidence scores, and then the score threshold and the non-maximum suppression threshold. So the first argument is boxes_np, the next is confidences_np, the score threshold is 0.25 and the NMS threshold is 0.45. Then we flatten the result so that we get a flat array, and save it as index. These are the index positions we actually need to consider.

Let's execute this, and here we go. Out of 25200 rows, these are the rows we need to keep, meaning this is how many objects we found in the image. Let's see the length: it is 24. So in this image we found 24 objects that have a good confidence score and a good probability.

Now, from these index positions, let's draw the bounding boxes. Let me write a loop, for ind in index, and print ind so we can see the index positions. For each one we need to extract the bounding box, the confidence and the class.

First, the bounding box. We know it is stored as left, top, width and height, so x, y, w, h = boxes_np[ind]. Next the confidence: bb_conf = confidences_np[ind]. And obviously we want the class, so classes_id = classes[ind]. Since the class ID is just a number, to get the actual name we use the labels we extracted from the YAML file. labels is also a sequence, so class_name = labels[classes_id].

Now let me create the text I want to display at the top of the box. I will generate it dynamically with an f-string: first the class name, and then the confidence score of the bounding box, bb_conf. Since this value is a float, if you want an integer you can multiply it by 100 and convert it to an integer; that is fine too. Let me print the text. Here we go: the first detection is a bus with a confidence score of 92 percent, then a car at 92 percent, and so on.

Now let me draw the bounding box with cv2.rectangle. The image I want to draw on is the image copy we made at the top, so let me copy that name and paste it here. The first point is the left and top, which is (x, y); the other, diagonally opposite, point is (x + w, y + h). Then let's define the color of the bounding box; let me take green for the moment, which is (0, 255, 0).
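The filtering and suppression steps walked through above can be sketched as follows. The helper names filter_detections, iou and simple_nms, and the three made-up rows, are hypothetical. The lecture itself calls cv2.dnn.NMSBoxes; simple_nms here is only a plain greedy stand-in with the same kind of inputs so that the example runs with NumPy alone, and it is not OpenCV's implementation.

```python
import numpy as np


def filter_detections(detections, x_factor, y_factor,
                      conf_thresh=0.4, score_thresh=0.25):
    """Keep rows that pass both thresholds and rescale their boxes."""
    boxes, confidences, classes = [], [], []
    for row in detections:
        confidence = float(row[4])
        if confidence <= conf_thresh:
            continue
        class_score = float(row[5:].max())
        class_id = int(row[5:].argmax())
        if class_score <= score_thresh:
            continue
        cx, cy, w, h = row[0:4]
        left = int((cx - 0.5 * w) * x_factor)
        top = int((cy - 0.5 * h) * y_factor)
        boxes.append([left, top, int(w * x_factor), int(h * y_factor)])
        confidences.append(confidence)
        classes.append(class_id)
    return boxes, confidences, classes


def iou(a, b):
    """Intersection over union of two [left, top, width, height] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2 = min(a[0] + a[2], b[0] + b[2])
    y2 = min(a[1] + a[3], b[1] + b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def simple_nms(boxes, scores, score_thresh=0.25, nms_thresh=0.45):
    """Greedy NMS: visit boxes by descending score, skip heavy overlaps."""
    order = sorted((i for i, s in enumerate(scores) if s > score_thresh),
                   key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= nms_thresh for k in keep):
            keep.append(i)
    return keep


# Three made-up rows: two near-duplicates of one object, one weak detection.
rows = np.zeros((3, 25), dtype=np.float32)
rows[0, :5] = [100, 100, 40, 40, 0.9]
rows[0, 5 + 6] = 0.8
rows[1, :5] = [102, 101, 40, 40, 0.7]
rows[1, 5 + 6] = 0.6
rows[2, :5] = [300, 300, 20, 20, 0.2]
rows[2, 5 + 1] = 0.9

boxes, confidences, classes = filter_detections(rows, 2.0, 2.0)
index = simple_nms(boxes, confidences)
for ind in index:
    x, y, w, h = boxes[ind]
    text = f"class {classes[ind]}: {int(confidences[ind] * 100)}%"
    print(text, (x, y, w, h))
```

With OpenCV installed, the call described in the lecture, cv2.dnn.NMSBoxes(boxes_np, confidences_np, 0.25, 0.45), would take the place of simple_nms, and labels[classes_id] would supply the class name for the text.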
The last argument is the thickness, 2. Now the next thing is to put the text with cv2.putText. The image is the same image, the text is the text we generated, and the position I want is (x, y) moved up a bit, so (x, y - 10). Then the font, cv2.FONT_HERSHEY_PLAIN, the font scale 0.7, the color of the text, let's say black, (0, 0, 0), and the thickness 1.

Since the text is black, let me draw a filled rectangle behind it with cv2.rectangle. On the same image, the first point is (x, y) moved up by 30 pixels, so (x, y - 30), the second point is (x + w, y), the color is white, (255, 255, 255), and -1 fills the whole rectangle. Execute it, and done.

Now let me display the image with cv2.imshow: this window is my YOLO predictions and it shows the image we drew on. Another cv2.imshow shows the original image, img. Then cv2.waitKey(0) and cv2.destroyAllWindows(). Execute it, and done. This is my original image, and this is the image with the detected bounding boxes. This is really cool: a person is detected, the car is correct, another person and one more person are detected, the bus is correct, cars, persons and so on. We are detecting as much as possible, and we have successfully made predictions from the YOLO model.

The problem is that we ended up with very long code, which will be difficult to reuse later. So let me create a function out of it. For that, I will save the notebook as a .py file and make some modifications there. Click on File, then Download as, then Python (.py). Download it and keep it, and make sure you put the .py file in your current working directory. I have already downloaded it into mine, and you can see the yolo_predictions.py file is there.

Let's open it. There is a lot of code here; let's go through it piece by piece and put it into a function. First, remove everything that is not required. We have the imports, and next we load the YAML file; that is required. We have the labels; the print statement is not required at all. The next step is loading the model, and after that there is the long code that does the predictions. So let me remove all the unnecessary things: this is not required, and this is not required. In the long code, remove the unnecessary steps such as the print statements, and be very careful while removing them. Finally we have the imshow calls; those are not required either, so let me remove all of these. Now I have almost exactly a hundred lines.

Since this code loads the model and loads an image, from the user's perspective it is better to pass those in as arguments. So for simplicity I am going to create a class. A class is really useful here because we need to pass the YAML file, the model and an image; those are three things, so a class is much more convenient.

First let's define the class and name it YOLO_Pred. Now create the init, def __init__. Since these are all class variables, it is better to put self in front of them: self.labels is set from the data YAML, and the model becomes self.yolo. These are the class variables we are going to use throughout the class.

The next step is the predictions. Let me create a function named predictions. First we pass self, and then the first argument is the image. Let's say I have already loaded the image outside the function and pass it in as an argument. Once we pass it, the function takes care of everything, so we no longer need the loading code here. Now select all of this code and press Tab, and Tab once more, to get the indentation right.

Make sure that wherever you use yolo, you replace it with self.yolo. Let me walk through the code. First we calculate the shape, then the max rows and max columns, then we define the input width of YOLO, and so on. Here we pass the blob to yolo, so this should be self.yolo because it is my class variable, and also self.yolo.forward. Be extremely careful with this step: it is self.yolo, not yolo. If you get an error like yolo is not defined, change it to self.yolo.

Next we get the detections and define the empty boxes, confidences and classes lists. We define the image width and image height, then the x factor and y factor; everything is the same, and the classes and detections are also unchanged. Up to here it is fine. What we need to change is labels: while drawing the bounding box we look up the class name, and labels is now a class variable, so we need to put self in front of it. Finally we need to return the image. Save it, and with that we have created the class.

Next, as I said, I am going to create different colors for the different objects. Let me create one more function named generate_colors. The input argument is the ID, which is my class ID, so the color depends on the class ID, and remember that the color should not change between runs. Since this is a method, I need self as well: self, ID. To keep the colors from changing every time you execute, I need to set the seed, np.random.seed(10); you can use whatever number you want. Now I get the same colors every time.

The number of colors I want is the number of classes, and I can get that from the YAML file: in data_yaml we have nc, which gives the number of classes. So in the init let me define self.nc = data_yaml['nc']. Since this is the init function of a class, I need self here. Any mistakes? No.

Now let's go down and define the colors. I generate random integers with np.random.randint, as a matrix of number of classes by three. The minimum value is 100 and the maximum is 255, since we know the maximum value for an image is 255. The size is (self.nc, 3), three because it is RGB. Then convert it to a list, and finally return colors[ID], converted to a tuple, so that the ID picks out that particular color.

Now let's call generate_colors immediately after the class name: colors = self.generate_colors(classes_id). Since this is inside the class, I call it with self. Once you provide the class ID, you get the color for that class. This is the color I pass to the rectangle function, replacing the green. The label background was white, so I replace that as well, and the text color I keep black. Save it, and with this we have created our predictions .py file with the class. Make sure there is nothing else below the class; please check once again.

If you get any errors, you can find the yolo_predictions.py file in the resources and compare side by side. This is a very important step, because we are going to use this file in the lessons that follow. With this we have completed yolo_predictions.py.

Let me make use of yolo_predictions.py and run predictions on street_image.jpg. Create a new Python 3 notebook and name it, say, 02 predictions. Let's import the required libraries. That is only OpenCV, import cv2, and then from yolo_predictions we import YOLO_Pred. Execute this, and done.

Now let me initialize YOLO_Pred and name the object yolo. We need to provide two things: the ONNX model and the data YAML file. The ONNX model is in the models folder, under weights, and it is best.onnx. Next is the data.yaml file; since it is in the same folder, I simply provide its name. Execute this, and done.

The next step is to load an image and pass it in. img = cv2.imread with the name of the image, street_image.jpg. To display it, cv2.imshow with img, then cv2.waitKey(0) and cv2.destroyAllWindows(). Execute it, and this is my image; press Escape to close the window.

Now let's do the predictions: img_pred = yolo.predictions(img). Execute it, and done. Let me copy the display code, paste it here, and show img_pred as the prediction image. Execute it, and here we go: we have different colors for the different objects, and we predicted all the objects, persons, cars, a bus and so on. That's really cool, right? This is how we can make use of this class.

Let me also test it on a video; we have video.mp4. This is going to be my real-time object detection, so let me add the heading Real Time Object Detection. First create the capture: cap = cv2.VideoCapture with the path of the video, video.mp4. Now create a loop, while True, and ret, frame = cap.read(). If we are unable to read the video we need to break the loop, so if ret == False, print unable to read video and break. If we can read the frame, we get the predictions: pred_image = yolo.predictions(frame). Now let's show the image with cv2.imshow; the window is named YOLO and I simply pass pred_image. Then, if cv2.waitKey(1), which is a delay of 1 millisecond, returns 27, meaning you pressed Escape, break the loop. After the loop, cv2.destroyAllWindows() and cap.release().

Execute it, and here we go, these are the predictions. You can see that we are able to detect people in real time, which is really cool. Let's look at our predictions and enjoy the video. This is how we can do real-time object detection. I hope you enjoyed this entire lecture series.
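A minimal sketch of two pieces described above: the seeded per-class color helper and the video loop. Both are written here as standalone functions rather than as methods of the lecture's class, and the names run_video, predict, n_classes and window_name are hypothetical. The cv2 import sits inside run_video only so that the color helper can be tried without OpenCV installed.

```python
import numpy as np


def generate_colors(class_id, n_classes, seed=10):
    """Return a fixed color tuple for a class ID."""
    np.random.seed(seed)  # same seed -> same color table on every call
    colors = np.random.randint(100, 255, size=(n_classes, 3)).tolist()
    return tuple(colors[class_id])


def run_video(path, predict, window_name="YOLO"):
    """Read frames from path, run predict on each, and show the result."""
    import cv2  # local import: only needed when a video is actually played

    cap = cv2.VideoCapture(path)
    try:
        while True:
            ret, frame = cap.read()
            if not ret:
                print("unable to read video")
                break
            cv2.imshow(window_name, predict(frame))
            if cv2.waitKey(1) == 27:  # Esc key
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()


# The color for a given class ID is the same on every call.
print(generate_colors(3, n_classes=20))
```

With the class from the lecture, a call along the lines of run_video("video.mp4", yolo.predictions) would play the annotated video until Escape is pressed.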

Info

Channel: simple learn

Views: 62,421

Keywords: yolo algorithm for object detection, yolo algorithm explained, yolo computer vision, yolo algorithm, yolo algorithm deep learning, yolo algorithm python, yolo deep learning python, yolo deep learning, yolo object detection, what is yolo, yolo algorithm implementation, yolo tutorial, yolo custom object detection, object detection deep learning, yolo object detection python, yolo python, how yolo works, yolo object detection opencv python, yolo object detection demo

Id: mRhQmRm_egc

Length: 175min 50sec (10550 seconds)

Published: Thu Apr 13 2023