YOLO v3 EASY METHOD | OpenCV Python p.2

Video Statistics and Information

Captions

Hey everyone, welcome to my channel. This is part 2 of the YOLO version 3 implementation. In the previous video we set up our camera and imported the class names. In this video we will set up our network and feed it our webcam image to get the output as bounding boxes. So let's get started.

In the previous video we imported our libraries, initialized our webcam to get our image, and extracted the COCO names from the coco.names file, so now we have a list of all the names that we can use once we find our detections. Today we are going to start off by importing our configuration file and the weights file so that we can create our network. We write modelConfiguration equals yolov3, then 320, then .cfg, and modelWeights equals yolov3 again, then 320, then .weights.

Then we create our network. We write net = cv2.dnn.readNetFromDarknet, which requires two things, the configuration file and the weights, so we pass modelConfiguration and then modelWeights. Once we have done that, we have to declare that we are going to use OpenCV as the backend and that we want to use the CPU. So we write net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV) and then net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU). Let's run this and see if it imports properly. There we have it, and we don't have any issues.

The next step is to actually input our image to the network. We cannot simply input the plain image that we are getting from our webcam. The network only accepts a particular format, called a blob, so we have to convert the image to a blob. We have a very simple function for this. Let's call the result blob and write cv2.dnn.blobFromImage. We send in our image, then 1/255 so that the pixel values are divided by 255, and then the width and the height. Let's declare a parameter for that: whT = 320. This is the width and the height, and the T stands for target. We are using the 320 model, which is why the size is 320. Width and height are the same, so we use one parameter for both: whT, whT. For the mean we write [0, 0, 0]. These are different parameters and we are keeping them all at default; if you want further detail about them, I have put a link to the documentation in the blog. Then we write crop = False. Now we can set this blob as the input to our network: net.setInput(blob).

Now that we have set our input, what do we require as our output? To understand this we have to look at the architecture of our network. Here is a brief view of the architecture. You can see that we have a lot of convolutional layers and three different outputs: prediction one, prediction two and prediction three. This means that we will have three different sets of values coming out of our network. To get the output of these output layers we need to know their names, so that we can refer to them in our network. To get the names we write layerNames = net.getLayerNames(). This gives us the names of all our layers, and we can simply print it out to see what it shows. We run this, and there we have it: the names of all our layers.

What we have to do next is extract only the output layers. To do that we write net.getUnconnectedOutLayers(). Let's print this out. I think the spelling is wrong: "Unconnected" should have a double n. Then we wrap it in print, and that should be enough. Let's comment out the previous print and run it.

Okay, now we are getting an output, but the strange thing is that we are not getting the names; we are getting the indices of the output layers. What we can do is use these indices to refer back to layerNames and extract the names. One more thing: the indices do not start from 0, which is why we have to subtract 1. This means that the value 200 should actually be 199, and the value 254 should actually be 253, because list positions start at 0.

So how can we get these values? Instead of writing out a complete loop we can write outputNames = [layerNames[i[0] - 1] for i in net.getUnconnectedOutLayers()]. Let me explain what this does. It loops over the output layers. As you can see, if we run this we get three different values, because we have three different output layers. For each layer we want the value, for example 200. Each i is enclosed in its own bracket, like a list, so to remove the bracket we take the first element of i, which will be 200, and then subtract 1 from it as I mentioned before. This gives us the index, here 199, and we look up the name at index 199 in our layerNames list. This should give us the names of all our output layers.

If we print outputNames and run this, there you go: we are getting yolo_82, yolo_94 and yolo_106. These are the names of our output layers. Now we can send the image through the network as a forward pass and get the output of these three layers, because we are only interested in the outputs of these layers. To do that we write outputs = net.forward(outputNames). Let's print the length of outputs (we have removed the other prints). There you go: we are getting three different outputs, which is good, because these are the outputs in which we can find the boxes.

Let's open this output up and see what is inside: where the bounding boxes are, and where the object names or IDs and other information like the confidence levels are. Instead of the length let's print the type and run this. We see that outputs is a list. To see the first element we can index it with 0, so let's see what kind of object it is. The element is of class numpy array, which means that if we remove type we can simply write .shape. If we run this, there you go: we are getting 300 by 85, so this is a matrix with 300 rows and 85 columns. As we know, we have three of them, so let's print the other two as well, with index 1 and index 2. We run this, and you can see that the first one has 300 rows, the second has 1200 and the third has 4800, and all of them have 85 columns.

So what are these values? Why do we have 300 here, why 1200, and what is this 85? We had 80 classes but we are getting 85 values, so what does this correspond to? All of these questions we will answer in the next part. We will go inside this output array, look at the individual values, and extract the box information, the probability of an object being present and the ID of that object, and then we will simply display our objects at the end. So stay tuned for that. If you liked this video give it a thumbs up, and I will see you in the next one.
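
The steps covered in this part can be sketched in code as follows. This is a minimal sketch, not the author's exact script. The file names yolov3-320.cfg and yolov3-320.weights are placeholder assumptions for the 320 model files, swapRB=True is a choice made here because webcam frames from OpenCV are BGR, and the helper names output_layer_names, expected_rows and run_yolo are hypothetical. The name lookup accepts both the nested index form seen in the video and a flat form, since some OpenCV builds appear to return plain integers. The row-count helper assumes the standard YOLOv3 layout of three boxes per grid cell at strides 32, 16 and 8.

```python
WH_T = 320  # network input width and height, as in the video


def output_layer_names(layer_names, out_layer_ids):
    """Map the 1-based ids from getUnconnectedOutLayers() to layer names.

    Handles both the nested form ([[200], ..., [254]]) shown in the video
    and a flat form ([200, ..., 254]).
    """
    names = []
    for i in out_layer_ids:
        try:
            idx = int(i[0])  # nested form: unwrap the one-element container
        except (TypeError, IndexError):
            idx = int(i)     # flat form: already a scalar
        names.append(layer_names[idx - 1])  # ids start at 1, lists at 0
    return names


def expected_rows(input_size, strides=(32, 16, 8), boxes_per_cell=3):
    """Rows per output array, assuming the standard YOLOv3 layout."""
    return [boxes_per_cell * (input_size // s) ** 2 for s in strides]


def run_yolo(img, cfg_path="yolov3-320.cfg", weights_path="yolov3-320.weights"):
    """Forward one BGR frame through the network (file names are placeholders)."""
    import cv2  # imported lazily so the helpers above work without OpenCV

    net = cv2.dnn.readNetFromDarknet(cfg_path, weights_path)
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)

    # Scale pixels by 1/255, resize to WH_T x WH_T, zero mean, no cropping.
    blob = cv2.dnn.blobFromImage(img, 1 / 255, (WH_T, WH_T), (0, 0, 0),
                                 swapRB=True, crop=False)
    net.setInput(blob)

    names = output_layer_names(net.getLayerNames(),
                               net.getUnconnectedOutLayers())
    return net.forward(names)  # one array per output layer
```

Under the stated layout assumption, expected_rows(320) evaluates to [300, 1200, 4800], which agrees with the row counts printed in the video. The 85 columns would then correspond to 4 box values, 1 objectness score and 80 class scores, which the next part examines in detail.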

Info

Channel: Murtaza's Workshop - Robotics and AI

Views: 41,470

Keywords: yolo v3, yolo object detection tutorial, yolo object detection, yolo algorithm, yolo object detection github, yolo object detection code, yolo object detection python, yolo object detection algorithm, yolo ai, yolo algorithm explained, yolo algorithm github, yolo object detection tensorflow, yolo darknet, yolo machine learning, yolo tutorial, yolo detection, yolo deep learning, yolo image recognition, object detection, darknet yolo, yolo opencv, yolo v3 opencv

Id: 9AycYn9gj1U

Length: 13min 7sec (787 seconds)

Published: Sat Jun 27 2020