Object Detection Removing Duplicates | OpenCV Python

Captions
Hey everyone, welcome to my channel. In the previous video we created a MobileNet SSD object detector. It was detecting objects fairly well, but the problem was that sometimes it produced overlapping detections, and it was not able to determine whether these were the same object or different objects. So if we had just one cup, for example, it might show up as two cups, which is really bad if we want to count them or use that information. The idea is that if we are able to remove this overlap, then we will be able to detect more reliably and we will have a much smoother output. So let's get started.

Okay, so before we start with the actual implementation, let me explain a little bit. The method we are going to use is called NMS. NMS sounds very scary because it is a technical term, but in fact it's very simple. NMS stands for non-maximum suppression. What that means is this: let's say I have these two boxes and both of them are detecting a car. What I want to do is remove the one that does not have a high confidence. So it will compare the confidence of these two boxes and suppress the non-maximum one; it will keep the maximum one and suppress, or remove, all the rest. That's where the name comes from: non-maximum suppression. Let's say this box here says it's a car and it is 80% sure, and this box, because it has more content, says it is 90% sure that it's a car. What non-maximum suppression will do is remove the middle one, because its confidence is lower, and keep the outer one, because its confidence is higher. That's how we can apply non-maximum suppression.

So how can we apply this to our code? The good thing is that OpenCV already has a function for this, and if you have seen my YOLO video then you might be familiar with it. This is the code that we wrote in the earlier video for our object detector, and if you haven't checked this
out, I will put the link in the description. It's almost 30 to 40 lines of code, and it runs a very efficient object detector that you can use on different platforms. The best part about this method is that it's a very good balance between accuracy and speed: we are able to get almost the same accuracy as YOLO but at a much faster speed, even on CPU. That is why we are using it.

Now let me recap what we did. We have the video or the webcam over here, and here I have added a few parameters: this is for the brightness, and these are for the width and the height. So 3 means width and 4 means height; these are the prop IDs.

One thing you can do, if you don't want to carry this file with you wherever you have your project, is simply write the actual names in the code. Writing those names by hand would be really time consuming, so instead you can write a print, run this, and it will give you this list. You can literally copy it, and instead of reading the class names like this you can paste it here, and that should work exactly like before. Then of course you have to remove these extra brackets, and if you can't see it properly you can press Enter and put it on different lines. Oh boy, there are a lot of lines, as you can see; it's going a little bit off the screen. Okay, so here you can see all the classes, and this should work exactly the same as before. But we're not going to go into that much detail; we will get the names from the file itself.

Next we have the files, the configuration file and the weights file for our model. Once we have those, we simply send them to our detection model and set the parameters, and then we ask it to detect the objects in our image, and we give the threshold, which is 0.45 or 0.5,
whatever you want to keep. After that we are simply displaying the information based on the class ID, the confidence, and the box. So this works well. What we have to do now is add the code for non-maximum suppression, so let's have a look at how we can do that.

The first thing we have to do is actually fairly simple. We write indices here, which is basically the output, and then cv2.dnn.NMSBoxes, and then we have to give it our parameters. The first one is the bounding boxes, so we will write bbox. Then we send in the confidence values. Then we have the threshold, which is the exact same threshold as before. And then we have the threshold for non-maximum suppression: how much effect do you want this method to have, how strong a suppression do you want? Let's just declare it like this, so we can go up and over here write, let's say, 0.5. The lower this value is, the more effect it will have, so it will reduce the boxes more if you have a lower value. So 1 basically means don't suppress anything, and 0.2, for example, means suppress quite a lot.

Okay, so where were we? This is basically the line we have to use to call the NMSBoxes function. But it's not as simple as just sending in these values, because they are coming from somewhere and we are sending them somewhere else, so most probably we need a conversion. The problem is that here we need a list, and over here we need a list as well. The list over here should contain float values, and the list over here should contain the bounding boxes, as lists as well. So let's see what kind of values we already have. We have a print already, so we can print the type, and we will also
print the box itself, so we print the box and the type. Let's see what that looks like. Okay, what is happening? Is it running main? Yes. Let's remove this; I think it's trying to run this, so just remove that. Okay, we do have a value, so that is good. Because it's a webcam it will keep giving us this. What we have is <class 'tuple'>, and right now we don't have anything inside, most probably because it's not detecting anything. Whenever it detects something, these are the box values, so it should look like this: a list, and inside that another list in which we have our bounding box. But it's not actually a list, it's a NumPy array, so we will convert this into a list. Here we write that our bounding box is equal to list of the bounding box, and that should work. Let's run it.

Okay, so this is coming from my webcam, and you can see that there are a lot of overlaps anyway. So it is a list now: this is <class 'list'>, and inside that we have our array. Again it's a NumPy array, and I think that should be fine. We will try to run it, and if it doesn't work we will change it.

Then we need to convert our confidence values as well, so let's print those out. There we have it: it is a NumPy array, and we have these values inside. What we have to do is remove these brackets; instead of these brackets we need a comma in between. So we need an actual list. To do that I will use the NumPy array method. I'm pretty sure there are other methods that will work better, but this is the method I am familiar with, and if you are familiar with something better, please put it in the comments and I will be happy to add it to the code. Okay, so here
we are going to write that our confidence values are our confidence values dot reshape, and then we write 1 and -1, and let's see what that gives us. Okay, now we have something that is kind of a list, but we still have one extra bracket. We can remove that by just indexing with zero. Yes, it has removed that, and now we just need to convert this into a list. Let's see if that works. Okay: list object has no attribute flatten. So we need to comment all of this out, because we are going to use another method to display it. Let's see, and there we have it. This is the format that we need: a list in which we have the confidence values.

Okay, so that is good, but one more thing is that we need to check what each value inside of this is. For example, let's take the first value and print its type. It's <class 'numpy.float32'>. We just want it as a float; we don't want anything associated with NumPy. So what we can do is write map here: we want to map float over our confidence values. But this will return a map object and we want a list, so we wrap it in list. Now if we run this: <class 'float'>. Excellent, this is what we need. Now it's pretty much ready to go to our non-maximum suppression function.

So let's go up here, remove this, comment this out, and print the indices. Okay, there is an error: tuple object has no attribute reshape. What can we do? Let's change this to numpy.array and try that. Did we import NumPy? No, so we will import numpy as np. Let's run that. Okay, now it's working. Basically we just made sure that it is a NumPy array so
that it has the reshape functionality.

So now it's giving us the indices. The idea is that we send in the bounding boxes, and it looks at all of them and returns the index values of the ones you should output, so it eliminates all the ones you should not include. For example, say the bounding boxes had three values, two of them unique and the third one a repetition; then the indices will include only two of these. It will not tell you which one to eliminate; it will eliminate it and send you the output. So now we can output only the ones that actually have the correct bounding box and are not overlaps.

Here it's a little bit tricky, because we have to write for i in indices: we are going to loop through these indices. And as you have seen, we have an extra bracket around each of these values. What do you do to remove that bracket? You just index with zero, so you write i = i[0], and that removes the bracket. Now we have the bounding box index, so we can get the bounding box information: we say box is equal to our bounding box at this index. Then we can write x, y, w, h and assign box[0], box[1], box[2], and box[3]. Then we can simply write the rectangle part here: here we write x and y, and here x plus the width and y plus the height. That should give us the correct bounding box. Should we try to run it? Let's run it. Okay, now we are getting the bounding boxes, which is good, and you can see that it's not flickering as much and there is not much overlap.
Okay, so next we can write the names of these classes so that we know what we are detecting. Let me write cv2, or I can copy it from down below. Where is it? This is the class ID one, so I can just copy this and paste it here. First of all let's remove that and go back. Okay, so what we have to do now is change this part, because this is basically our name, and it is a little bit tricky. Earlier we had the class ID and we were just subtracting 1 from it, but this time we have to get it at a particular index, and that index is this i. So we say that from class IDs we want index i, and it already has an extra bracket, so we remove that bracket, and as we did before we have to subtract 1 from it, so we add minus 1. That should do it.

Let's run it and see what happens. There we have it: we have the cup, the laptop, the cell phone, again the cup, the remote, the keyboard, and let's try my mouse; we have the mouse here. Here you can see we have a TV. Should it be a TV? It should be a laptop, actually. Sometimes it's saying laptop, but anyway, that's not a big deal. So this is how you can add non-maximum suppression, and you can see it has had a big effect: the result is much smoother than before.

Now let's change this value and put it at 1, which means the NMS will have no effect, and see what happens. Here you can see already on the cup there's a lot of overlap. On the keyboard there's not much overlap, nor on the laptop, but here (oh, this is a bird) in the middle you can see sometimes it's giving an overlap. On the cell phone it's giving some overlap as well; for the remote not so much, and for the mouse it's fine. So for a lot of the objects it's giving these overlaps, and if we don't have NMS it will keep giving us these
overlaps. So what we can simply do is put in a harsh value, and that should give us a good amount of stability. Now you can see it says cup, and it looks much nicer. It's confused between cup and bowl, which is not bad; it's pretty much the same category. Laptop, keyboard, cup again, remote. Stream Deck, forget about it. Maybe a teddy bear? Oh, it's a bottle; no, sometimes just a teddy bear. And then we have a book. Does it detect a book? No, it's too small to be detected as a book, I guess. Then we have the keyboard and the TV. It should detect that as a laptop, I don't know. Anyway, overall it's much better than before; as you can see, it has improved quite a bit, and now we can use it in many different applications where we have to count the objects or display only a certain number of objects.

That is it for today's video; I hope you have learned something new. In the next part we will try to replicate this on Raspberry Pi and Jetson Nano. Some of you have been asking how we can detect only one object, or two objects, or three objects. So what we will do is create modular code to which we send input arguments, and based on those arguments it will give us an output. For example, we could say that I just want to detect the cups, and I don't want to display the result, I just want the bounding box information. The module will do all the processing: we just send it an image and it gives us the result. This will be very useful in future projects: the Jetson Nano car that we are building, the Raspberry Pi project, the ultimate robot. For all of these projects it will be very helpful, so stay tuned for that. If you liked this video, give it a thumbs up, and I will see you in the next one.
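The effect of the NMS threshold shown in the video (1 keeps every box, lower values suppress more) can be illustrated with a small NumPy version of greedy non-maximum suppression. This is a generic sketch of the technique under simple assumptions, not OpenCV's implementation; the function names iou and greedy_nms and the sample boxes are made up for illustration.

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two [x, y, w, h] boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))  # overlap width
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))  # overlap height
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def greedy_nms(boxes, scores, score_threshold, nms_threshold):
    """Return indices of kept boxes, highest score first."""
    # Drop low-confidence boxes, then visit the rest from most to least confident.
    order = [i for i in np.argsort(scores)[::-1] if scores[i] > score_threshold]
    kept = []
    for i in order:
        # Keep a box only if it does not overlap any already kept box too much.
        if all(iou(boxes[i], boxes[k]) <= nms_threshold for k in kept):
            kept.append(int(i))
    return kept

# Two overlapping boxes (IoU of about 0.47) and one separate box.
boxes = [[10, 10, 100, 100], [30, 30, 100, 100], [200, 200, 50, 50]]
scores = [0.9, 0.8, 0.7]

for t in (1.0, 0.5, 0.2):
    print(t, greedy_nms(boxes, scores, 0.5, t))
```

With these sample boxes, thresholds of 1.0 and 0.5 keep all three boxes, because the overlap of the first two stays below the limit, while 0.2 drops the lower-confidence box of the overlapping pair.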
Info
Channel: Murtaza's Workshop - Robotics and AI
Views: 26,427
Keywords: obejct detection, nms, non max suppression, non max suppression explained, nmp opencv, non max suppression opencv, object detection opencv python, object detection opencv, object detection python, fast object detection, opencv python, mobilenet ssd, opencv mobilenet ssd, ssd object detector, object detector mobilenet ssd, ssd mobilenet, deep learning, deep learning opencv, dnn, dnn cv2, cv2 object detection, computer vision object detection
Id: diWDgKcH3E0
Length: 23min 54sec (1434 seconds)
Published: Tue Sep 01 2020