Build your OBJECT DETECTION SOFTWARE - Crash course | with Opencv and Python (2022)

Captions

You will now learn how to build a full, interactive object detection software from scratch. For example, if I click the "person" button that we have right here, you see it starts detecting the person. We can enable another button, for example "remote", and we see the remote control detected as well.

Hi, welcome to this crash course on building your own object detection software from scratch. My name is Sergio, I'm a computer vision developer and consultant, and I help companies, freelancers and students to easily and efficiently build computer vision projects. Today you will learn how to build a software from scratch that detects different objects in real time: not only the ones you saw in the preview but also other objects, which you will discover along the way. We will also see how to take the Python code and convert it into an .exe file, so that you can run it on any Windows machine without any installation. So if you're ready, let's go.

To follow along you need to have Python. The exact version doesn't really matter as long as it's one of the latest ones. At the moment I'm recording this video it is Python 3.10; if you have 3.9, 3.8 or 3.7, everything is fine. So don't worry about the Python version, you just need Python. Then you need the OpenCV library, which you can install from the command prompt: in the search bar we type cmd, and in the command prompt we can execute commands. Once you have Python, you can install the library with the pip command: pip install opencv-contrib-python, and then you press Enter. In my case I already have the library, so it says "Requirement already satisfied". If you don't have it, it will take just a few minutes to complete the installation.

Once the installation is complete we can move to Python and write the code. The first step is to take a frame, or an image if we want to work from an image, but we're going to do
this in real time, so we take the frame from the webcam. Then on the frame we can do all our detection, and then we can move further with the project.

We start by importing the OpenCV library: import cv2. Then how do we get the frame from the webcam? We need to create a capture object, which initializes the webcam. It sounds hard in words but it's very simple: cap = cv2.VideoCapture, and between parentheses we put an index, 0, 1, 2 and so on. Zero takes the first webcam, one the second webcam, two the third webcam, and so on. If you have only one webcam you just go with 0; if you have multiple webcams you can try and see which one you're loading. We'll go with the first one.

Then let's get the frame from it: ret, frame = cap.read(). Why do we get ret and frame? frame is the frame itself; ret says whether there is a frame: True if there is one, False if there isn't. This can happen because sometimes there is an error getting frames from the webcam, so you can get an empty frame. Or, if you are reading a video, at some point the video ends, and ret tells you there are no more frames so that you can choose what to do.

Now let's show this frame: cv2.imshow("Frame", frame). The first argument is the name of the window; you can call it whatever you want, so I will just call it Frame. The second is what we want to show, and here we need to use the exact same variable name, so we show the frame that we just loaded from the webcam. Once we have this we can theoretically run the code, but you will see that it takes the frame from the webcam and then closes itself very fast. You saw a window, and then it closed right away. The reason is very simple: once this code is executed, Python moves on to the following lines, and as there are no other lines the code finishes
and the program closes. So in this case we need a wait-key event that keeps the frame on hold: cv2.waitKey(0). Zero freezes the frame; later we will see how to change the value and why. Let's run this one (I'm now loading the frame from another webcam). This is the result: we're just getting one frame, nothing else. This is an image.

What is the second step? Considering that we want this in real time and not only from an image, we don't need just one frame; we need to get as many frames as possible in real time. So let's close this one. You should have at least some Python basics to follow along, because it will help you tremendously with this crash course, but the principle is that we put cap.read() inside a loop: when we finish taking one frame we go to the next frame, and we show them in a loop, so it's in real time. We write while True and put everything inside the loop. Let's press run and see what we get. It's in a loop, but the frame is still freezing. Why? Because we put cv2.waitKey(0).
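
The capture loop described up to this point can be sketched roughly as follows. This is an illustration rather than the exact code from the video: the helper names frames and main, the delay_ms parameter and the Esc-to-quit check are assumptions added here, and OpenCV is imported inside main only so that the small generator can be tried without a webcam.

```python
def frames(cap):
    """Yield frames from a capture object until read() reports no frame."""
    while True:
        ret, frame = cap.read()
        if not ret:  # ret is False when no frame could be grabbed
            break
        yield frame


def main(delay_ms=0):
    """Show webcam frames in a window named "Frame" (hypothetical helper)."""
    import cv2  # installed with: pip install opencv-contrib-python

    cap = cv2.VideoCapture(0)  # 0 = first webcam
    for frame in frames(cap):
        cv2.imshow("Frame", frame)
        # waitKey(0) blocks until a key is pressed;
        # a positive value is a delay in milliseconds.
        key = cv2.waitKey(delay_ms)
        if key == 27:  # Esc, used here only as a way to leave the loop
            break
    cap.release()
    cv2.destroyAllWindows()
```

Calling main() with the default delay_ms=0 reproduces the behaviour seen in the video at this stage: a new frame appears only after each key press.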
It means that it's waiting for us to press a key, so each time I press a key we grab a new frame from the camera. You can hear the sound: I press a key and we have a new frame. There is a quick fix for this: instead of cv2.waitKey(0) we can say cv2.waitKey(1) and run it again. Now it's not freezing the frame anymore; it's waiting one millisecond between each frame. So you can see that we're getting the frames in real time just by using OpenCV with Python. So far it's very simple.

Let's now do object detection using deep learning. Deep learning is the best method to detect any object, and I'm going to use a very light model so that we can easily detect 80 different objects on any machine. I will not go into depth about this in this course because there would be a lot to cover, so if you want to know more I recommend watching the object detection video course that you can find on pysource.com.

Now let's initialize the model. This is the OpenCV DNN network, so we initialize net with cv2.dnn.readNet. We need two files: one is the configuration of the model, and the other is the weights file, which is the file trained on thousands of images of each object, so it's the actual file that can detect the different objects. I'm going to put the link to download these files in the description, but I'm also going to show you the link right here very quickly. I created a very short link to make this easy: pysource dot com, slash download, slash crash course ODS dot zip. Then we press Enter and it will ask you to download a file. If you don't see this, or if you get an error, make sure that you take the link from the description below, so you are sure to download the correct file. This is a zip file containing the model, so we're going to save it. Let's save the file and then let's open
this zip file. Inside there is the dnn_model folder. We should take this folder and put it together with our Python files. In my case I have only the main file, which is the file I'm writing right now: main.py is the code that you see right here. So we take dnn_model from the zip file that we've just downloaded and put it next to main.py. Now there is main.py and then dnn_model, and inside it we have three files: the weights, as I told you, the configuration file, and also a text file; we will see later why we have that.

Let's now load the two files. First we load the weights: dnn_model/yolov4-tiny.weights. This is the path; let me show you, we have main.py and we have dnn_model in the same folder. So we load yolov4-tiny.weights, and then we load the configuration, dnn_model/yolov4-tiny.cfg. How do we make sure that we're loading them correctly? There is no better way than to run the code: if we don't get any error, we are loading them correctly. We should now see the frame from the camera, and we do, so so far everything is good and the files are loaded. In case you get an error, make sure that the path is correct. For example, let's say I don't put the correct path and skip the folder. This is what I get: OpenCV "Failed to parse NetParameter file", and it also says which file gave the error. So make sure that the path is correct, and then we can move further.

Once we have initialized the network we also need to define a model. This is an object detection model, so let's define model = cv2.dnn_DetectionModel. Each model is different. Now we're using YOLO; if we had other models like MobileNet, or different versions of YOLO, we would have different parameters. In this case we say model.setInputParams, and we need to
pass two specific parameters. One is size, let's say 320 by 320. Why? Because when we process with deep learning, even with a 4K frame or a very big image, the network doesn't process the image at its full size; it shrinks it. The shape of the image doesn't matter: even if it's a rectangle with a 16:9 proportion, it gets resized into a square, usually a very small one, from 320 by 320 as in this case up to something like 1024 by 1024. It's important that the sides are multiples of 32. So here we can use 320, and we can change the size later. A bigger size means better precision but also slower detection, so we need to find a good balance between them, the sweet spot where we have good detection but also good speed.

Then we have scale, which is 1 divided by 255. We have this because in OpenCV pixel values go from 0 to 255, while in the deep learning network values go from 0 to 1, so we need to scale them in order for them to be processed by the network. This is mostly a technical thing happening in the background that you shouldn't worry about, but it's good that you know why we're scaling: it's about bringing the pixel values from 0 to 255 down to 0 to 1.

Now we have the model, so we have pretty much everything we need to perform the detection. The second step is to actually detect what we have in the image. Here at the beginning we load the neural network, then we initialize the camera, then we get the frames. Let's put comments so it's more organized: get frames, and then object detection right here. How do we do the object detection? We have the model, exactly the one we initialized before, so we say model.detect. Where do we want to detect the objects? Of course we want to detect the
objects on the frame, so we pass the frame here. And what do we get in exchange? When we detect an object there are three things we get. We get the class, which means the category the object belongs to. It depends on how the model is structured, but in this case class 0 means it's a person, class 1 is a bicycle, class 2 is a car (maybe not exactly that, but each class is a different object). So class_ids is the first piece of information we get, then we get the scores, and then we get the bounding boxes, bboxes. Actually I'm not completely sure we're getting them in this order, so I need to check; we will see that right now.

The bounding box says where that specific object is located: it is a rectangle surrounding the object. We get the top-left point and then the width and the height of the object. The score is how confident the algorithm is about the detection: if we are 100% sure it's a person the confidence would be 1; if we are 20% sure it's a person the confidence would be 0.2.

Now let's look at all of them. Let's print "class ids" and then class_ids, so that I can prove what I'm saying. Then the scores; it's plural, scores and not score, because we have multiple objects. Then print "bboxes" and then bboxes. Also, when we create the model we need to pass it the network, net. Let's see how this runs: if everything is written correctly, with no typos, we should see the printing. OK, that's what I wanted, so I'm going to stop it. Since we process frame after frame, the printing is of course in a loop. We have class ids with 0, then scores with 0.7 something, then the bounding boxes with their values. This is already good information and a good start, because if we're
getting them it means that the code is correct. But now we need to use them. How? We are going to show them in real time: we draw a bounding box, and then we can add text saying what the object is, and so on. Considering that class_ids, scores and bboxes contain multiple objects, we need to loop through them with a for loop and extract one element of each: for class_id, score, bbox in zip(class_ids, scores, bboxes). zip is the function that lets us go through multiple arrays at the same time in one loop.

Let's start with the simplest and most important thing: drawing a bounding box around what is detected. I showed you before what a bounding box contains: four values, so x, y, w, h = bbox. These are the coordinates. I will quickly print x, y, w and h, and if everything is working correctly we will then draw them on the screen to represent the bounding box. It's working: these are the bounding boxes, and these are the separate values x, y, width and height that I was printing.

Let's use them to draw a rectangle with cv2.rectangle. To draw a rectangle we need a few parameters; we will see them one by one. First, where do we want to draw the rectangle? On the frame; pretty much everything will be done on the frame, so we draw the rectangle on the same frame we captured. Then point one, the top-left point, which is (x, y). Then we need a second point, the bottom-right one. The top left we have; the bottom right we don't, because we only have the top left plus the width and height. But from the top left, adding the width and the height, we get to the bottom right: so we have x plus w and then y plus h,
and we have the second point. Now we can choose the color of our rectangle. A color is made of three values: blue, green and red. How much blue do we want, from 0 to 255? Let's say 200. How much green? Let's say 0. How much red? Let's say 50. The mixture of the three primary colors can give us any color we want. Then the thickness of the rectangle: let's say 2, or let's say 3, so the rectangle will be thick and clearly visible. That's everything we need for the rectangle, so let's run the code and see our detection. OK, we are detecting a person, which is me, and that's already a good start: we have object detection.

What else do we need? This is only the basic start of our program, and there is much more to it. Next I will write out what we are detecting, because how do we know that we are detecting a person? Let's see if it can pick up the mobile phone. It's not the best detection so far, but we will improve that later, so don't worry about it now. We're now detecting two objects, a person and a mobile phone, but we are not saying anything specific about them, so we also need to say what each object is. Let's do that.

In order to say what the object is we need to load the classes, and the classes come together with the files I made you download. Do you remember? You downloaded one zip file which, once extracted, contained three files inside dnn_model: yolov4-tiny.weights, yolov4-tiny.cfg and classes.txt. classes.txt contains a list of all the objects that our deep learning model can detect, in order, starting from class 0. The first is person, so if we detect class 0 it's a person; then come bicycle, car and so on. So if you want to know what your model can detect, just open this file, check the list and you will see
what it can detect. Just for your information, this is the COCO dataset. COCO is a dataset that contains, I think, around 120,000 to 150,000 images, so a lot of images, with all these objects, and the model was trained on these images. This is just for information in case you were curious how it was created.

Now we have the list. Of course we don't want to go and look things up in the list ourselves; the code should tell us what the object is. To get there, let's first show the class id with cv2.putText. Where do we put the text? On the frame, of course; everything will be on the frame. What text do we want to show? Let's show str(class_id), because class_id is a number while the text must be a string, so we convert it with str. Then where do we want to place the text? Let's put it together with the object, so that it follows the object, because that's what really matters. We place it at x and y minus 5, so that it stays with the object, at the top-left point, but slightly above it so it doesn't overlap the object. Then the font face. The font doesn't really matter right now, so let's take any font: cv2.FONT_HERSHEY_PLAIN. This is something we can make nicer later; for now any font that shows the text is fine. Font scale is the size of the text; let's use 1. For the color, let's use the same color that we used for the rectangle. Thickness of the text, let's say 2. Now we can run this and see what we get. We should see the object and then some text above each detected object. And that's what we see: we have the rectangle and then, above the object, 0. OK, let's
maybe increase the size of this text: instead of 1 let's say 2. Let's also give it some more space, 10 pixels, and run this again. Now we have the object detected and we have 0. If I hold up the phone, we have 67. So we have 0 and 67; what do 0 and 67 represent? It's very simple. Look at classes.txt: element 0, the first element, is person, so 0 is person. And 67 (the editor here starts counting lines from one, but index 67) is the cell phone. So the class id follows exactly the order of this list, and if we load this list we can easily say what each object is.

Let's do that right here, after we load the DNN model: load the class list. classes is empty at the beginning, because we don't have the names yet. Then let's load them: with open, and what do we want to read? The classes.txt file that we have in dnn_model; you have it as well, since it came with the files you downloaded. So open("dnn_model/classes.txt", "r"), with "r" because we want to read the file, as file_object. This is the way to open text files in Python. We have multiple lines, so let's loop through them: for class_name in file_object.readlines(). Let's print class_name to make sure we are doing this correctly, and run it. OK, I ran it and stopped the code. You see person, but there is a gap after each name, because in the text file, where we go to a new line, there is what Python sees as a backslash-n character. We need to remove that as well, and then we can put the names into classes. Our goal is to take all the values of class_name and put them into this list. So class_name = class_name.strip(), and then classes.append(class_name). Let's now print classes, with a label like "Objects list". Let's
print classes and run this one, and then I'm going to quit. We have person, bicycle, car, motorbike, airplane, bus, train, and so on. With these few lines I did nothing more than take this text file and put its contents into a list, so each value can easily be looked up: if we ask classes for index 0, this will be person; index 1 will be bicycle, and so on. So it should now be very simple to understand how we can use the class list later.

Right now we have the class id, and from the class id we can get the class name in real time: class_name = classes[class_id]. And instead of showing the class id on the frame, let's show class_name. We don't even need str anymore, it's just class_name. Let's run this. If everything is working correctly, I mean if there are no typos, we should see "person", which is working very well so far, and then we should see "cell phone", which is also detected very well.

If you get an error on this line right here, it means that you're using an older version of OpenCV. Before version 4.5.4 this is different, so if you have 4.4.0 or another older version you need to write class_id and then also index 0.
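
The class-list steps above can be sketched as two small functions. This is a sketch under assumptions, not the code from the video: the function names load_classes and class_name_for are made up for illustration, and the length check is just one possible way to cover both the plain class id and the one-element form that the transcript says older OpenCV versions return.

```python
def load_classes(path):
    """Read one class name per line, dropping the trailing newline."""
    classes = []
    with open(path, "r") as file_object:
        for class_name in file_object.readlines():
            classes.append(class_name.strip())
    return classes


def class_name_for(class_id, classes):
    """Return the name for a class id (hypothetical helper).

    The id may be a plain integer or a one-element sequence,
    in which case its first item is used.
    """
    if hasattr(class_id, "__len__"):
        class_id = class_id[0]
    return classes[int(class_id)]
```

With the folder layout described in the video, this would be used as classes = load_classes("dnn_model/classes.txt") and then class_name_for(class_id, classes) inside the detection loop.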
If you have version 4.5.4 or later, just follow along as I'm doing right now: class_name = classes[class_id]. So now we have the classes. We also have the score, but at the moment we don't care about it. Let's keep this very simple; maybe we can make some adjustments at the end. Let's move to the next step instead.

Let's now improve the resolution of the camera. We can also increase the size of the frame processed by the object detection deep learning model, so that we get better accuracy. Let's do these two things, starting with the camera resolution. Where do we do that? Not where we get the frame, but earlier, right after we have initialized the camera: when we initialize the webcam we choose the resolution. cap = cv2.VideoCapture(0) initializes the webcam. Then on cap there are a lot of settings that we can change: the buffer size, the camera resolution, the exposure, the autofocus, and all the things a camera can have. We care only about the size. So cap.set, then cv2.CAP_PROP_ in capital letters, and the autocomplete helps us from there. First let's set the width. Cameras normally start from a small resolution of 640 by 480, where the first value is the width; then we have HD, 1280 by 720. Let's use that: cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280) is the width, and cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720) is the height. These are the width and height of HD. Of course you cannot choose just any resolution you want, so you cannot put in arbitrary numbers; cameras have standard resolutions. HD is 1280 by 720; then we have Full HD (I'm going to put a comment), which is 1920 by 1080; then we have 4K, 8K and so on. I recommend going with HD, because the more you increase the resolution, the harder it
would be for the computer to process the information. Now how do we make sure that the camera resolution is set correctly? Let's run it. This is the camera: if the window is now bigger, it means we did a good job. If you get an error it might be that your camera doesn't support this resolution, but in this example I'm using a standard laptop webcam, and you should have this resolution as well. It's working well with person. Let's see with the mobile phone: not so good.

Let's now also increase the size of the frame processed by the deep learning model. Where we have 320 by 320, let's go with 416 by 416 and see how this runs. You might see that it's slower, just slightly slower, because a bigger image usually means better precision but slower processing. That's not always the case: it doesn't mean that if you make the image 2000 by 2000 you will get better detection. The model has its limits, and here the change doesn't improve the cell phone detection that much, so for the moment let's keep 320 by 320. If you're wondering about the precision and think it's not good, that is because this is a very small model, and I'm using a small model because I want everyone to be able to run this on any computer. You can increase the accuracy dramatically by using a ten times bigger deep learning model: you will see a huge difference in the detection, and with heavier models you will be able to detect smaller objects. But the steps of this tutorial stay the same; only the size of the model changes. So let's stick with this for the moment.

Now that we have this set up, let's make it more interactive. We're going to add some buttons that we can click: if we click person we display only the person, if we click another object we display only that object, and so on. The idea is that we add a few buttons, click one, and see the result in real time. So let's do that. How do we add
buttons? We can create them manually by setting a mouse callback function. Let's do that right here. First we create a window: cv2.namedWindow("Frame"). I'm creating a window because otherwise I cannot associate the mouse click with it. The window must have the same name as the one we use in imshow, where we display the Frame window; since we want to set up a mouse callback, we create that window first, before displaying anything. Then on that window we call cv2.setMouseCallback: the window name is "Frame", and then comes the function we want to execute when the mouse is clicked, click_button. We don't have the function yet, so let's create it now. We can put it anywhere; let's put it right here: def click_button with event, x, y, flags, params (it's event, not events). Don't worry about all these values; we don't have to do much with them, that's just the way mouse callback functions are called. Then we say: if event == cv2.EVENT_LBUTTONDOWN, so if we click the left button of the mouse, we can for example print the x position and the y position.

Let's run this. I want to prove that everything is working correctly, that when I click with the mouse we get the position. Later we will create a button, so when we click on it we know that it's the button that was clicked. Right now we have a lot of output, so it's hard to see anything. Let's hide the output we don't need anymore: I'm going to comment out the printing of class ids, scores and boxes, because we don't care about it now that we're already using the values. This way we have clean output and we will only see, when I click with the mouse button, the position I'm clicking on. That's the window, and as you can see the Python output is clean. Now I'm going to click, so we
have the click, and we have the x position and the y position of where the mouse was. It's probably hard for you to see, but let's click here: now the mouse is at another position, 125, 560, and so on. The plan is to place different buttons, and if the mouse click falls where a button is we activate that button, otherwise we deactivate it. Let's do that right now.

So, click_button with event, x and y. The mouse click concept is very simple, because we are going to draw the buttons as rectangles: if the click is inside a specific rectangle, it means we are clicking the button we've just created; if not, we're not clicking it. Here we have the coordinates, so the question is whether this x and y are inside the button. How do we know? First of all let's create a button. For the moment it doesn't really matter where in the code we draw it, so let's do that just after the detection: create button. How do we create a button? We simply draw a rectangle with cv2.rectangle. Where do we draw it? On the frame, of course, because the frame is all we have. The rectangle needs two points. We're going to draw the buttons somewhere at the top left of the screen, stacked one above the other. For point one let's start from position (20, 20), measured from the top left.

Let me explain how positioning works in OpenCV. If this is the OpenCV window, it starts at (0, 0) right here, in the top-left corner, and a coordinate is x and y. If it's (20, 20) it will be here: 20 counting from the left, and 20 counting down from the top. If the point is at (100, 20), the first value is 100 and the second is 20. Of course I'm using the mouse to write, not a pen, so it's not the
best, but you get the point. So this is (100, 20): we start counting from the left, 0, 1, 2, 3, up to 100 pixels, and that's the first value, left to right; then from top to bottom, 0, 1, 2, up to position 20. So this, for example, would be the point (100, 20).

Let's draw the button somewhere here. We have the rectangle from point (20, 20) to, let's say, point (150, 70). What is the color going to be? Let's make it red: on the scale from 0 to 255 we give 0 of blue, 0 of green, and a red that's not too strong, so only 200. Now we want to fill the rectangle, so for the thickness we say -1. If we give a positive value, for example 2 or 3, it sets how thick the border of the rectangle is; if we give -1 there is no border and the rectangle is completely filled with the color. If I run this we see the result: here is the red rectangle. So far we're just showing a rectangle, but that's how we create the button.

The next thing we can do is put text on it, to say what we want to display when we click there: cv2.putText. Where do we put the text? On the frame, of course. As for the position, we want the text somewhere inside the button. This is not very intuitive in general, because we need to set the coordinates manually. OpenCV is not the best tool for graphics and user experience; you should use something like PyQt for designing user interfaces. But for simplicity we're going to do it this way: cv2.
text frame uh now what text do we want to put let's say person so ideally if we want to detect only pairs on the person category we just click there and we activate the person then we will add a few other buttons if we want to detect the other categories we click the other buttons we will see that step by step so don't worry if well at the moment it seems hard it's not person 2020 so where do we place the text for the text the text let's say if this is the text person right here person okay this is enough i will not write anymore here we need to give this point right here the bottom left of the text so we give this point and then we draw the text the bottom left if this is 20 20 and the height is probably 70 i don't remember the height is 70 so we somewhere here it should be around 30 of the x and the height could be around 60. let's do that 30 and 60. of course if you have hard time with the coordinates this requires practice so you can just draw the rectangle in different position put the text in different positions and it will come natural uh to you after some trial and error now font face c2 that font as before we don't really care now about the font font actually actually plain font scale let's give three color let's make this white because we have red so let's make some good very high contrast white how do we make the white the white is all the colors together so the mixture of the maximum of the three primary colors so 255 to 55 will give us white the opposite would be the black where is the absence of all the primary score primary colors so zero zero zero would be black thickness all the text will be 3. 
And let's run this one just to make sure that everything is correct and the positioning is correct. Okay, it's running. The text is quite big and the button is small, but you get the idea. I'm also going to increase the rectangle so that "person" fits completely inside it, and then we check whether we are clicking inside the rectangle, which is the key step. Let's increase the rectangle: instead of 150 let's say 220, so we just make it wider.

Then we need to check the click. How do we know if we are clicking inside the rectangle? We have to match these positions somehow: we have the rectangle, and we've received this x and y, so we need to check whether x and y is inside the rectangle. How do we do this? We have the function cv2.pointPolygonTest. It's probably not the best structure, and it's not the most intuitive, but let's take the coordinates of the rectangle. We need to convert the rectangle to a polygon, because the OpenCV function works with a polygon to check whether a point is inside it. Let's do that quickly, and this will also be good practice for you.

How can we draw a polygon which is a rectangle? We take four points: instead of giving top left and bottom right, we take top left, top right, bottom right and bottom left. So the polygon is: (20, 20) is the top left, then we have (220, 20), (220, 70), and then we have (20, 70).
So we have the four points which make this rectangle, and I can prove it to you, because instead of rectangle I can now say cv2.fillPoly, so we fill a polygon on the frame. What are the points? pts will be the polygon. Color: let's make this (0, 0, 200), and that's it. I'm going to run this; we will get an error, but I will explain why. Nothing should change on screen: we're just converting the rectangle to a polygon.

Now we are getting an error. We are getting an error because we are passing the coordinates in a form that the fillPoly function doesn't accept. This is important to know in computer vision: when we use coordinates with OpenCV, they must not be plain Python lists but NumPy arrays. So let's import it, import numpy as np, and use np.array; that's how it should be. I'm going to run this first to make sure that it runs correctly and there are no typos, and later I can explain why we're doing this. Okay, let me change this: the type of the array must also be corrected, and this must be an array inside an array. This is of course not the most intuitive function that exists, but that's how it works.

And why NumPy? There is a very simple reason: NumPy is much faster. NumPy is a library with a Python API, so we can call it with very simple code, but it is very fast at processing the data, while plain Python lists are slow, because Python is a slow language. In computer vision, especially when we work in real time, frame after frame, it's essential to have enough speed to process around 30 frames per second. As you see, the result is exactly the same: it's just a rectangle, but we converted it from a rectangle to a polygon. Now let's take this one, okay, let's make a copy of this polygon, and let's now
check one thing. When we click with the left button of the mouse we get x and y, and now we also have the polygon. To make sure that we are clicking inside the button, we need to check: is x and y inside the polygon? Let's call the result is_inside: is_inside equals cv2.pointPolygonTest. We pass the polygon, then the point (x, y), and then False. We're just going to check whether it's inside. If we say True here, it measures the distance, so if the point is outside it can also tell how far it is from the polygon. We don't care about that; we just want to know whether it is inside or not, so we say False. If is_inside is greater than zero, it means the point is inside, so we print "we are clicking inside the button", and let's also print the position x and y.

Let's run this one so I can prove to you that it's working, and you can prove it yourself by running the exact same code, or with any adjustments you want; of course you're free to change anything. Let's click: now I click outside, and we're just printing x and y, so nothing else. If I click inside, we see "we are clicking inside the button", and this is the position: 207, 444.

What can we do now that we know whether we're clicking inside or outside the button? We can make a switch: if we click inside, we switch it either on or off. Let's create that now. It's not the most intuitive approach, and we are also going to clean up the code later, so just follow along; if it's hard for you, put up with a bit of struggle, because, trust me, it's worth it. Let's define button_person = False. When we're clicking inside the button, we want to activate the button if it's not activated, and deactivate it if it's activated, so it's a toggle button. If it's inside: if button_person is False, button_person equals True; elif button_person... no, okay, let's say else: button_person equals False. But button_person doesn't exist right here, because we
need to bring it inside the function. To bring it inside the function we can just declare global button_person. And let's print: "now button person is", and we say whether it's active or not active, so button_person, True or False. Let's run this one and see how it works. Okay, now we click person: now button_person is True. We click person again: now it is False. We click again: now it is True.

Let's now understand how to put a condition on what to show on the screen. Right now we are detecting person anyway; if we click the button nothing happens, because we are not doing anything with the button yet. Let's go to the code. This will be very simple, because we have class_name right here, and each time we detect something, if we print class_name, we know exactly what we are detecting each time, among the classes on the list of course. Let's run it and see what we get. Of course on each frame there is person. If I put some other object in view, let's see if it can get this one: cell phone. We see cell phone as well, right here.

We can choose what to display. We can say, for example, if class_name equals "person", then we display person. So we are putting an if condition on the class name: only if we have person will the bounding box be displayed. Now if I show the phone, there is no way it is going to be shown as a cell phone, because we only have the condition for class_name "person". And if I say instead if class_name is "cell phone", of course I am not going to be detected, because now the condition is for the cell phone. Let me show you: now there is no bounding box, but if I hold up the cell phone, you see, now we have cell phone. Now let's use the class name, but let's also use the switch button. So: if class_name is "person", and what is the second condition? We want to make sure that the class name is person but also that the person
button is activated: and button_person is True; then we display it. So now we have some potential: the program is interactive, because we can click the mouse to activate or deactivate the detection of person in real time. By default person is not activated. If I click anywhere on the screen, you see we get the coordinates where I'm clicking, and nothing happens. If I click on person, now button_person is True, it's activated, and we are detecting: we have the bounding box around the person. If I click again we deactivate it; if we click again we activate it.

We can add buttons for as many objects as you want; they just need to be on the list, at least for this small project. Of course you can create your own custom detector. That is more advanced and I can't cover it here; I have it in the object detection full course on pysource.com, which shows how to detect any object. But right now we have 80 different objects that we can easily detect for this project.

Let's now do one thing: let's add more buttons. I know this seems very hard and not intuitive. I made these buttons from scratch to show you how you can find a way to create things from very basic functions. We don't have a button function in OpenCV, but we can draw a rectangle, we can draw text, and we can choose what color to put on the text and on the rectangle, so of course we can create a button, and we can create whatever we want on the window, because we have a lot of freedom. And now, as I don't want to spend a lot of time creating a lot of buttons, I have created a library, a file called gui_buttons.py, which does that dynamically: we have functions to add a button and to display the buttons, and we can detect when we click a button. I've created this one and you can download it, so I will give
you this one, actually: when you download the DNN model you will also find gui_buttons.py. With it we can create a button with just about one line of code, and then a couple of lines to activate the buttons. You will see how useful this file is and how it will make your life easier.

I'm going to remove the buttons that I've created so far, because we don't need them. All the polygon code that we have right here, we don't need, so let's keep the click_button function very simple: from click_button to the if on the event, and then print x and y. Let's remove this condition right here, so now we display everything, and let's also remove the part where we create the button. Instead, you need to put the file called gui_buttons.py in the same folder as main.py, and we're going to import it: from gui_buttons import Buttons. Now let's initialize the buttons: button equals Buttons().

With this library, and it's more a single file than a library in itself, which I've created and you can download, we can add a button very easily: button.add_button. What do we want to display? The text must be one of the classes on the list, so from classes.txt we can choose the object that we want to display. Let me pick person, so we add "person" right here. Now the position: we need to give only the top-left point of the button, let's say 20, 20, and the rest is calculated automatically. So 20 on the x and 20 on the y. Now we already have the button, in just one line; it's very simple, and you don't have to go through all the hassle I've just shown you.

After we've created the button, let's display it on the screen, because now we have the button, but if I run this one there is nothing, since we are not displaying it on the screen. So we need to take the button and display it on the
frame, and of course, to make things very simple for you, I have also created a function for this: button.display_buttons. Where do we want to display the buttons? On the frame, and nothing else for the moment. Let's run this one, and later we can see how to work with it, how to activate and deactivate it, and how to change the objects that we display. Now we have a button called person. If I click on it nothing is going to happen: when we click the mouse we need a function to make the button activate or deactivate. How do we do that? Right here we say: if event is cv2.EVENT_LBUTTONDOWN, button.button_click. Now we need to give where we clicked: x position and y position, so x, y. Let's run this one. At the start the button is not activated. Now I'm going to click right here, and it's automatic: when it's activated it's in full color, and like this it is not activated; you also see False or True printed.

Now, how do we get the list of active buttons, and the classes? How do we choose the condition here? I have a function for that, which gets the list of the active buttons, because this will work with multiple buttons, as we will see very soon. Get active buttons list: active_buttons equals button.active_buttons_list(). That's all; you don't really have to do much. Let's print active_buttons. Right now we see that active_buttons is an empty list. If I click person, active_buttons contains person. We can add multiple buttons.

Now the class name. Before, we were checking if class_name is, for example, "person", then we display it, or if class_name is "banana", we display that. We can do this when we want to hard-code a condition, but we have a better way: if class_name is in active_buttons. So if the class name is on the list of the active buttons we display it, otherwise we don't. So
let's try person, and you see that person now has this violet color. Let's see cell phone: cell phone is somewhere between red and brown, it seems. Keyboard: keyboard is similar to cell phone. Then we have remote and scissors, so let's try all of them: now we have the remote, and we have the scissors. And then you can try this with the other objects that you have.

What you will be thinking about now, and it's probably one of the most important things, is deploying the program. Now that the program is ready, let's make it into an executable file, so that everyone can run it even if Python is not installed. This is very easy: you make the program into an exe, then you put it on any computer, and it will be ready to run without any installation. Let's do that.

Before we convert this into an exe file, there is an important thing to add regarding closing the window, because right now this button on the window doesn't work, and if I press any key on the keyboard that doesn't work either. What we will do is handle a key event, so that we can press a key on the keyboard and close the program. In the code the last line is cv2.waitKey(1); we want key = cv2.waitKey(1). This call waits for us to press a key: if we don't press a key it goes on and shows the next frame, and so on; if we press a key we can decide what to do with that key. For example, if key equals 27, and 27 is the Esc key on the keyboard, we can quit, so we break the loop. If I run this one and press the Esc key, the program stops.

There is another thing that it's good practice to add, and that is cap.release(), because we need to release the camera. If for any reason the Python code is still running, even if you're not displaying anything, it will keep hold of the camera, and if you then open the camera in other software, like Skype or Zoom, it will say the camera is in use by some other software, Python. So we always need to release the camera, and also
let's destroy all windows: cv2.destroyAllWindows(). We run this one and press Esc, and with this code we have a guarantee that we release the camera and close everything.

Let's now generate an executable file for Windows. This means your program will be converted from Python to an exe file, and you can use it on any computer without any installation. Right now you need Python and a few libraries; when you put everything into one package, you can take it to any computer without installing anything. This is great when you want to deploy the software and give it to people who don't know anything about Python, so that they can run and use what you just built, and it's essential for commercial projects. It's very simple, because there is a library which does everything for us.

First of all, let's install the library. We go to cmd, so we open the command prompt. It's a Python library, so we can install it with pip install cx_Freeze. Installation of this library is very quick; I already have it installed, so in my case we get "requirement already satisfied". Let's close this one.

Once we have the library installed, we need to create a setup file, which is a Python file where we simply give a few instructions about our program. We put the setup file in the same folder as the main file, so let's do that: new Python file, setup.py. So in the same folder we have main, we have setup, gui_buttons, and the DNN model subfolder. What is the setup file about? Here we write very few lines. Of course you can personalize this in many ways; the cx_Freeze library gives you a lot of options, and I will put the link to cx_Freeze on the blog post if you want to know more about exe files. For the moment, let's import from cx_Freeze, which has a capital letter in this case, so keep this in mind, because
if you put a lowercase f you're going to get an error. So cx_Freeze, with an uppercase F: from it we import setup and Executable. Now we write setup, and setup is nothing more than a function where we give a few instructions. First of all, the name of our app. Of course you choose the name of your app; I will call this "object detection software", let's say "simple", because it's very simple and there is nothing advanced about this software. Then version equals 0.1. It doesn't really matter; it's just there to keep a history of your app, so if you make an update you can say version 0.2. It's up to you: this is just text, and you can put pretty much whatever you want here. Then we have description, and it's the same, a simple description of our software: "This software detects objects in real time". And then executables: here we need to say what our program is built from, so right here we put the main file that we are using, the one that holds our code. Normally I'm running the main.py file, which is the file that has all our code, our program, so we need to put that one in Executable: we put main.py. Of course, if your file has a different name, you need to put that exact name.

Now our setup script is pretty much ready. Normally we need to run this from the command line. I open cmd, and we need to run python, then the path to setup.py, and then build. In my case it's a very long path, so I'm just giving you the idea, and later I will build it another way. If my program were on C:, the line that we need to run would be python, then the path on C: to the program's setup.py, and then build. Of course here you need to put your own path, so check the path where you have the file and put it right here. If your path has spaces in it, remember to put quotation marks before and after it. So if your
path were that C: program path, or like mine with a space in it, use the quotation marks, then build, and then you press Enter. I'm not going to do that, because I have multiple versions of Python installed, so it would run the version that I don't want. I'm going to use PyCharm instead. If you have PyCharm, just follow exactly what I am doing: on the setup.py file, we select setup right here, we choose Edit Configurations, and we go to Parameters. The parameter right here will be build, and then we click Apply. With this change we're running the program with the argument build, and then we can run it. So I'm going to run setup. If everything is working correctly you will see "running build" and "running build_exe", and it is going to take from a few seconds to a few minutes.

Keep in mind that this is going to bundle your Python libraries into the exe folder. If you have other libraries installed that are not needed for this tutorial, the build is most likely going to include all the libraries that you have for Python. So I recommend doing this with a clean installation of Python, because otherwise the output will be very large and you might encounter some errors.

Now the build is complete, and we can check that in the folder where we have our program. We have setup and main, and you will see a new folder called build. Inside build we have a folder whose name starts with exe.win, and in there we have our main exe file. We're going to run this one, and at first it's not running, for one reason: we have the main file, but we also need to put the DNN model together with it. So I'm going to copy the folder, which includes the classes, the YOLO tiny configuration and the weights, and put it in the build: we go to build, then the exe.win-amd64 folder, and we paste the DNN model folder right here. There is our entire program. If I run this one, this is the program running in real time from the exe file. The only thing that I need to take is this folder: we take this exe folder, we can move it to any computer and, even without Python, it
will just run. It's so simple to run it anywhere when you do this, and of course I can prove to you that it's working: we have person, we have cell phone, and it's working exactly as it was before. On the side we also have a window with the console output. We can of course hide this window; there are options to hide it when we build with setup.py. I'm not going into that right now, because it would take a long time and it's not necessary for the purpose of this video.

This is all for this crash course. If you've got to the end, first of all, congratulations, because most people just start a tutorial and only a very small part get to the end. With this you have surely learned a lot about the basic functions of OpenCV, about object detection, even if we used a lightweight tiny YOLO model, and, very importantly, about how to prepare a Python file and turn it into an executable file for deployment, so that you can use it on any computer you want. Let me know what you think about this tutorial and what you would like to see next, because even if I don't always reply, I always read all the comments, and when I see topics that are asked about quite often I usually either make a free video or put some more material in my courses. This is all for today. See you in the next one.
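
Looking back at the interactive part of the walkthrough, the toggle-and-filter logic can be sketched in plain Python, without OpenCV. Everything named here (ToggleButton, active_labels, filter_detections, the default button size, the sample boxes) is hypothetical and chosen for illustration; it is not the API of the gui_buttons.py file from the video.

```python
class ToggleButton:
    """On/off button covering an axis-aligned rectangle (hypothetical helper)."""

    def __init__(self, label, x, y, width=200, height=50):
        self.label = label
        self.x, self.y = x, y
        self.width, self.height = width, height
        self.active = False  # buttons start switched off, as in the walkthrough

    def contains(self, px, py):
        """Return True if the point (px, py) lies inside the button."""
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)

    def click(self, px, py):
        """Flip the on/off state if the click landed on this button."""
        if self.contains(px, py):
            self.active = not self.active


def active_labels(buttons):
    """Labels of all buttons that are currently switched on."""
    return [b.label for b in buttons if b.active]


def filter_detections(detections, buttons):
    """Keep only (class_name, box) detections whose class has an active button."""
    allowed = set(active_labels(buttons))
    return [d for d in detections if d[0] in allowed]


buttons = [ToggleButton("person", 20, 20), ToggleButton("cell phone", 20, 100)]
for b in buttons:
    b.click(50, 40)  # this click lands on the "person" button only

detections = [("person", (10, 10, 100, 200)), ("cell phone", (300, 50, 60, 120))]
print(filter_detections(detections, buttons))  # [('person', (10, 10, 100, 200))]
```

Checking membership in a list of active labels is what lets one condition serve any number of buttons, instead of one if statement per class.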

Info

Channel: Pysource

Views: 91,217

Id: bUoWTPaKUi4

Length: 80min 17sec (4817 seconds)

Published: Tue Jan 11 2022