Realtime Object Detection Using OpenCV Python ON CPU | OpenCV Object Detection Tutorial

Video Statistics and Information

Captions
Hello everyone and welcome back to the channel. In one of our earlier tutorials we did object detection with YOLOv4 using OpenCV, but it requires a GPU for decent FPS and was only giving us two to three frames per second on CPU. In today's video we will implement real-time object detection using OpenCV on CPU. Let's get started.

First, you need to install the latest versions of opencv-python and opencv-contrib-python using pip; I already have them installed. As you can see, I have two folders here: model_data, which contains the pre-trained model available in the OpenCV official documentation, and I also have the list of class labels on which the model was trained. This is the standard list for the COCO dataset, which you can find in its official GitHub repository. I have uploaded everything to this GitHub repo, which you can download. Coming back to Visual Studio Code, I also have the two videos that we will be using for real-time object detection.

All right, let's create two files, detector.py and main.py. Let's start by importing the cv2, numpy, and time modules. Now we will implement an object detection class called Detector. The constructor of the class takes four arguments: the video path, the config path, the model weights path, and the class labels path. We assign these arguments to class-wide variables that start with self, so they are available in all the methods we will implement within this class.

Let's now initialize the network and set some parameters. We initialize cv2.dnn_DetectionModel and pass the model path and the configuration path as parameters. The network is initialized as self.net, and then we set the input size on which the model was trained, which is 320x320. All images are scaled to the range -1 to 1, and a per-channel mean of 127.5 is subtracted from each of the three channels. As OpenCV loads images in BGR format, we need to swap the R and B channels to make it RGB. All of these parameters and settings are available in the OpenCV official documentation. So that's it, our model is set up.

Now we need to read the class label list from coco.names. For that we'll define another method called readClasses: we open the class file and store all its entries as a list inside a variable called self.classesList. Let's print self.classesList to be sure that we have loaded the file correctly. This method should then be called within the __init__ method above. There we go.

Now head back to the main file. Let's import everything from detector.py and also import the os module. Let's define a main function and declare a variable videoPath, in which we give the path to one of the videos that we have. The next thing we need to define is the configuration path, which is the .pbtxt file. For that let's use os.path.join so that the path to the file is built correctly by the os module, because the path separator differs between Windows and Linux and OpenCV sometimes returns an error while loading the model if we hard-code the path. The folder that contains the configuration file is given as the first argument and the config file name as the second argument. Similarly we define the model path, which is frozen_inference_graph.pb, and finally the path to the class labels list, which is coco.names inside model_data. All right, now we need to initialize our Detector class, which takes these four parameters; we can just quickly copy them from the class implementation and paste them here, as the variable names are exactly the same.
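Putting the steps described so far together, here is a minimal sketch of what detector.py and main.py might look like at this point. The exact video file name and the .pbtxt config file name are not spelled out in the captions, so the ones below are placeholders, and the class, method, and variable names are my own reconstruction of what is described:

# --- detector.py (sketch) ---
import cv2
import numpy as np
import time

class Detector:
    def __init__(self, videoPath, configPath, modelPath, classesPath):
        self.videoPath = videoPath
        self.configPath = configPath
        self.modelPath = modelPath
        self.classesPath = classesPath

        # initialize the detection model from the frozen weights and the pbtxt config
        self.net = cv2.dnn_DetectionModel(self.modelPath, self.configPath)
        self.net.setInputSize(320, 320)               # input size the model was trained on
        self.net.setInputScale(1.0 / 127.5)           # scale pixels to roughly [-1, 1]
        self.net.setInputMean((127.5, 127.5, 127.5))  # per-channel mean subtraction
        self.net.setInputSwapRB(True)                 # OpenCV loads BGR, model expects RGB

        self.readClasses()

    def readClasses(self):
        # load the COCO class labels, one label per line
        with open(self.classesPath, "r") as f:
            self.classesList = f.read().splitlines()
        print(self.classesList)

# --- main.py (sketch) ---
import os
from detector import *

def main():
    videoPath = "test_videos/video1.mp4"  # placeholder file name
    configPath = os.path.join("model_data", "ssd_mobilenet_v3_large_coco.pbtxt")  # placeholder config name
    modelPath = os.path.join("model_data", "frozen_inference_graph.pb")
    classesPath = os.path.join("model_data", "coco.names")

    detector = Detector(videoPath, configPath, modelPath, classesPath)

if __name__ == "__main__":
    main()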
Now we call the main function and run the code. We can see that the class labels are loaded and printed, but there is a problem: index 0 holds the person class, while the model predicts 0 as background. We can fix it by adding an entry at index 0 of the class list, which we'll call __background__. If we run the main code again, the background class is now at index 0, and the mapping between the model's predictions and our class list works fine.

Let's get back to detector.py and define another method called onVideo. We open the video through cv2.VideoCapture and pass self.videoPath as a parameter. If the video does not open successfully, we print an error and return from the function. Otherwise we read the first frame from the video by calling cap.read; we store the frame in a variable called image, and the call also returns a flag indicating whether the frame capture was successful. Next we run a while loop for as long as frames are successfully captured.

Now we feed the captured image to our network and define a confidence threshold for detections, which is set to 0.5. This call returns a collection of class label IDs, their confidence scores, and the bounding boxes found in the image. Let's convert the bounding boxes and confidences to lists. We also apply non-maxima suppression to the bounding boxes by calling cv2.dnn.NMSBoxes. This method eliminates overlapping bounding boxes; the threshold for the overlap is set by nms_threshold, and it returns the indexes of the bounding boxes whose overlap is below that threshold. I will show you at the end what I mean by that. Let me just close this so that we can see the complete code.

All right, if the list of non-overlapping bounding boxes is not empty, we proceed with drawing the boxes on the image. We loop over all non-overlapping bounding boxes and extract each bounding box from the original list at the index returned by the NMS function. It's going to be an array, so let's squeeze it to get the integer. The same goes for the class confidence, which we extract from the original list of confidences at the bounding box index. Then we extract the class label ID from the original list of class IDs at the same index. This class label ID is an integer representing the class index, and we can look up the textual label at this index in the classes list.

Now we unpack the bounding box to get its x and y coordinates as well as its width and height; this information is used to draw the box at those coordinates. Let's call cv2.rectangle and pass the image, then the starting coordinates of the bounding box, which are x and y, and then the ending coordinates, which are x plus the box width and y plus the box height. For the color of the bounding box let's go with white, which is (255, 255, 255) for all three channels, and a thickness of one pixel. Let's show the frame using cv2.imshow.

We also need a mechanism to break out of the loop, so we record key presses, and if the key is q we break the while loop; otherwise we proceed with capturing the next frame of the video by calling cap.read again. Once the loop breaks, we destroy all cv2 windows. Let's go back to the main file and call the onVideo method; a sketch of this step follows below.
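A rough sketch of the onVideo method as described, with the __background__ fix applied in readClasses. Names follow the sketch from earlier, and details such as the exact list conversions are my own approximation of the walkthrough; at this point in the video the boxes are still drawn in plain white:

    def readClasses(self):
        with open(self.classesPath, "r") as f:
            self.classesList = f.read().splitlines()
        # the model predicts index 0 as background, so align the list with that
        self.classesList.insert(0, "__Background__")
        print(self.classesList)

    def onVideo(self):
        cap = cv2.VideoCapture(self.videoPath)
        if not cap.isOpened():
            print("Error opening video file")
            return

        success, image = cap.read()
        while success:
            # run the detector on the current frame
            classLabelIDs, confidences, bboxs = self.net.detect(image, confThreshold=0.5)

            bboxs = list(bboxs)
            confidences = list(map(float, np.array(confidences).reshape(-1)))

            # keep only boxes whose overlap is below the NMS threshold
            bboxIdx = cv2.dnn.NMSBoxes(bboxs, confidences,
                                       score_threshold=0.5, nms_threshold=0.2)

            if len(bboxIdx) != 0:
                for i in range(len(bboxIdx)):
                    idx = int(np.squeeze(bboxIdx[i]))
                    bbox = bboxs[idx]
                    classConfidence = confidences[idx]
                    classLabelID = int(np.squeeze(classLabelIDs[idx]))
                    classLabel = self.classesList[classLabelID]

                    x, y, w, h = bbox
                    cv2.rectangle(image, (x, y), (x + w, y + h),
                                  color=(255, 255, 255), thickness=1)

            cv2.imshow("Result", image)
            if (cv2.waitKey(1) & 0xFF) == ord("q"):
                break
            success, image = cap.read()

        cv2.destroyAllWindows()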
We can see that the bounding boxes are drawn on the objects, but all classes are represented by the same white color, so what we need is a unique color for each class. Let's go to detector.py and scroll up: in the readClasses method we can define random colors using numpy's random.uniform, ranging from 0 to 255. We need as many colors as the length of the class list, with three channels each, which gives us a randomly generated unique color for each class. Since we already have the class label index, we can get the class color from our list of colors; let's convert each value to an integer and extract the color from the list at the class label index. Now we can replace the white color here, and there we have it: each class has its own unique color.

But the class label and the confidence of the bounding box are still missing. Let's go back to detector.py and create a display text. We use string formatting: a string, then a float with four decimals, and then pass these parameters to the format method. Now that we have the text, let's call cv2.putText and pass the image and the display text. I want to put it in the top-left corner of the box, so the starting point is x, and a little above the box, so y minus 10 will do. It's displaying the text as well, but four decimals don't look good, so let's cut it to two decimals. Great.

There is one more thing: each time we run the code, a new random color is generated for every class. If we want the same colors on every run, we can fix numpy's random seed to, say, 20; now if we run the code multiple times, the same colors are generated for each class. 

One more thing that I want to do (let me put a marker here): I want to draw thick corner lines on each bounding box to beautify it. Let's see how we can do that. Let's draw a line of 30 pixels at each corner. We call cv2.line, pass the image, then the x and y coordinates of the starting point of the line, then the ending coordinates, which are x plus the line width while y stays the same. It has the same color as the class and a thickness of 5 pixels. Let's see what it did: here we can see a line at the left corner. Let me show you on the second video. Now I also want a line along the y-axis as well, so let's copy this line, and this time draw from (x, y) to (x, y plus line width). There we have it. Now for the right corner, let's copy these two lines and paste them here; this time I want a line from (x plus w, y) to (x plus w minus line width, y), and the second line from (x plus w, y) to (x plus w, y plus line width). Perfect. Let me quickly draw the remaining two corners as well (let me put a marker here). There you go, it looks beautiful.

But there is a problem with small bounding boxes: because we have fixed the line width to 30 pixels, we should make it relative to the size of the bounding box. So let's fix it by changing the line width to 30 percent of the box width. This gives us a float, so let's convert it to an integer. Cool, that fixed the small bounding boxes, as the line width is now relative to the bounding box width. But as a bounding box grows horizontally we would have a problem, and yep, here we can see that the car has a weird bounding box. Let's also consider the height of the bounding box, take 30 percent of the height as well, and use the minimum of the two values. That worked great and fixed the issue.
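For reference, a sketch of how the per-class colors, the label text, and the corner lines described above could look. These lines would sit inside readClasses and the drawing loop of the onVideo sketch shown earlier, replacing the plain white rectangle; the font for the label is my own choice, since the captions do not name one:

    # in readClasses(): one fixed random color per class
    np.random.seed(20)
    self.colorList = np.random.uniform(low=0, high=255,
                                       size=(len(self.classesList), 3))

    # inside the drawing loop of onVideo(), in place of the white rectangle
    classColor = [int(c) for c in self.colorList[classLabelID]]
    displayText = "{}:{:.2f}".format(classLabel, classConfidence)

    x, y, w, h = bbox
    cv2.rectangle(image, (x, y), (x + w, y + h), color=classColor, thickness=1)
    cv2.putText(image, displayText, (x, y - 10),
                cv2.FONT_HERSHEY_PLAIN, 1, classColor, 2)

    # corner length relative to the box, so small and wide boxes both look right
    lineWidth = min(int(w * 0.3), int(h * 0.3))
    # top-left corner
    cv2.line(image, (x, y), (x + lineWidth, y), classColor, thickness=5)
    cv2.line(image, (x, y), (x, y + lineWidth), classColor, thickness=5)
    # top-right corner
    cv2.line(image, (x + w, y), (x + w - lineWidth, y), classColor, thickness=5)
    cv2.line(image, (x + w, y), (x + w, y + lineWidth), classColor, thickness=5)
    # bottom-left corner
    cv2.line(image, (x, y + h), (x + lineWidth, y + h), classColor, thickness=5)
    cv2.line(image, (x, y + h), (x, y + h - lineWidth), classColor, thickness=5)
    # bottom-right corner
    cv2.line(image, (x + w, y + h), (x + w - lineWidth, y + h), classColor, thickness=5)
    cv2.line(image, (x + w, y + h), (x + w, y + h - lineWidth), classColor, thickness=5)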
Now let me explain what I meant by the NMS threshold. Let's go back here and change it to a big value, say 0.8, and let me change the video. You can see that there are overlapping bounding boxes on this traffic light, and they are not being eliminated because we have set the value to 0.8. So let's change it back to 0.2: now any bounding box with an overlap greater than 20 percent will be eliminated, and the traffic light no longer has two bounding boxes.

Okay, the final thing that remains is to show the FPS. We can calculate it by initializing a start time of zero outside the while loop, then getting the current time, dividing one by the difference between the current time and the start time, and we have our FPS; then we set the start time to the current time. Now we need to put this text on the image. Let's do that before imshow: we write "FPS" and concatenate the FPS value by converting it to an integer and then to a string. Its location is the top-left corner, offset 20 pixels along the x-axis and 70 pixels along the y-axis; the font is plain, size 2, green color, and a thickness of 2. There it is: 17 to 18 FPS, which is almost real time. It can give you a higher FPS if you have a higher-end CPU; I'm just running it on a laptop. Let's run it on the second video: same FPS. And if you want to run this code on your webcam, just change this parameter to zero and you will be using the webcam (see the sketch after the captions).

So that's it for today's tutorial. If you have learned something of value today, hit like and subscribe to the channel, and I will see you next time.
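To recap the FPS overlay and the webcam switch in one self-contained snippet, here is a stripped-down sketch without the detector, showing just the timing logic and the cv2.VideoCapture(0) change; the window name and variable names are my own:

import cv2
import time

cap = cv2.VideoCapture(0)   # 0 = default webcam; a video file path works the same way
startTime = 0
success, image = cap.read()
while success:
    # FPS = 1 / time taken to process the previous frame
    currentTime = time.time()
    fps = 1 / (currentTime - startTime)
    startTime = currentTime

    # detection and drawing would happen here in the full Detector class

    cv2.putText(image, "FPS: " + str(int(fps)), (20, 70),
                cv2.FONT_HERSHEY_PLAIN, 2, (0, 255, 0), 2)
    cv2.imshow("Result", image)
    if (cv2.waitKey(1) & 0xFF) == ord("q"):
        break
    success, image = cap.read()

cap.release()
cv2.destroyAllWindows()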
Info
Channel: TheCodingBug
Views: 37,826
Keywords: object detection opencv, opencv object detection, opencv object detection python, object detection on CPU, real time object detection, cv2 object detection, mobilenet ssd object detection, mobilenet ssd object detection tutorial, opencv object detection tutorial, object detection, object detection using opencv, object detection python, object detection with opencv, object detection with opencv python, real time object detection on cpu, object detection opencv python, opencv
Id: hVavSe60M3g
Length: 17min 11sec (1031 seconds)
Published: Thu Oct 14 2021