Object Detection using OpenCV | Python | Tutorial for beginners 2020

Captions
Hello everyone. If you want to learn the line-by-line code implementation of this live video demo for object detection, please stay tuned till the end of this video. We will start from scratch with the concepts, such as the difference between object detection and image classification, and by the end of this video we will have implemented the code line by line. So even if you do not have any prior knowledge of deep learning concepts, of what object detection and image classification are, this video is for you guys. For the implementation we will use Python as the programming language, and for drawing these rectangles and text we will use OpenCV. As we are targeting beginner-level students, in this video we will not train the object detection classifier; rather, we will use a pre-trained deep learning architecture based on TensorFlow. In this regard we will use OpenCV to load already pre-trained TensorFlow frozen models. Let's get started.

My name is Shanullah and I'm a PhD student at Inha University, South Korea, and my major is computer vision and deep learning. At the end of this video you will also deploy our object detection algorithm for a live webcam demo.

Okay, before diving into the implementation, let's have some brainstorming. In this image we can see that there is a dog, there is grass, and there is a background as well. But if you give this image to any deep learning classifier, such as AlexNet, GoogLeNet or MobileNet, it will be classified as a dog. Why? Because the dog is the most salient feature of this image; even a human will definitely classify this image as a dog. Similarly, we will classify this image as a fish, because in this image the most salient feature is the fish. Similarly, here the dog is the most salient feature, and here the elephant is the one which is most salient. But what about this image? It will be unfair if we classify this image as a dog or a cat, because both of them are the most
salient features. So this is the problem of this video, and the solution to this problem is object detection, so that we can classify both of them and localize them.

For simple image classification and recognition, the best-known classifiers are AlexNet, GoogLeNet, MobileNet, VGGNet and many others. In 2012 AlexNet was the pioneer: it won the ImageNet image classification challenge by a wide margin, which was a major breakthrough in deep learning. After that, deep learning architectures have been revolutionized day by day. The famous dataset used for image classification is ImageNet, which contains 1000 classes; for object detection, however, the famous dataset is COCO, which has 80 classes.

As you can see, object detection is basically a combination of classification and localization, so in this regard we will use a combination of two or more algorithms. For example, in the previous slide we showed that MobileNet is an image classifier, so we will use the Single Shot MultiBox Detector (SSD) in combination with it for better object detection. What happens in it? For example, there is an image. Roughly speaking, what the Single Shot MultiBox Detector does is divide the image into small patches; then, based on the most salient feature, it joins a combination of those patches and asks the classifier, MobileNet or whatever the combination is, to classify that region. For example, in this case, if there is a cat and dog image, what SSD MobileNet does is divide it into different patches, make the most salient region from a combination of the patches, and then ask the classifier. So it is the combination of two algorithms. Similarly we do have
other algorithms like YOLO (You Only Look Once), which is also a very famous one, but the most lightweight and fastest one is SSD MobileNet, and in this regard the most recent one is SSD MobileNet version 3, which is the best among version 1, version 2 and version 3. So now we have defined the problem, what object detection is.

Let's dive into the implementation, but before that, one other very important concept is image segmentation. These are the three important concepts: image classification, object detection and then image segmentation. So what is image segmentation? Rather than classifying an object, it is to classify each pixel. Over here you can see, for example, that this man is a person, so every pixel of him has been classified and is shown in red; all the persons in this video are shown in red. Similarly, the dataset is different, like Cityscapes. As I said, for object detection the famous dataset is COCO, and for image classification it is ImageNet, so we definitely need to care about the dataset as well. But in this video we are targeting only the concepts and the implementation, so let's dive into the implementation.

The ingredients we will use in this video are, for example, Anaconda, which is a package containing different IDEs; an IDE is the environment in which we write our code. We have Spyder, VS Code and Jupyter Notebook, but we found Jupyter Notebook the best candidate for this tutorial. Why? Because we can make notes, and there are a lot of other features in Jupyter Notebook that you will see in this video. After installing Anaconda and Jupyter Notebook, we will install OpenCV. Let's get started with the implementation.

In order to install Anaconda, all you need is to write "download anaconda" in the Google search bar, and once you write it you will find the
first link. If you click the first link, you will find many executables at the bottom of the page, so based on your system, whether you are using Windows, macOS or Linux, you can download the corresponding executable. Once you have downloaded it, all you need is to press next, next, next; it is super simple to install.

Now I assume that you have already installed Anaconda. If you type "anaconda" in the Windows search bar, you will find Anaconda Navigator, and if you open Anaconda Navigator you will reach this page. Over here you can observe that we have green buttons to install and blue buttons to launch. In your case, as you have just installed the Navigator, all of them will be green, and you can click the install button of Jupyter Notebook.

Now I assume that you have installed Anaconda Navigator and Jupyter Notebook. Once you have installed both of them, it's better to create a new folder. In my case it's full, but in your case the folder will be empty, so create an empty folder for your project. Then again you should type "anaconda", and this time, rather than clicking Anaconda Navigator, you should open the Anaconda Prompt. Once you click this (let me zoom in for you), all you need is to type cd and paste the path, change the drive if needed, and then type jupyter notebook. Once you type jupyter notebook, a window will appear like this. All you need is to create a new Python Jupyter notebook.

Okay, the first thing we need to do is to import OpenCV, so import cv2 is the command. In order to install OpenCV (at this moment you have installed Anaconda Navigator and Jupyter Notebook so far), pip install opencv-python is the command. All you need is to visit the Anaconda Prompt again, and this time make sure that you run it as administrator.
Once you run it, all you need is to copy this command and run it, and OpenCV will be installed. The second tool we need is import matplotlib.pyplot as plt. In order to install this you need pip install matplotlib. So again, run the Anaconda Prompt as administrator, type this command, and you will be good to go. Now I assume that you have installed these tools, such as OpenCV and matplotlib.

Okay, so let me tell you where to start for the object detection. This is the official GitHub page; I will provide the link in the video's description. Once you visit it, you will find many deep learning architectures with pre-trained models available, and they are based on TensorFlow. Although they are based on TensorFlow, you can download them and use OpenCV to load them; that's what we are going to do right now. The most recent one is MobileNet SSD version 3, which we discussed at the beginning of the video. All you need is to right-click and save the link, or just click it, and it will be downloaded.

Another important point is that there are two things: one is the weights of your deep learning architecture, and the second is your configuration. If you click on the configuration you will be here. As you can see over here, the configuration tells you what the input size is, and a lot of other information is available, so you need to load it as well. So just download the zip. For example, let me download it for you guys: once I press download, it will be downloaded with a name like this. In my case I have already downloaded it and copied it into my folder. This is the graph of the deep learning architecture which I have recently downloaded. Once you extract the folder you
will find a lot of files, and the most important one is the frozen inference graph. All you need is to copy and paste it wherever you are going to use it. This is my project folder, where I just created this untitled new Python notebook, so I have placed it over here. You can place it in any folder where you are working, in the project folder. Similarly, if you extract the graph configuration, you just need to copy this configuration file and paste it wherever you are working.

Now I have these two candidates inside the folder where I'm working, so I can load them. Let me give them names, for example my configuration file is this one. Secondly, if you have them in any other folder, you don't need to copy them; you can just provide the path over here. In my case I have copied them into my main folder.

Okay, let me load the model. The model is equal to cv2.dnn, and if I press Tab I get multiple options. The most important one is the DNN detection model. That is one method; the second method is, for example, if you type cv2.dnn. and press Tab, you can read from TensorFlow as well. I will follow this method, not the read-from-TensorFlow one, but we do have another method as well. In this case I give the frozen model as the first candidate, and the second candidate is the config file. So now I have loaded the pre-trained TensorFlow model into memory: this is my model and this is my configuration file.

Once I have loaded my model, another important thing we need is the labels from Darknet. I will also provide the link to these Darknet COCO names. All you need is to copy all of them. So there are 80 classes.
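The model-loading step described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the exact notebook code: `frozen_inference_graph.pb` assumes the default name of the extracted frozen graph, `ssd_mobilenet_v3.pbtxt` is a hypothetical placeholder because the captions never name the config file, and the helpers `missing_files` and `load_detector` are made up for illustration. `cv2.dnn_DetectionModel` is the OpenCV class the captions call the DNN detection model.

```python
from pathlib import Path

# Assumed file names: adjust them to whatever you actually downloaded.
FROZEN_MODEL = "frozen_inference_graph.pb"   # weights (frozen TensorFlow graph)
CONFIG_FILE = "ssd_mobilenet_v3.pbtxt"       # hypothetical name for the graph config


def missing_files(*paths):
    """Return the paths that do not exist, so a wrong path fails early and clearly."""
    return [str(p) for p in paths if not Path(p).is_file()]


def load_detector(frozen_model=FROZEN_MODEL, config_file=CONFIG_FILE):
    """Load the pre-trained TensorFlow detector through OpenCV's dnn module."""
    missing = missing_files(frozen_model, config_file)
    if missing:
        raise FileNotFoundError(f"Model files not found: {missing}")

    import cv2  # imported here so the file check above works without OpenCV

    # Weights first, config second, in the order given in the captions.
    return cv2.dnn_DetectionModel(frozen_model, config_file)
```

Calling `load_detector()` from the project folder should return the model object; if either file is in another folder, pass its full path instead of copying it.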
As I told you before, in the COCO dataset there are 80 classes, and these are the labels. I already have the labels, so let me show you what they are. I've just copied them into a text file, so all you need is to create a text file like this and copy all of the labels into it.

Now I assume that you have already copied the labels. Why do we need labels? In order to check whether our answer is correct or not; that is why we need labels. So let me write the code for loading them. I have copied label.txt into my folder; this is my label.txt containing all 80 classes. I can just execute it like this. I could also do it using append, but over here I'm not using append; I'm just reading the labels and creating a Python list. Let me print it for you: if I print the class labels that have been loaded, you see we have a list of all 80 classes, and if I print the length, it is 80. So we have 80 classes, we have successfully loaded the model, and now we are good to go.

Next, let's read an image. The good thing is that we can also convert cells into notes. We can read any image, and the command is cv2.imread. But before reading the image, there is one very important thing we need to consider. As I showed you, if you visit the configuration file you can see that the size is 320 by 320, and there are many other settings as well. So what we need to do over here is, if I load for example this image and let
me show you with plt.imshow... sorry, I think I haven't copied this image. Let me check whether this image exists or not. Ah, okay, this image does not exist, so let me load this image. Make sure to have the image inside your folder. What is happening? It may not be PNG, so let me check whether it's PNG or JPG. You see over here, it's in JPG format, so make sure to specify the right format; the extension should be the same.

If I write it like this, then you see the image over here. Right now it's actually BGR: blue, green, red is the default channel order in OpenCV. If I convert it to RGB, this is the actual image. So these are the things we need to do, like converting BGR to RGB and many other settings. And as I showed you, we have a size of 320 by 320; this is the configuration we have just loaded. So now, for the total configuration, to save time, on the model
you need to do it like this. Once you have loaded the model using this command, another important thing is to set up the configuration. The input size is 320 by 320. Why? Because in this configuration file it is defined as 320 by 320. Another thing is the scale, 255 divided by 2: 255 is the gray level for white, so gray levels mostly run from 0 to 255, and we are doing this for scaling. Another thing is 127.5, which we take as the mean. Why? Because MobileNet takes its input in the range from minus one to one; that is the reason why we do it like this. So you have to take care of the configuration of your architecture. Similarly, we have setInputSwapRB. As you saw over here, we converted BGR to RGB, and we don't want to convert it each time; it should be automatic. When we set swap RB to true, it will be performed automatically.

Now, once I have the model, the easiest thing is that I can just write model.detect; if I press Tab it is completed automatically, because I have already loaded the model. Class index, confidence and bounding box: these are the three outputs. And this is the threshold, based on the accuracy you want: if you want only the more confident detections you can raise it, and if you want more detections you can reduce it. We have fixed it at 50 percent confidence.

If I print my class index, there are two values, 1 and 3. So what are 1 and 3? If you look at the labels over here, the list starts from zero, so I need to subtract one; I will show you that later. But this is one and this is three: 1 is person and 3 is car. So you see we have two labels, one is person and one is car, so it has detected them successfully. If I check, you see
over here it is not flattened, so we need to flatten it. You can see over here that we have nested lists, so we have to flatten before using it.

Okay, so this is the code. I'm setting my font, that is my font size and my font type, and then I need a loop over class index, confidence and boxes in zip. As we have three different variables, we need zip, and I'm flattening the class index and confidence. Why? Because they have double brackets; that is why I flatten them. For the bounding box we are not doing it. cv2.rectangle means draw this box with a BGR color, so the rectangle color will be blue, and this is the thickness. Similarly we have putText: this is my text, and whatever I write over here will be displayed over here. We want to display the class label; we have 80 class labels, as you saw. But the list starts at the zeroth index, so as you can see there is an offset of one, and I need to subtract one here. Similarly the font color is (0, 255, 0) in BGR, so the font color is green and the rectangle color is blue.

If I press Shift+Enter to execute it and then plot my image again, you see I have modified my image: now it has detected the person and the car, and it detected them very efficiently.

Now that we can process one image at a time, it's time to go for the video demo. In order to deploy this code for object detection on video, all you need is to write this code, cv2.VideoCapture, which will load this MP4 video file. It mostly supports MP4; for other formats you need to check. Once you do it you
need to write this code for the case where the file is corrupted or you cannot open the video; these are the commands. Similarly, this is the same as before, the scale of my font and the type of my font. You can use while True, or while 1; you could also loop until the end of the video, but right now, for simplicity, I'm using an infinite loop.

So what is this cap? cv2.VideoCapture is like a file pointer to this video, pointing to the first image of the video. A video is nothing but a combination of images: usually we have around 25 frames per second, and for high-resolution videos like 4K we often have 60 frames per second. So we call cap.read, and then we have an image in frame.

As you can see over here, previously we loaded our model based on our configuration and the frozen graph, and we have all the labels as well, so all the settings were made before this point. Now we write this command, model.detect. I have already executed these commands, which is why I'm not doing it again. In your case, if you haven't executed the previous commands, you need to do it, because the model will not be in memory. I assume that you have already executed all the previous commands. So, model.detect: in this case, rather than an image, I pass the frame, because we are reading the video frame by frame. Then we need to make sure that it is not zero, that is, that there is something detected in the frame; we are double checking over here, and over here as well.

So this is the previous code, but I have added one extra condition, because sometimes it gives you an error, such as a class index beyond the 80 classes. So make sure that the class index is at most 80
because we have only 80 classes; I'm doing this just to avoid crashing on the video. Again, the color of my rectangles is blue and the color of my text is green, in BGR.

Now let me show you the video first. This is the video; as you can see, I recorded it from YouTube. You can record your own video from YouTube or anywhere, and if you have your own video you can process it too. If I execute this command, you see over here that this is the same video I showed you at the start. It is detecting the person, the car and all the others perfectly. We have while 1, which strictly should not be an infinite loop, but anyway it is working perfectly. It is the same code we just ran, and you can download it. You see that when the bus comes it detects the bus, and when the car comes it detects the car.

Now it's time for a webcam demo. For a live webcam demo, it's better that you modify the previous code to use 1. In my case I have only one webcam, so you can change this code based on the number of webcams you have. But the code is dynamic: it tries one and then zero. So all you need is to change this; the rest is just my added instruction. The most important part is that you need to change it to 1. Once you change it to 1, the rest of the code is the same. If you execute it, you see it can successfully detect the person, and if I move, it can detect that as well, so it's a real-time demo. In your case I assume you might have other objects as well, so you can try it with multiple objects using
your webcam, and you can deploy a real-time webcam demo. Thank you so much, and thank you for watching this video.
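The label-loading step from the captions, together with the offset of one between the detector's class index and the Python list, can be sketched like this. It is a minimal sketch: the function names `load_class_labels` and `label_for` are illustrative and not from the video, and the file is assumed to hold one class name per line, as in the Darknet COCO names list.

```python
def load_class_labels(path):
    """Read one class name per line and return them as a Python list."""
    with open(path, "rt") as f:
        return f.read().rstrip("\n").split("\n")


def label_for(class_id, class_labels):
    """Map a 1-based class index from the detector to its name, or None if out of range."""
    index = class_id - 1              # detector indices start at 1, list indices at 0
    if 0 <= index < len(class_labels):
        return class_labels[index]
    return None
```

With the 80-line label file, `len(load_class_labels("label.txt"))` should be 80, and class indices 1 and 3 map to the first and third names in the file, which the captions report as person and car.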
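The input settings mentioned in the captions (size 320 by 320, scale from 127.5, mean 127.5, swap of red and blue) can be checked with a little arithmetic. The sketch below assumes the mean is subtracted before the scale is applied, which is how OpenCV's `blobFromImage` behaves and what the stated target range of minus one to one implies. The setter names are methods of `cv2.dnn_DetectionModel`; the helper names `normalize` and `configure` are illustrative.

```python
SCALE = 1.0 / 127.5   # multiplier applied after the mean is subtracted
MEAN = 127.5          # subtracted from every channel (255 / 2)


def normalize(pixel):
    """Per-pixel arithmetic assumed here: scale * (pixel - mean)."""
    return SCALE * (pixel - MEAN)


def configure(model):
    """Apply the input settings from the captions to a cv2.dnn_DetectionModel."""
    model.setInputSize(320, 320)             # must match the size in the config file
    model.setInputScale(SCALE)
    model.setInputMean((MEAN, MEAN, MEAN))
    model.setInputSwapRB(True)               # OpenCV reads BGR; swap to RGB automatically
    return model


if __name__ == "__main__":
    # Black, mid-gray and white map to roughly -1, 0 and 1.
    for p in (0, 127.5, 255):
        print(p, "->", round(normalize(p), 6))
```

Because the swap is set on the model, the manual BGR to RGB conversion is only needed when displaying an image with matplotlib, not before detection.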
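The detection, drawing and video loop described in the captions can be put together roughly as follows. This is a sketch under assumptions rather than the notebook code: `model` is taken to be an already configured `cv2.dnn_DetectionModel`, `class_labels` the list of 80 names, and the function names, text offsets, window title and the `q` key to quit are illustrative choices that the captions do not specify.

```python
def valid_detections(class_ids, confidences, boxes, num_labels):
    """Yield (class_id, confidence, box) for 1-based class indices that have a label."""
    for class_id, conf, box in zip(class_ids, confidences, boxes):
        if 1 <= class_id <= num_labels:
            yield int(class_id), float(conf), tuple(int(v) for v in box)


def annotate(frame, model, class_labels, conf_threshold=0.5):
    """Run the detector on one BGR frame and draw boxes and labels in place."""
    class_ids, confidences, boxes = model.detect(frame, confThreshold=conf_threshold)
    if len(class_ids) == 0:                  # nothing detected in this frame
        return frame

    import cv2  # lazy import: only needed once there is something to draw

    hits = valid_detections(class_ids.flatten(), confidences.flatten(),
                            boxes, len(class_labels))
    for class_id, _conf, (x, y, w, h) in hits:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)      # blue in BGR
        cv2.putText(frame, class_labels[class_id - 1], (x + 10, y + 40),
                    cv2.FONT_HERSHEY_PLAIN, 3, (0, 255, 0), 3)            # green in BGR
    return frame


def run(source, model, class_labels):
    """Annotate frames from a video file path or webcam index until 'q' or end of stream."""
    import cv2

    cap = cv2.VideoCapture(source)
    if not cap.isOpened():
        raise IOError(f"Cannot open video source: {source!r}")
    try:
        while True:
            ok, frame = cap.read()
            if not ok:                       # end of file or camera failure
                break
            cv2.imshow("Object Detection", annotate(frame, model, class_labels))
            if cv2.waitKey(2) & 0xFF == ord("q"):
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()
```

Passing a file path to `run` gives the video demo; passing a webcam index such as 1 or 0 gives the live demo. Breaking out of the loop when `cap.read` fails avoids the crash an unconditional infinite loop would hit at the end of a file.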
Info
Channel: DeepLearning_by_PhDScholar
Views: 212,531
Keywords: object detection, python, opencv, deep learning, image classification
Id: RFqvTmEFtOE
Length: 29min 29sec (1769 seconds)
Published: Fri Oct 16 2020