TensorFlow Object Detection COMPLETE TUTORIAL | 40 TensorFlow Object Detection Models

Video Statistics and Information

Captions
Hello everyone, and welcome back to the channel. In today's video we will learn how to run 40 object detection models using TensorFlow. These models include the famous SSD MobileNet and Faster R-CNN, as well as the recent state-of-the-art EfficientDet. We will write generic code capable of running object detection on Windows and Linux alike. We will start from a clean slate: I will guide you through the installation of TensorFlow GPU along with supported versions of CUDA and cuDNN, and in the later half of the video we will code our detector class for object detection on images, videos, and webcam. You can find the timestamps in the description below. Let's get started.

The first thing we need is the Anaconda Python distribution. I am on Windows, so I will use the Windows installer, but if you are on Linux, use the respective distribution. On Windows we also need the Visual C++ redistributables; download and install the latest version available. Now open the Anaconda prompt as administrator, because we will be installing a bunch of stuff. Start by creating a virtual environment called tf_gpu with Python 3.9, because it is supported by the latest version of TensorFlow. Hit y when prompted. Once Anaconda is done creating the virtual environment, activate it with conda activate tf_gpu. We can see that the base environment has changed to our virtual environment, so let's clear the screen. To install the latest version of TensorFlow GPU, first head over to the official TensorFlow webpage; if we scroll down, we can see which versions of CUDA and cuDNN are required for each version of TensorFlow. As we want the latest version, we need CUDA 11.2 and cuDNN 8.1, installed with conda install cudatoolkit=11.2 cudnn=8.1 -c conda-forge. Once that is done, it is time to install TensorFlow itself. Version 2.6 is the latest at the moment, so run pip install tensorflow-gpu==2.6. After the installation finishes, let's check that our GPU is detected: run python, import tensorflow, and call tf.test.is_gpu_available(). Everything looks great, but it warns that this method will be deprecated, so let's use the newer method as well; it also detects the GPU. Finally, we need OpenCV for loading images and videos: pip install opencv-python. That's it, our development environment is all set.

We will be running the models available in the TensorFlow object detection model zoo, including all versions of EfficientDet, SSD MobileNet, and Faster R-CNN. These models are trained on the COCO object detection dataset, and there is always a trade-off when selecting a particular model: some models have lower inference time, so they are suitable for running on CPUs or mobile devices, but they have lower accuracy; others have higher inference time but are more accurate. The choice of model really depends on our use case.

Let me show you what I have got here: a test folder with a bunch of images and videos that we will use for object detection, and a file called coco.names, which is available officially with the COCO dataset (I will put the link in the description). This file contains the names of all the classes on which these object detection models are trained. There is one edit we need to make: the first class label in this file should be background, so let's add it and save the file.

Now create an empty Python file called detector.py. VS Code asks us to select a Python environment, so pick the tf_gpu environment we created. In detector.py, import cv2, time, os, and tensorflow as tf; also import numpy as np, and from tensorflow.python.keras.utils.data_utils import get_file. Fix NumPy's random seed to, say, 123 so that we get reproducible results across runs. Now we start implementing the Detector class. Define an __init__ method that is empty for now, so just pass. Next we need to read the class names file: define a readClasses method that takes the class file path as an argument, opens the file in read mode, splits all the lines, and puts the text in a list we call self.classesList (remember, any variable with the self prefix is available anywhere within the class). We also need a unique color for every class label, so we randomly initialize color values using np.random.uniform with a lowest value of 0 and a highest value of 255; the size of this array is the length of our class list by three channels. Let's print the lengths of the class list and the color list to check that both are equal.

Now we need a driver program, so create another empty Python file called run_me.py and import everything from the detector module. Instantiate the Detector class as an object called detector and call its readClasses method; it takes the classes file path as a parameter, so define that as coco.names. Run the code, and here we have 92 classes and 92 colors.

Next we download a model from the model zoo. Suppose we want to run SSD MobileNet v2: copy its link and save it in a variable called modelURL. Back in detector.py, define a downloadModel method that takes the model URL as an argument. From the URL we need to extract the file name, which is the last part with the .tar.gz extension; os.path.basename gives us that. From the file name we also want to extract the model name, which is the part without the extensions; we get it by taking the characters of the file name up to the first dot. Let's print the file name and the model name to see what we did: in the main file, call downloadModel, pass the modelURL, and run the code. We can see the file name and model name extracted successfully; we will use this information in the subsequent code. Now define a cache directory where all the models will be downloaded and stored. I am defining ./pretrained_models, which means the cache directory is created in the same place as detector.py. We create this directory, skipping it if it already exists; run the main file, and we can see the directory is created. Now we use the get_file method to download the object detection model: call get_file and pass the file name we just extracted, then the origin URL stored in modelURL, then the cache_dir, which is self.cacheDir, and then a subdirectory inside the cache dir that I am naming checkpoints. As the model is downloaded as an archive, we ask get_file to extract it. Run the main file: SSD MobileNet is downloaded from the TensorFlow object detection model zoo because it is not present in our cache. In the cache directory there is now a checkpoints directory with the model file downloaded and extracted. If we run the code again, the model is not downloaded a second time, as it is already in our cache.

Once the model is downloaded, we need to load it with TensorFlow. Define another method called loadModel; print the message "loading model" concatenated with the model name. Clear the Keras session and load the model into a variable called self.model by calling tf.saved_model.load. We need the complete path to the extracted object detection model, so use os.path.join with the cache directory, then checkpoints, then the model name, and finally the saved_model folder; basically, I am listing the directories that contain the saved model. Print a message that the model loaded successfully, call loadModel in the main file, and we can see it is working great so far.

Now it's time to make some predictions on an image. Define a method called predictImage that takes an image path as input, reads the image into a variable called image, shows it with cv2.imshow, and then waits for a key press. The 0 here means we wait indefinitely; if we pass an integer, say 10, cv2 will wait 10 milliseconds for a key press. Finally, destroy all OpenCV windows. Define the image path in the main file and call predictImage with it. Here we have the image; of course we have not performed any object detection yet, so there is no bounding box.

Let's define another method above it that takes the image as input and makes the predictions; we will call it createBoundingBox, and it takes the image as a parameter. This image is loaded by OpenCV in BGR format, so convert it to RGB and call it inputTensor. It is a NumPy array, but we need to convert it to a tensor, which tf.convert_to_tensor does, with data type tf.uint8. TensorFlow takes a batch as input, and this is just one image, so it should be a batch of one; we need to expand its dimensions. Now pass this input tensor to the model to get the detections. The detections are a dictionary, so extract the bounding boxes and convert them to a NumPy array; extract the class predictions, which are basically indices into the class labels, and convert them to an integer array; then extract the confidence scores for each class label and convert them to a NumPy array as well. Next, determine the image height, width, and number of channels; we will use this information in a moment to calculate the locations of the bounding boxes.

If we found any bounding box in the image, we proceed to draw each one. Loop up to the length of the bounding boxes array; at index i, extract one bounding box as a tuple, and for the same index get the class confidence, convert it to a scale of 100, and round it. Against the class index, look up the class label from our classes list, and also get the color we defined for that particular class. The class label and the confidence will be displayed on the bounding box, so use string formatting to build that text: the first part is the class label and the second part is the class confidence. Now unpack the bounding box to get the values on the x and y axes; these are the ymin, xmin, ymax, and xmax values. Let me print them to show you what we have got, with a break to end the loop after the first iteration; call createBoundingBox in predictImage, passing the image as a parameter, and run the main file. Here you see all four values. These are coordinates of the bounding box, but they are relative to the height and width of the image, not absolute locations. We can get the actual pixel locations using the width and height of the image, so remove the break and the print statement: the actual xmin is xmin multiplied by the image width, xmax is xmax multiplied by the image width, and ymin and ymax are multiplied by the image height. Of course this gives float values, so we need to convert them to integers. Now call cv2.rectangle, passing the image, the starting point of the rectangle (xmin, ymin), the ending point (xmax, ymax), the class color we extracted above, and a thickness of one. Return this image, on which we have drawn all the bounding boxes; since the function now returns the image, call the result bboxImage and show it instead of the original image. We can also save this image under the name of our object detection model so that we can compare results later. Run the main file: we have a bounding box for each class in a distinct color, but there is a problem, a lot of overlapping bounding boxes. We can clean up that mess with non-maximum suppression: call tf.image.non_max_suppression and pass it the bounding boxes and class scores, then the maximum number of bounding boxes we want, say 50, then the IoU threshold, which defines how much overlap between bounding boxes will be allowed, set to 50 percent, and then the confidence score threshold of the bounding box, also 50 percent. This gives us the indexes of the bounding boxes that meet the criteria, so the subsequent lookups are based on this new bounding box index, and in the for loop we draw only the bounding boxes returned by the non-max suppression method. Let's also print the bounding box indexes: we can see that only those bounding boxes are drawn which had a confidence score of at least 50 percent and an overlap of less than 50 percent, and those were the bounding boxes 0, 1, 2, 3, and 4.
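The two steps just described — converting relative box coordinates to pixels and filtering overlaps — can be sketched in plain Python. This is a minimal illustration of what the multiply-by-width/height step and tf.image.non_max_suppression do, not the video's actual code; the function names and the (ymin, xmin, ymax, xmax) relative-coordinate layout (matching the model zoo's output format) are assumptions for the sketch.

```python
def to_pixels(box, im_height, im_width):
    """Convert one relative box (ymin, xmin, ymax, xmax) to integer pixel coords."""
    ymin, xmin, ymax, xmax = box
    return (int(xmin * im_width), int(ymin * im_height),
            int(xmax * im_width), int(ymax * im_height))

def iou(a, b):
    """Intersection-over-union of two (ymin, xmin, ymax, xmax) boxes."""
    ymin = max(a[0], b[0]); xmin = max(a[1], b[1])
    ymax = min(a[2], b[2]); xmax = min(a[3], b[3])
    inter = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def non_max_suppression(boxes, scores, max_output=50,
                        iou_threshold=0.5, score_threshold=0.5):
    """Greedy NMS sketch: keep the highest-scoring boxes whose overlap with
    every already-kept box stays at or below iou_threshold.
    Returns kept indices, mirroring tf.image.non_max_suppression semantics."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if scores[i] < score_threshold:
            break  # scores are sorted, so the rest are lower still
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
        if len(keep) == max_output:
            break
    return keep
```

For two near-duplicate boxes and one disjoint box, only the best duplicate and the disjoint box survive, which is the de-cluttering effect seen in the video.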
Let's change the image, and here we have the result. Now let's make the threshold a parameter passed to our methods so that we can control it from the main file: the hard-coded values are replaced, predictImage now takes it as a second parameter, and we create another variable in the main file and pass it in, just for convenience later. Next we put the class label and confidence text on the bounding box as well: call cv2.putText, pass the image and the display text, with the location at xmin and a little above ymin (10 pixels above the bounding box works), define the font and its size, make the color the same as the class color, and use a thickness of two. Here we have the labels and confidence scores as well. If you do not like the colors of the bounding boxes, change the NumPy random seed to get a different set of colors; I am changing it to 20.

I also want to beautify the bounding boxes by making the corners a little bolder, so we will draw a small line at each corner, with the line width proportional to the size of the bounding box in pixels. We get the size of the bounding box along the x axis by subtracting xmin from xmax and multiplying by 0.2 to get 20 percent of that size, then of course converting to an integer; the same goes for the y axis, and once we have both values we take the minimum of the two. Now call cv2.line, pass the image, the start position of the line, which is (xmin, ymin), and the end position, which is (xmin + lineWidth, ymin). Let me put a marker here and run the main code. There is an error, we must have missed something... there it is: the integer conversion should go inside the bracket. There we go, and we can see a small line drawn here. This corner is (xmin, ymin), this one is (xmax, ymin), this one is (xmin, ymax), and the final one is (xmax, ymax). So to draw a vertical line here, I have to draw from (xmin, ymin) to (xmin, ymin + lineWidth): copy the line and change the end point accordingly. Similarly, copy both lines for the second corner, from (xmax, ymin) to (xmax - lineWidth, ymin), and from (xmax, ymin) to (xmax, ymin + lineWidth). Copy all four lines and paste them; I will put a marker here to avoid mistakes and quickly draw the other four lines as well. There we have it: run the code, and we see nice, pleasant bounding boxes. One more thing: I want the text in caps, so convert the class label text to uppercase.

If we run the code now, we see these great-looking bounding boxes, but the backpack is not detected, and the second bike is not detected either, so let's choose a model more accurate than SSD MobileNet. Let's pick the new state-of-the-art EfficientDet family: D5 has just a one-percent improvement, but the inference time is double, so let's go with D4 instead. Copy the link, comment out the old URL so that we can go back to it later, and replace modelURL with the new link. Run the code again; the model is downloaded because it is not in our cache, and here we have the results: the backpack is picked up, and the second bicycle as well. Let's quickly run another model, this one from the Faster R-CNN family; paste the URL and run it. This one does not pick up the bicycle, but its confidence is very high on all the other classes it found. So EfficientDet picks up more objects accurately but with lower confidences, Faster R-CNN picks up fewer objects but with higher confidence, almost one hundred percent, and SSD MobileNet is somewhere in the middle. Interesting results.

All right, now let's run object detection on videos and webcams. Quickly define a method called predictVideo that takes a video path as input and captures the video; if it cannot open the video, print an error and return from the function. Otherwise read a frame and run a while loop for as long as the capture succeeds. You know what, let's also compute FPS here: the start time is 0, the current time is taken from the time module, the FPS is one over the difference between the current time and the start time, and then the start time is set to the current time. Call the createBoundingBox function on the captured frame, which gives us the frame with bounding boxes, and put the FPS on the bounding-box image in green, which is (0, 255, 0). Show this video frame, and define a key press to break the while loop in case we want to quit object detection on the video: a press of q breaks the loop; otherwise capture the next frame, and on quitting, destroy all cv2 windows. Now define another variable called videoPath, use this street1.mp4 video, and call predictVideo with the video path and the threshold value. I want to run SSD MobileNet, so uncomment that URL and comment out all the others. Here we have the results. Interestingly, it gives us just 8 FPS on the GPU, while we previously ran SSD MobileNet on the same video on the CPU at 18 FPS, but that was using OpenCV alone, so maybe TensorFlow object detection is just slower. Finally, if you want to use the webcam, just replace the video path with zero, and that's it. This is all for today's object detection tutorial using nothing but TensorFlow; you can choose any of these 40 models and play with the code. I will see you next time.
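As an aside, the FPS bookkeeping described for predictVideo can be isolated into a small stdlib-only helper. This is a sketch under stated assumptions, not the tutorial's exact code: the FpsCounter name and the injectable clock parameter are inventions for illustration, and it returns 0.0 for the first frame rather than dividing by an uninitialized start time.

```python
import time

class FpsCounter:
    """Tracks frames-per-second between successive frames, as drawn onto
    each frame in the video loop. Hypothetical helper, not the video's code."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock  # injectable for testing
        self._last = None    # time of the previous frame, None before frame 1

    def tick(self):
        """Call once per frame; returns the instantaneous FPS estimate."""
        now = self._clock()
        if self._last is None:
            self._last = now
            return 0.0  # no previous frame to measure against yet
        fps = 1.0 / (now - self._last)
        self._last = now
        return fps
```

In the loop this would be `fps = counter.tick()` followed by drawing the value onto the frame with cv2.putText, exactly where the transcript overlays the FPS in green.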
Info
Channel: TheCodingBug
Views: 2,853
Keywords: object detection, tensorflow object detection, object detection windows 10 tensorflow, tensorflow object detection windows 10, tensorflow object detection linux, object detection tensorflow, object detection using tensorflow, object detection tutorial, tensorflow object detection tutorial, object detection with tensorflow, object detection tensorflow tutorial, object detection using tensorflow python, tensorflow, tensorflow object detection windows
Id: 2yQqg_mXuPQ
Length: 26min 59sec (1619 seconds)
Published: Thu Oct 21 2021