Real Time (24-FPS) Object Detection using Nvidia's Jetson Nano

Hey everyone, welcome to my channel. In this video we are going to learn real-time object detection using the Jetson Nano. The good thing is that the Jetson Nano is now only $59 for the 2 GB version, so you can get up and running very quickly. All of this is done in real time, which is amazing.

What you see here is the actual live feed from my webcam. It is a little blurry because I am pointing the webcam at a screen. Let me show you some other examples. Here we are detecting a person, using images of Elon Musk; the detection is very good and quite fast, at about 20-something frames per second. We have other categories like dog, and we can detect these images very easily. Again, my feed is a little blurry because I am pointing it at my screen; with live images or actual scenes the results will be even better. Here we are detecting cats, and then we have sofas. The sofas are not detected as well, but here we have a good scene with some good texture, and it is detecting very well. The interesting part is that it is trying to detect these books as well, which is amazing, because the image quality is really bad right now and it is still able to grasp a lot of that information. We will be doing all of this in real time, and we will learn how to do it. This one is interesting: it is a couch, but the person is trying to make it a bed, and it is being detected as a bed and a couch at the same time, which is quite funny because both are true.

As you can see, we have only a few lines of code; this is what we will write. But there is a twist
to the story: we have a back end. We have created this object detection module that helps us run this program, and it is exactly 50 lines of code. These lines are not difficult to understand. The best thing is that we will be using an OpenCV image, so whatever you have learned with OpenCV you can apply to these images as well. It is not a separate framework that you cannot use elsewhere; it is something you can integrate with any of your projects.

I have created a whole ecosystem with my new Jetson Nano premium course, where we create different modules and link them together. In the same way, we are going to create a module here that we can link with any project, whether that is a self-driving car, eye tracking, face recognition, or anything else. In that course we have done three different projects: face recognition, a self-driving car with lane detection, and eye tracking for faces. We can integrate this neural network directly into that ecosystem without any issues. The best part is that you can swap out one piece: for example, if you were using Haar cascades earlier to detect objects, you can simply remove that module and put this one in its place. This way we get a platform that is very easy to work with. If you want to learn more, check out my new course through the link in the description below.

Without further ado, let's get started. This whole project is based on the jetson-inference GitHub repository provided by dusty-nv, who is an NVIDIA Jetson Nano developer. Thanks to him and to the NVIDIA team for providing all of this code and this SDK that we can use to
build our own projects. You can read all about it over here; I will add this link in the source.

We will go through the installation steps and then look at how to use object detection. We will have three different files, as you can see on the left-hand side. The first is a bare minimum test: the absolute minimum code required to run our object detection. Then we will create a module that we can use in other projects, so we don't have to rewrite the code again and again. The last one is our test code, which shows how to use this module in a different project; in fact, I should call it a project test.

First of all we will go to the MobileNet SSD module file, where I have written down the steps that are required to run this. We will go to our terminal, and I will split the screen so that we can view both at once. Now we will go step by step and run all our commands. The first command updates anything that is missing; it might take a while if you haven't updated recently. In the second step we make sure that we have CMake and NumPy. I already have these, so it is not going to do much for me, but if you haven't installed them it will install them for you. The third step is to clone the repository that has all this interesting code. Then we go into the jetson-inference folder, create a folder called build, and then we will go
inside that build folder and run cmake to configure the build. Now it will ask us which models we want to download. For this example we are going to skip all of the models except SSD-Mobilenet-v2; you can download the rest, or all of them, if you wish. You select and deselect by pressing the spacebar. Here I will deselect all of these except SSD-Mobilenet-v2, which is 68 MB. You can open this tool again later, so don't worry; in fact, I am going to add that to the code as well. There are just two lines you have to write, and it will open up again so you can download any of these models. So here we have selected only model number 14, and it is going to download that model now. What I found is that even though my internet is not that bad, it still takes a long time to download these files. It is probably something with their server that does not allow high speeds; for me, at least, it takes a long time.

Next we have the option to install PyTorch, which is used for retraining models with transfer learning. Since we are not going to do that, we will skip it. As you can see, there is no asterisk here, which means we are not installing it, so we can simply press Enter. Now we will run the make command, then the make install command, and then the last command, which should finish our setup. There we go: the setup is complete, and we can start coding.

We are going to close this terminal and open Visual Studio Code. If you are not familiar with how to install Visual Studio Code, you can check their website, and I have also explained in my course how you can
install this.

Now we are going to look at the bare minimum code required to run our object detection. The first thing we do is import jetson.inference, and once that is done we import the utilities, jetson.utils. These two imports are required in pretty much any of these projects. Did I write the spelling right? No, of course I wrote it wrong; it is utils. Then we import cv2. You could use just jetson-inference to run your complete code, but then we would not be able to use any of the OpenCV functionality that we are familiar with. So we are going to convert our image to a format that OpenCV can understand, and then we can work with that as well.

The first thing we have to do is create a network. We will call it net, and we call jetson.inference.detectNet, and inside that we write our model: ssd-mobilenet-v2. The spelling has to be exactly the same. Where can you find it? Open your files, open the jetson-inference folder, then data, then networks, and there you can see SSD-Mobilenet-v2.
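A minimal sketch of the network setup described above, written so that it can be read on a machine without the SDK. The helper names available_models and create_network are hypothetical and not from the video; the folder passed to available_models is whatever data/networks directory exists in your own clone, and the folder names it lists may differ in capitalization from the string passed to detectNet. The 0.5 threshold is the value the video settles on.

```python
import os


def available_models(networks_dir):
    """Return the model folder names found under a data/networks directory."""
    if not os.path.isdir(networks_dir):
        return []
    return sorted(
        name
        for name in os.listdir(networks_dir)
        if os.path.isdir(os.path.join(networks_dir, name))
    )


def create_network(model_name="ssd-mobilenet-v2", threshold=0.5):
    """Load the detection network (requires a Jetson with jetson-inference installed)."""
    # Imported lazily so the helper above stays usable without the SDK.
    import jetson.inference

    return jetson.inference.detectNet(model_name, threshold=threshold)
```

On the Jetson, calling create_network() once at start-up gives the net object used in the rest of the walkthrough; the first call can take several minutes, as the video notes.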
These are the files required to run the model, and these are all the labels, which you can read as well. There are a total of 91 classes that you can detect. So this is where the model comes from, and the spelling has to be exactly the same. Then we have to write the threshold, which is 0.5 for now; we can change it later if we want.

Next we create our webcam. We write cap equals cv2.VideoCapture with the ID number zero, which opens the first camera that we have. By the way, you can also use the Raspberry Pi camera v2; I have explained in my course how you can integrate it. For now we are going to use a simple webcam, and you can connect any webcam you want. Here we set the size of the webcam image to 640x480.

Then we create our while loop. We write while True, and we get success and the image from cap.read, which gives us our image. We show the image with cv2.imshow, and we write cv2.waitKey to give it a delay of 1 millisecond. This is good enough for now, so let me run it and see if it opens. I can right-click and choose Run Python File in Terminal, or I can open a terminal and type the name of the file. When it runs for the first time it takes a while, maybe two or three minutes, or even five, but after that it should be fine. Now the initialization is done, and it took quite a while, but unfortunately the spelling is wrong again, so we need to fix capture. But
don't worry, it will not take that long again; it only does that one time. There will still be a small delay, but not as big as the first one. Here we can see that the webcam is now displaying the image. You can see my keyboard, and it is smooth, with no issues, even though I am recording the screen at the same time. We can press the trash button to kill the terminal.

Now we will move on to the interesting part, which is running our object detection. We write detections equals net.Detect, and the detections will be stored here. The image that comes from OpenCV is based on the NumPy library, and the net.Detect method does not recognize that, so we have to convert it into its own format, which is the CUDA format. We write that our CUDA image equals jetson.utils.cudaFromNumpy, and inside that we pass our image. This converts the image to a CUDA image, and then we can send that CUDA image to net.Detect. If you write just this, it will run the inference and it should show you the output. Let's try it out and see what happens. I think I wrote the spelling wrong again; it should be utils. Let's run it again. It should have detected and overlaid the results, but nothing is overlaid on our image, and I will tell you why. It is running the model in the back end, but it is not displaying anything. So let me show you how we can display
that on our image. There are two ways. Right now we are displaying our original image, and the detected objects are drawn on the CUDA image. So one way is to convert the CUDA image back to NumPy format. The other way is to take the information from detections and overlay it on our OpenCV image ourselves, which gives us more flexibility over what we display and what we leave out. I prefer the second method, where we write our own OpenCV code.

Let's try the first method first, and then I will show you the second one, because the first is a bit easier even though it is less flexible. We write jetson.utils (this time I spelled it right) and then cudaToNumpy. Earlier we used cudaFromNumpy; this time it is cudaToNumpy, and we pass in our CUDA image. This way the image we display at the end is the processed image, not the original, so we will be able to see all the detections. Let's try it out, and there you go. Now you can see the confidence level as a percentage and the detection type as well. Here we have the Elon Musk images; let's see if we have one with two people. Yes, here we have two people and it is still detecting them. Then we can try cats: there you go, you have cats. Then you have the couch that we saw earlier, so it is able to differentiate between a couch and sofa, which is good. What about dogs? It thinks this one is a cat, but if we zoom in it thinks it is a dog. So this is
basically how you can detect objects easily with the default parameters. The problem is that you cannot do much with this. Even if you want to apply more processing to this image, you won't be able to, because the drawing is done within the SDK and it just gives us the output. I will keep this line here in case you want to use it.

The other method uses NumPy and OpenCV. Here we write for d in detections, and we take the information out of each detection. We have to do that anyway, because if we want to use it in any project we need the x and y values or whatever else we need. First of all, let's look at what exactly we get when we detect something, so we print the detection d for every detection. Let's run that. It will not show any overlay because we removed that part, but it is printing exactly what it detects. I want to copy this output, so I stop the program, copy this part, and paste it up here as a reference.

These are the parameters within this object: class ID, confidence, left, top, right, bottom, width, height, area, and center. We can use all of them. One bad thing is that the values are floating-point numbers, whereas pixel coordinates should be integers, so the first thing we will do is convert them to integers. The second thing that is a little bad is that it doesn't give us the name of the detected type; it only gives us the ID. So we will have to find the corresponding name of the class.
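To make the float-to-integer step concrete, here is a small sketch that uses a stand-in object in place of a real detection. The attribute names (Left, Top, Right, Bottom, Center) follow the list above; the helper names box_as_ints and center_as_ints and all the numbers are made up for illustration.

```python
from types import SimpleNamespace


def box_as_ints(d):
    """Return the bounding box corners (x1, y1, x2, y2) as integer pixels."""
    return int(d.Left), int(d.Top), int(d.Right), int(d.Bottom)


def center_as_ints(d):
    """Return the box center (cx, cy) as integer pixels."""
    return int(d.Center[0]), int(d.Center[1])


# Stand-in for one detection; the values are invented, not real model output.
fake_detection = SimpleNamespace(
    ClassID=1,
    Confidence=0.87,
    Left=102.6,
    Top=48.2,
    Right=330.9,
    Bottom=410.5,
    Center=(216.75, 229.35),
)

print(box_as_ints(fake_detection))     # (102, 48, 330, 410)
print(center_as_ints(fake_detection))  # (216, 229)
```

Note that int() truncates toward zero rather than rounding, which is usually fine for drawing boxes.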
So, that being said, let's see how we can do this. First we go into each individual object and get the values x1, y1, x2, and y2, which define the bounding box around our object. We convert each of them to an integer. To get any of these values, say the left one, I just write d.Left. The spelling and the capitalization have to match: L is capital and the rest is lowercase. I write Left, copy the line three times, and change the copies to Top, Right, and Bottom. This gives us x1, y1, x2, and y2.

The second thing we need is the class name. The good thing is that the net already gives us a method to find it, so we write net.GetClassDesc and pass in the ID number, d.ClassID, in the same format that is given here. Now we have the bounding box and the class name, and all we have to do is put them on our image.

We write cv2.rectangle, we put it on our image, the first point is x1 and y1, the second point is x2 and y2, and then we need the color; let's say purple. I always prefer purple for object detection; I don't know why. We give it a thickness of two. Then we put our text, which is the class name: we write cv2.putText with our image and the class name. It is already a string, so we don't have to worry about any conversions, and
then we are going to put the name a little bit down, or you can say inside the box. For x we use x1 plus 5, since the box is not that big, and for y we use y1 plus 15, so the text sits a little lower. Then we choose the font, let's say Hershey Duplex, then the scale, then the color (255 again, for the same purple), and then the thickness, which we set to two. Let's try it out and see if it works. We can remove the print of d for now. Let's run it, and there you go: now we are drawing on our OpenCV image, and we can plot whatever we want. Here you can see we are detecting the dog, the cats, the couch, and the books, and we are doing all of this within OpenCV. This is the bare minimum code required to run object detection.

The next thing we will do is convert this into a module, so that we can use this code in many different applications or projects without rewriting it. I was thinking about whether to do this with functional programming or object-oriented programming. I was leaning towards functional, but when I thought more about it, I think it is better to do it with object-oriented programming. So we are going to create a class, use it to create an object, and then run our inference.

How do we do that? We go to our bare minimum code, copy all of it, and paste it into our module. We are not going to use this part, so we can remove it or keep it; it's up to you. The first thing we have to do when creating a module is to write the if statement: if __name__ == "__main__". Then we are going to
run this script or this module. If we are running the module by itself, this block will run; otherwise we have a class or a function that other code can call. That is the main idea. Within this block we call a main function, and inside main we put our while loop, so let's cut all of this and put it in main. Actually, all of the setup also needs to be inside main. So now everything is in main; we will remove and add things there later.

The next step is to create our class. Let's call it the MobileNet SSD class. We initialize it first with __init__, and then we write self. If you are not familiar with object-oriented programming in Python, you always write self first when you are creating a new method, and then whatever input arguments you need. We need the path, which is this model name, and the threshold value, so we write path and threshold. Once that is done, we define self.path equals path and self.threshold equals threshold. This means that when we create an object, that object has its own path. We can create 10 different objects and they can have 10 different paths. This is an instance variable: self.path is the path for that particular object, not a general path. So we assign the path that was passed in to that object's path, and the threshold given by the user to that object's threshold. Then we need to define
the network, so we cut it from here and paste it here. We write self.net equals; basically, if you are confused, just keep writing self in front of any variable to make it an instance variable. Then we write jetson.inference.detectNet, and instead of the model name we write self.path, and for the threshold we write self.threshold. This is good for now.

The next thing we need is a method to detect. We write def detect, and inside that we again write self, and then we need an image. After the image we add a parameter called display, a flag for whether you want to draw the results or not. We keep it False by default.

Inside detect, once we have the image, we first need to convert it into a CUDA image, so we remove that line from main and paste it here. Then we need to find our detections, so we remove that line from main and paste it here too. Here we are not using net, we are using self.net, and we pass in our CUDA image. One more thing: right now, when you run this, the SDK automatically overlays its own boxes. We don't want that, because we are drawing them ourselves with OpenCV, so we set overlay to OVERLAY_NONE. I removed the CUDA image by mistake, so we need to put it back. Now that we have our detections, we can loop
through those. But at the end of the day we are not doing this just to display; we are doing it to find out where exactly each object is, so that we can use that in different scenarios: for example, follow the object, move forward, or move backward, when we are telling a robot what to do. So we need the x and y position and all the other information. We will create a new list called objects, put all the required information into it, and return this objects list. That is our main goal. While we are at it, we may also draw the bounding boxes on the image if we want.

Here we write the for loop, so we cut it from main and paste it here. We loop over all the detections, and then we have to put them into objects. The reason we create a new list is that the list we get from detections does not have the class name, and we want the class name as well. So first we find the class name, and then we write objects.append, and inside append we put the class name and then the detection itself. That second element has everything we saw earlier: left, right, top, confidence, ID, area, center. Everything is available in it except the name, which is why we add the name separately. Now we are ready to return objects.

If we want to display, we can add the option here: if display is true (this is the flag that we got from the arguments), if that is
true, then we do all of the drawing; otherwise we don't need to. We can put a lot of different things here. One special thing we can add is the frame rate, to see what speed we are getting. Let's do that first: we copy this line and paste it here. Or should we test first? Let's test before we add more code, and see whether what we have done so far actually works.

You can see that the main function is now almost empty compared to what it was before. All we have to do is create an object, so we write myModel equals mnSSD, and inside that we give our path and our threshold. What were they? We go back and copy them, because we are lazy, and paste them here. This is our path and this is our threshold; we don't need to write threshold equals in front of it, and we can remove one bracket. Now our model is created, and all we have to do is write objects equals myModel.detect and pass in our image. It will run the detection for us, and if we want to display we can pass True as well.

Let's run this and see if it works. There is a syntax error at detect: I didn't put the colon, so let's add the colon and run it again. Now net is not defined. Of course there will be some mistakes here and there. I thought it might be undefined in a lot of places, but no, only here, so we need to write self.net. As I mentioned before, inside our class we have to use self.net. Did I make that mistake anywhere else? It doesn't seem so. Let's run it again, and there you go: our module is working. Now we can use it in different projects, but before we end this, let's add a
little bit of awesomeness to the detection part. Again, we remove this, and instead of the class name we write an f-string, and inside it we write FPS followed by the value of our FPS. How do we get that? The net has a method called GetNetworkFPS, so we write self.net.GetNetworkFPS (I think FPS is all capitals), and we convert it to an integer so that it has no decimal places. We give the position as 30, 30, the scale as one, and we change the color to, let's say, blue. I think that should be fine, so let's run this. Now we can see the FPS: we are getting around 20 FPS, which is very decent; it's almost real time.

Now we can add some more zing to it. We will draw two lines; I like to present it this way. We write cv2.line, first the image, and then a line starting from x1 at the height cy. Did we define cy? No, we didn't, so let's define it. cx and cy are the center values, so we can write int of d.Center at index zero for cx and int of d.Center at index one for cy. If you are wondering why I am using zero and one: if you go back to the printed detection, you can see that the center has two values, and we extract them this way. Then we write x2 and then y2; that should be good. Then we use the same purple color, and let's
write thickness as one. Then we can copy this line. The first will be the horizontal line, and the copy will be the vertical line. For the vertical line cx is fixed, and what moves is the y value, so here we write y1 and here we write y2. And here I have made a mistake: in the horizontal line it should be cy. So in the first line cy is fixed, in the second cx is fixed, and the other coordinate is the moving value. I think this should be fine.

We can also draw a circle in the middle. We write cv2.circle, and inside that our image, then cx and cy, then the radius, let's say 5, then the color, 255, 0, 255, and then the thickness, for which we write cv2.FILLED because we want it filled. Let's run it and see what happens. There you go: now we have the center point and these lines as well, which look pretty good, and we have the FPS at the top.

If you want to remove all of this, because it may add a bit of computational overhead, you can go to the place where you called the method and pass False, or pass nothing, because by default it will not display; we set the default to False. If you want to read the values, you can print them. For example, let's say we want to print only the first object, and only its name, so we index the objects list at zero and then at zero again. That prints the name of the first object as text. So now we have all the information we need, and we can use it in different projects. Let's run this and see what it prints. Okay, so you
can see that it gave us an error because the list index is out of range, which means it didn't detect anything. So what we can do is write here: if the length of objects is not equal to zero, then print, otherwise don't print. So this error should not come up again. There you go: now, even though we are not displaying anything, it is detecting the person. If I go to, let's say, the dog, you can see it's detecting the dog, if I go to the cat, it's detecting the cat, and if I go to the couch, it's detecting the couch. And of course you can play around with these values. If I block my camera, you can see nothing is detected, but it still doesn't give an error, because we added that statement.
So this is basically how you can create a module. Now you might be thinking: okay, we have done the module, but how can I use it in a different project? The idea is very simple. Let's say you have created your project and the code is in this project test, and you have written lots of code here. Now you want to add this functionality of object detection. So you go to your object detection module, and from the main you copy all of this code. This is like sample code that is given if you run the module itself. You copy all of this, go to your project, and paste it here.
But now the issue is that it will not recognize this mnSSD. So how can you run this? First of all you have to import cv2 here, and then you have to import the MobileNet module. This is the name of your module, so you have to write this name: import the MobileNet SSD module, and because it's a very long name we will shorten it and call it mnSSDM. So this is our MobileNet module. We can copy that, and wherever we are declaring our object we have to put this first, and then a dot. That's pretty much it. You can run this now and it should run without an issue. So it will run
exactly the same way. The only thing you have to make sure of is that this MobileNet SSD module is in the same folder as the project, otherwise you will have to give it an external reference here, but we are not going to do that. So let's run it here and see if it works. Actually, before we run it, let's run it with the detection as True, because we want to display the detections. Let's run it again, and there you go: now you can see we are using it in a different project. We have imported the functionality, we have created our own object, and now we can detect all the objects that are within the classes of our model. All of this has been detected properly. Again, it's a little bit blurry, but if you zoom in it will be able to detect these as well.
So this is how easily you can use object detection. Now that we have created this module, all you have to do is write this line of code, and then the second line of code to detect the objects. That's it: two lines of code and you have real-time object detection. To me that is amazing. And all of this is done on the Jetson Nano, which is a very cheap computer that is powerful enough to do all of these AI projects, so I highly recommend that you check it out. If you are doing this on a Raspberry Pi, you will need some external GPU to get to this point, but with the Jetson Nano you can do it directly on the board, and that is why it is very good to use. It is now available for only $59 with the 2GB model, as opposed to $99 for the 4GB model, and you can run these projects on both the 4GB and the 2GB versions.
Do check out my course on the Jetson Nano, which dives deep into how to create modules, how to create different projects around this ecosystem of modules, and how easy it can be to build different projects and prototypes in a very short time. So this is it for today. I hope you have learned something new. If
you like the video, give it a thumbs up. Don't forget to subscribe, and share it with your friends if you find it useful. I will see you in the next one.
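The crosshair step described in the video (a horizontal line with cy fixed, a vertical line with cx fixed, and a filled circle at the center) can be sketched roughly as follows. This is only an illustration, not the code typed in the video: the function names `crosshair_points` and `draw_crosshair` are hypothetical, and the bounding box is assumed to be stored as (x, y, w, h).

```python
def crosshair_points(box):
    """Return the center and the two crosshair segments for a bounding box.

    Assumption: box is (x, y, w, h) in pixels. Adjust the unpacking if the
    module stores boxes in a different layout.
    """
    x, y, w, h = box
    x1, y1, x2, y2 = x, y, x + w, y + h
    cx, cy = x + w // 2, y + h // 2
    horizontal = ((x1, cy), (x2, cy))  # cy is fixed, x is the moving value
    vertical = ((cx, y1), (cx, y2))    # cx is fixed, y is the moving value
    return (cx, cy), horizontal, vertical


def draw_crosshair(img, box, color=(255, 0, 255)):
    """Draw both lines and a filled center dot on img (requires OpenCV)."""
    import cv2  # imported here so the geometry helper runs without OpenCV

    center, horizontal, vertical = crosshair_points(box)
    cv2.line(img, horizontal[0], horizontal[1], color, 1)  # thickness 1
    cv2.line(img, vertical[0], vertical[1], color, 1)
    cv2.circle(img, center, 5, color, cv2.FILLED)  # radius 5, filled
    return img
```

Keeping the geometry separate from the drawing makes it easy to check the points without a camera or a frame; `draw_crosshair` would be called once per detected box inside the frame loop.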
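The "list index out of range" error mentioned in the video comes from indexing the detections when nothing was detected. A minimal sketch of the guard is shown below. The helper name `first_object_name` is hypothetical, and it assumes each entry in `objects` keeps the class name as its first element, which may differ from the layout used in the actual module.

```python
def first_object_name(objects):
    """Return the name of the first detection, or None if the list is empty.

    Assumption: each entry looks like [name, ...], with the class name first.
    """
    if len(objects) != 0:
        return objects[0][0]
    return None


# Example: an empty frame no longer raises IndexError.
for detections in ([], [["person", (10, 20, 30, 40)]]):
    name = first_object_name(detections)
    if name is not None:
        print(name)
```

Returning None instead of printing directly lets the calling project decide what to do when the camera is blocked or nothing is in view.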
Info
Channel: Murtaza's Workshop - Robotics and AI
Views: 33,836
Keywords: object detection opencv python, object detection opencv, object detection python, fast object detection, opencv python, mobilenet ssd, opencv mobilenet ssd, ssd object detector, object detector mobilenet ssd, ssd mobilenet, deep learning opencv, dnn cv2, cv2 object detection, computer vision object detection, real time object detection, object detection 59 dollar, object detection jetson nano, jetson nano mobilenet ssd, mobile net ssd opencv, opencv jetson nano
Id: mB025B7KpeE
Length: 52min 8sec (3128 seconds)
Published: Fri Mar 19 2021