Latest Pose Estimation Realtime (24 FPS) using CPU | Computer Vision | OpenCV Python 2021

Video Statistics and Information

Captions
Hey everyone, welcome to my channel. In this video we will learn pose estimation. We will detect 33 different landmarks on a human body, and all of this will be done in real time: more than 24 frames per second, running only on the CPU. We will first look at the basic code required, and then we will create a module out of it so that we don't have to write the code again and again. We will be creating a lot of projects with this, so don't forget to subscribe and hit that like button. If you'd like to create real-world computer vision apps, do check out my premium course, in which we learn how to create apps such as object detection, augmented reality, a document scanner and many more. The link will be in the description below. So without further ado, let's get started.

Here we are in our PyCharm project. We have named it Pose Estimation Project, and we have a folder here called PoseVideos. If I open it, you can see that we have a total of nine videos. Let me play a few. This one is a little bit smaller, and some of these videos are actually slow motion, so when we run them it will look like the processing is slow, but it is the footage itself that is slow motion. This video is slow motion, I think this one is as well, this one is normal, and this one is slo-mo too. I took these videos, I think, from pexels.com. We are going to test these videos and see how well our pose estimation works.

Let's delete this file and create a new one, and we will call it PoseEstimationMin. This is the bare minimum code that is required to run it, and later on we are going to see how we can create a module out of it so that we don't have to write the code again and again for different projects. So let's
start by importing cv2 and then importing mediapipe as mp. You can see that we get an error, because we did not include these packages in our project. To include them, we go to File, Settings, open our Project Interpreter, write opencv-python and hit install, and then write mediapipe and hit install. Now both packages have been installed, and when we go back you can see the error is gone. OpenCV is the library we will be using for image processing, and MediaPipe is the framework that will give us our pose estimation.

The first thing we will do is read our video. We write cap = cv2.VideoCapture and give it our video path, PoseVideos/1.mp4. I think video one is quite big; let's try it out, and later we can change it if we want. Then we write while True, and success, img = cap.read(), which gives us our image. Then we write cv2.imshow with "Image" and img, and then cv2.waitKey(1) so that we get a one millisecond delay. Let's run this and see if it works. There we have it. Our video went by quite fast, so the frame rate is quite high at this point.

We can check the frame rate using time. We need to import time as well, so we write import time. Then we write current time = time.time(), then fps = 1 / (current time - previous time), and then previous time = current time. We need to define the previous time at the top, so we write previous time = 0. Then we can simply put our text, so we will write
here cv2.putText. We give it our image and then the text itself, str(fps), but we convert fps into an integer before we do that. After that we have our origin, let's say 70 and 50, then cv2.FONT_HERSHEY_PLAIN, then a scale of 3, then for the color (255, 0, 0), and then a thickness of 3. This should give us our frame rate. Let's try that out. Wait, what happened? Okay, this should be below, over here. My bad. Let's run it again, and there we have it: a hundred-something frames per second, which is quite a lot. If you want to reduce it, we can put, let's say, 10 in the wait key. Now it's around 50 or 60 frames per second. But when we use our model, it will automatically slow down, so we don't need to worry about that.

The next step is to create our model, our object, so that we can detect the pose. We write mpPose = mp.solutions.pose, and then we create our object: pose = mpPose.Pose(), and here we can give in our parameters. If we go to the parameters, you can see that we have static_image_mode. This controls when you are detecting and when you are tracking. If you put it as True, it will always run detection; it will always try to find new detections with the model. When you put it as False, it will detect, and when the confidence is high it will keep tracking. So there is a tracking confidence and a detection confidence. Whenever it detects, if the confidence is more than 0.5, it says: okay, now we have detected, now I will go to tracking. Then, as long as the tracking confidence is more than 0.5, it keeps tracking. Whenever it goes below 0.5, it comes back to detection. This way we do not use the heavy model again and again for detection; instead
we use detection, then tracking, and whenever the pose is lost we use detection again. Then we have upper body only, so you can decide if you want to detect only the upper part. You can see here that we have either 33 pose landmarks or only 25, so it's up to you which one you want to use. We also have a feature to smooth, which is True by default, so we will keep it as True. We can define all of these parameters or we can skip them. For now we are going to skip all of them for simplicity.

Then we go down and convert our image. We write imgRGB = cv2.cvtColor. Our image is in BGR, but this framework uses RGB, so to make it compatible we have to convert it. We give it img and then cv2.COLOR_BGR2RGB. Once we have done the conversion, we simply send this image to our model: results = pose.process(imgRGB). That's pretty much it. That will give us the detection of our pose, but it will not draw anything. For now we can just run it to see if everything is working. You can see that the frame rate has decreased, which is good because it shows our model is actually running, and it is still almost real time, so that is really good.

Now we can draw the landmarks we have detected, but before we do that we can print the results as well. So we write print(results), and let's see what we get. Here you can see we get nothing useful; it's just a class, and we don't see any information. To see the information, we write results.pose_landmarks. If we run that, you will see that we are getting actual landmarks. Each landmark has an x, y and z value, and then it will
also have a visibility value, which says how visible it is. This is the information that we get, and we can put all of it in a list later on so that it's easier to access.

Then we check whether anything was detected. We say if results.pose_landmarks, so if this is present, then we write mpDraw.draw_landmarks and define our parameters. But you can see we don't have anything called mpDraw, so we declare it at the top: mpDraw = mp.solutions.drawing_utils. Then inside draw_landmarks we send in our image, and then results.pose_landmarks, which is the same thing we were printing out. I think that is good. Let's run it, and there you go. Now you can see we are getting all the points, drawn in red.

We can also add the connections, the lines between the points. Here we write mpPose.POSE_CONNECTIONS (not mp, mpPose), and that will fill in the connections. There you go. Now we have the green lines, which are the connections, and we have the points as well, and we are detecting them all in real time. That is very, very amazing.

So now we know that we are getting this information, but how do we know which is which? Here it just says landmark, landmark, landmark; there is no indication of which landmark represents what. We need to organize it a little so that it is in a list and we can simply use these values in our project. For example, I could say I want landmark number five, or landmark number three. If we go to the MediaPipe website, you can see the landmarks they have defined. For example, if I want the right ear, I can just say I want element number eight, so give me that
element. If I want the nose, I can say give me element number zero. Based on this it becomes very easy for us to create new projects, so gesture recognition and lots of different applications become very, very easy.

So how can we extract the information from this object? We write for id, lm in enumerate(results.pose_landmarks.landmark). We loop through the landmarks, and we want the count as well, which is why we have written enumerate; it gives us the loop count, 0, 1, 2, 3 and so on. Then we get the shape: h, w, c = img.shape, the height, width and channels. The reason we need the shape will be clear once I show you. We can write print(id, lm), so we print the landmark and the id number as well, and you know exactly what we are extracting. Let's run this. There you go: now we have the id number, and this is the information for each landmark. Wait, why is this showing? I think I didn't remove the other print. Anyway, you can see that from 0 to 32 we have all 33 landmarks. Let's go down and remove the previous print first.

So now we have the id and the landmark, but you can see the landmark values are decimals. Each one is a ratio of the image size. To get the actual pixel value, we take lm.x, which gives us the x value, and multiply it by the width of the image. That gives us the x of the landmark as an exact pixel value. We can do the same thing for lm.y and multiply it by the height. So we can do that, and then what
we can do is put them in our cx and cy variables, so that they are easy to use. The indentation is wrong; that is why it's giving an error. Okay. We also need to convert these into integers, because we are talking about pixel values, so they must not be floats.

To confirm that this is working and we are getting the correct values, we can draw a circle on top of each point. We write cv2.circle, and in the circle we give img, then (cx, cy), then a radius of, let's say, 15. 15 will be too big, so let's say 10. Then (255, 0, 255); this is purple, so let's make it blue, (255, 0, 0). And then cv2.FILLED. If we are detecting properly, this will overlay the previous points. Let's try it out. There you go: now we have those blue dots, and these are the ones that we drew ourselves. This means we are getting the correct information at the correct pixel values, and it is working well. We can reduce the radius to five. There you go, much better.

Let's try another video. We have been using the same one, and we have nine videos. Let's try a random one, number five. Yeah, that looks good. Let's try number six. Yep, that is good. Number three. There you go. Okay, here we had an issue, but now it's fine. So we can see that we are getting good results, and that should be good enough for us.

Next we can convert this into a module so that we can use these values very easily. First we right-click and create a new file; let's call it PoseModule. We copy all of the bare minimum code and paste it here. Now, the first thing to make it a module is to write if __name__ == "__main__" and then call main(). What this does is that if we are running this by
itself, then it will run the main function, and if we are only calling a function from it in another script, it will not run this part. So we write def main, and within main we write the dummy code. Whenever you want to show what a module is capable of, or you have a testing script, you can put it in the main function. So we take this part, cut it and paste it in main, and then we take all of this as well, cut it and paste it in main.

Now we are going to create a class. With that class we should be able to create objects, and it should have methods that detect the pose and find all these points for us. We call our class poseDetector, and inside it our first method is __init__, the initialization. We write __init__ with self, and then the parameters that we are expecting. If I go back to PoseEstimationMin and go into Pose, I can copy all these parameters and paste them here. The first one is mode, and we keep it as False so that we get fast detection plus tracking. Then upper body = False, then smooth = True, then detection confidence = 0.5, and then tracking confidence = 0.5. These will be our initial parameters.

Then we write self.mode = mode. If you are not familiar with object-oriented programming, this basically means that whenever we create
a new object, it will have its own variables. Whenever you write self.something, it is a variable of that object. So whenever we use a variable within our class, we write self.something, for example self.mode or self.upBody. Here we are saying that this variable of our object is whatever the user has provided, so with the default it will set mode to False for this instance. We do the same for the rest: upBody = upBody, smooth = smooth, detection confidence = detection confidence and track confidence = track confidence. As I mentioned, we just have to put self. in front of each one of these, and that should be good to go.

Next we also have to declare these, and again they will be part of the object, so we need to write self: self.mpDraw, self.mpPose and self.pose, and self.pose is created from self.mpPose. Earlier we were not using any of these parameters, but now we have to, so here we send in all the parameters: self.mode, self.upBody, self.smooth, self.detection confidence and self.
tracking confidence. That is pretty much it. Okay, we need to put a comma here. So our initialization is done.

Now we can create a method to find the pose. We write def findPose, and we do have to write self here; whenever we create a new method, we have to write self first. Then we give in our image, and we also have a flag called draw, which we set to True. This asks the user: do you want to draw or not, do you want to display it on the image or not? We do want to display. So we put this code inside the method. This for loop will be separate, and these results lines go here. Let's indent this and remove that. Now we need to add self., so we write self.pose.process and self.mpDraw, and also self.mpPose. Anything missing? No, it seems fine.

Now we need to use that flag, so we write if draw, and then we draw. Or should we put it inside? Let's put it inside: if the landmarks are detected and draw is set, then draw. By now we should have a working class, so we can create an object from it and run it. Let's try that out.

In main we write detector = poseDetector() with the default parameters. And what is happening here? This should be inside the while loop. So we write detector.findPose. Why is it not detecting it? Oh, the indentation is wrong. Now it will detect. Okay, so detector.findPose, and we give in our image. And we need to return the image,
so we will write return img, and we can bring the image back here with img = detector.findPose(img). If we run this, it should draw on our video. Let's run the pose module, and there you go: it is running and it looks good. Let's try it on the first one. There you go, this is quite good.

Next we can do the main part, which is to find the points. Here we define getPosition, and inside it we want an image and again the draw flag, which by default we put as True. Then we can uncomment this. First of all, we are getting an error for results, because we need to write self.results here, and here, and here, and here. That should be good. Then we need to push this into the for loop. Right now we are just looping; we need to check first whether the results are available, and only if they are available do we use this for loop. Then we need to put the values in a list, so we create a list, lmList, for our landmarks, and we append our values. It depends on you what kind of values you want to append. I'm going to append only the id and the x and y values; if you want to append z and the visibility, you can do that too. So I write lmList.append with id, cx and cy. And I add the option that if draw is set, then we draw the circle. This looks pretty good.

Let's go down and try it out. In main we write lmList = detector.findPosition. Is it find position or get position? Let's keep it as findPosition, so it's similar to the previous one. Then we give in our image, and let's keep draw as True so we know it is detecting. We can print out the list as well, so we can print out lmList,
and there you go. Wait, why is it printing this? That is not right. Okay, it is printing this part. Any other prints? No. So let's run again. Right now it's None. Wait, why is lmList None? Oh, we didn't return it. We forgot to write return lmList. Okay, now we can run it, and there you go: now we have the list, and if we go to the end you will see that the last index is 32, so we have 33 points because we start from zero.

Now I can easily say, for example, I want number zero, number eight, number five, number twenty-three, whatever I want. Let's say we want number 14, so we print lmList[14]. There we have it, this is number 14. And if I wanted to, I could draw number 14. I can set draw to False, and then in the circle I can write lmList[14][1] and lmList[14][2]. I can make the circle really big so I can see exactly what I am tracking, and I can change its color to red so that it is different from the others. There you go: you can see we are tracking this elbow. That is how easy it can be. By the way, whenever the video finishes, it will give you an error. You can loop the video, and there's a function for that as well, but we're not going to do that here.

So this is the basic idea of how you can convert this into a module, and now you can use it in any project you want and easily find the positions. How exactly can you do that? Let's create a new file and call it Awesome Pose Project. We copy the parts from main, which, as I said, is the testing code, and paste it in our awesome project. Now we have to import: we write import cv2, then import time, and then we have to import our module. Our module needs to be in the same folder; if
it's not in the same folder it will not work. So we write import PoseModule as pm, because it's a shorter name, and then we can write pm.poseDetector(). This will create the detector, and everything else will run the same way. If we run our awesome project now, you can see it runs exactly the same way. So now I can use this in many different applications.

Let's test some other videos. Let's say number two. There you go: again we are detecting the right elbow, and you can see it's tracking really well. Even when it's a little bit hidden, it's still tracking it. That's really good. Then let's try number three. What happened to number three? There is an error: list index out of range. This is a problem because we are not checking whether the list is actually filled. So we need to check that the length of lmList is not equal to zero, and only then do these two things. We should change that in the main module as well. Let's run the awesome project again, and this time it works. Probably it was not able to detect anything in the first frame, and that's why it was giving this error. Now you can see it's detecting that elbow.

Let's try number four. There you go, detecting the right elbow very nicely. Number five: there we have it, excellent. The frame rate is amazing, so you're getting real time. Number six: pretty good. And the best part is that it's running on the CPU; it's not using the GPU. Then we have number eight. Okay, that looks good. And then number nine. Yeah, that is pretty good. There are a few instances where it gets it wrong, but overall you can see it's really good.

So this is basically how you can detect pose in an image or in a video, and you can do the same thing on a webcam. I'm not doing this on a webcam because it will
be very hard for me to test that again and again; I would have to stand up, go back and be in the frame to get it up and running. That's why I'm using these videos. You can see that it is fairly good, and all you have to do now is use this pose module. All three files, the minimum code, the module and the awesome project, will be available on my website. A lot of you ask me: how do you access the code? The code is showing as grey text; it's not available. Well, you just have to register and log in on the website, and then you can access most of the code just by enrolling in the course. I think 95% of my courses are free, so you can just log in, click on enroll, and then you can run it.

So this is it for today's video. I hope you have learned something new and that you can use this in your amazing projects. I would love to see what you come up with. If you do make a few of these projects, do tag me on Facebook or Instagram. If you liked the video, give it a thumbs up; if you really loved it, share it with your friends; and if you haven't subscribed yet, do subscribe. I will see you in the next one.
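
The captions describe turning each landmark's ratio values into pixel coordinates, collecting them in a list, and checking that the list is filled before indexing it. That step could be sketched in plain Python as follows. This is a minimal sketch and not the code from the video: `Landmark` is a hypothetical stand-in for a MediaPipe landmark, and `landmarks_to_pixels` and `get_point` are illustrative names.

```python
from collections import namedtuple

# Hypothetical stand-in for a MediaPipe landmark: x and y are ratios of the
# image width and height, not pixel values.
Landmark = namedtuple("Landmark", ["x", "y", "z", "visibility"])


def landmarks_to_pixels(landmarks, width, height):
    """Return [id, cx, cy] entries with cx, cy as integer pixel coordinates."""
    lm_list = []
    for idx, lm in enumerate(landmarks):
        # Ratio times image size gives the pixel position; int() because
        # pixel coordinates must be whole numbers.
        cx, cy = int(lm.x * width), int(lm.y * height)
        lm_list.append([idx, cx, cy])
    return lm_list


def get_point(lm_list, idx):
    """Return (cx, cy) for landmark idx, or None if the list does not hold it."""
    if not 0 <= idx < len(lm_list):
        return None  # nothing detected in this frame, or index not present
    return lm_list[idx][1], lm_list[idx][2]


# Two made-up landmarks on a 640x480 frame.
sample = [Landmark(0.5, 0.25, 0.0, 0.9), Landmark(0.125, 0.75, 0.0, 0.8)]
points = landmarks_to_pixels(sample, width=640, height=480)
print(points)                 # [[0, 320, 120], [1, 80, 360]]
print(get_point(points, 14))  # None: this sample has only two landmarks
print(get_point([], 14))      # None: empty list, no "index out of range"
```

Returning None for a missing landmark is one possible design; the video instead checks that the length of the list is not zero before using it.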
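
The frame-rate calculation from the captions, 1 divided by the current time minus the previous time, can be isolated in a small helper. This is a sketch under assumptions: `FpsCounter` is a hypothetical name, the `now` argument exists only so the arithmetic can be checked without a video loop, and, unlike the video, which starts the previous time at zero, this version reports 0.0 for the first frame.

```python
import time


class FpsCounter:
    """Frame rate from the time elapsed between consecutive frames."""

    def __init__(self):
        self.previous_time = None  # no frame seen yet

    def update(self, now=None):
        """Call once per frame; returns FPS, or 0.0 when it cannot be computed."""
        current_time = time.time() if now is None else now
        fps = 0.0
        if self.previous_time is not None:
            elapsed = current_time - self.previous_time
            if elapsed > 0:  # avoid dividing by zero on identical timestamps
                fps = 1.0 / elapsed
        self.previous_time = current_time
        return fps


counter = FpsCounter()
first = counter.update(now=10.0)    # first frame: nothing to compare against
second = counter.update(now=10.04)  # 0.04 s later, roughly 25 FPS
print(first, round(second))
```

Inside the video loop, the returned value would be converted with str(int(fps)) before being passed to cv2.putText, as described in the captions.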
Info
Channel: Murtaza's Workshop - Robotics and AI
Views: 72,344
Id: brwgBf6VB0I
Length: 41min 2sec (2462 seconds)
Published: Tue Apr 06 2021