Detect 468 Face Landmarks in Real-time | OpenCV Python | Computer Vision

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hey everyone welcome to my channel in this video we are going to learn how to detect 468 different landmarks on faces we will use the model provided by google that runs in real time on cpu and on mobile devices if you would like to create real-world computer vision apps do check out my premium course in which we learn how to create apps such as object detection augmented reality document scanner and a lot more the link is in the description below so without further ado let's get started so here we are in our pycharm project and we have created a new project called face mesh project now we have a special folder here by the name videos and if you go inside you will see that we have a couple of videos in fact more than a couple you can see here that we have a total of eight videos and each of these videos they have a different size so they are not actually the same width and height you can see this one is smaller so the frame rate might vary based on the size so let's try this one yeah this one is quite i think this one is hd or even full hd probably so yeah so these are some of the videos that we are going to test on so what we will do is we will right click and we will create a new python file and we are going to call it face mesh basics so what we will do is we will learn the basics and once we know the basics we are going to create a module out of it so that we don't have to write it again and again for different projects so the first thing will be to install our libraries we will go to file settings and the project interpreter and we will add the opencv python and we will also add media pipe which is the library provided by google that we will be using to detect all these different 400 plus points on the faces so one of them is already done the other one is also done great so now we are going to import cv2 and then we are going to import media pipe as mp and then we will also import time to see the frame rate so as always the first thing we will do we will run our video and we will use cap is equals to cv2 dot video capture and we are going to give in our video path so here it is in videos videos slash one dot mp4 let's use the second one because the second one has two faces in it so that will be better when we are testing okay so then we are going to write while true we are going to say uh success and image is equals to cap dot read we are going to read our image and then we will say cv2 dot i am show we will say that it is an image img and then cv2 dot wait key as one so that is pretty much it so let's run it and see if it works and there you go so you can see the video is now running and we have two faces uh the first one i believe has only one yeah and it is quite big so it is i think hd i mean full hd so let's try let's work on number two and later on once we are done we can try number one number three four five and so on okay so the next thing we will do is to write the frame rate so here we are going to write c time is equals to time dot time and then we are going to write fbs is equals to 1 divided by c time minus p time so p stands for previous c stands for current time so then we will write that our previous time is equals to current time and up here we are going to define the previous time is equals to zero so that should give us the frame rate and then all we need to do is we need to put it on our image so we will write put text on the image and we are going to write fps and in the curly brackets we are going to write integer of fps and yeah that should be it and then we have to give in the location so 2070 and then we will give in the font cb2 dot font let's give it a plain font and what else then we will give the scale then the color we will keep it as green and then we will give in the thickness so yeah that should be enough so let's try that out and there we have it so now we are getting the frame rate it is quite high at the moment okay so that is pretty good we can bring this down now the next step would be to use our media pipe library to actually find the points the different points on the face so what do we need first of all we are going to write here mp draw is equals to mp.solutions solutions dot drawing utilities so this will help us draw on our faces now we could draw ourselves as well but the thing is that when they are using their own function they actually make some lines in between some connections between these points and that is quite complicated so rather than doing it manually we could use their function for displaying purposes but if you just want to see the points you can just draw circles by yourself as well i will show you how to do that as well okay then we are going to write mp face mesh is equals to face mesh is equals to mp dot solutions dot face underscore mesh so we will be using uh this to actually create our face mesh so we will write here face mesh is equals to mp face mesh dot face mesh i know it's quite a bit of repetition but this is what we need to actually create our object and from where we can actually find our faces so then inside this if i press the control button and i click on this it will take me to the function itself to the class face mesh and here it will tell me that what is the what are the input arguments so here we can see its static image mode is false then we have the total number of faces then minimum detection confidence and the minimum tracking confidence so the static image is whether you are using it only for detection or you are using detection and tracking so if it is a static image mode it will always detect in each and every single image but if it is false then it will detect and then it will track so detection is always heavier than tracking so therefore we will detect first whatever the confidence is above 0.5 which means that it has found the probability of 50 for a face then we are going to detect and then we are going to keep tracking that face if the tracking confidence is higher than 50 as well so we can change these parameters as well but for now we are not going to do anything here uh actually we can change the maximum number of faces because uh we want to detect two faces so here we can write max number of faces is equals to two uh the rest we can keep it as it is so then we are going to go down here and this actually accepts this class actually accepts only an rgb image so we have to convert it so here this image is bgr so we will convert it so we'll write image rgb is equals to cv2 dot cvt color and then we will write image cb 2 dot color underscore bg r to rgb so that is the idea and then we can simply write results is equals to face mesh dot process and we are going to send in our rgb image so that is the idea and now what you will see is that the frame rate has dramatically reduced there you go so now we are getting right 50s to 60s which is actually quite high considering that i'm only using cpu but if we do it on the first one it will be slower because that is full hd video yeah so it's around 40s and if we are doing it on an hd video then it is around 60s so that is pretty amazing okay so that is good and now we are getting some results but now we need to display them so to display them we are going to write if results dot multi-face multi underscore face underscore landmarks then if something is detected then we are going to go ahead and draw but the thing is that you can have multiple faces here so you need to loop through the faces before we actually draw so to do that we will write here for face landmarks let's say face landmarks in results dot multi multi underscore face underscore landmarks we are going to go ahead and loop through that and for that we are going to now draw we will write here mp draw dot draw landmarks and then we are going to give in our parameters so if we go over here you can see that you have the image you have the landmark list you have the connections so here first of all we will give in our image then we have the landmarks which is basically your face lms these are the face landmarks and then the connections so we are going to write uh mp face mesh dot face connections so let's try that out let's move it a little bit here okay let's try that out and there you go so as you can see now we are getting the faces and both the faces are being detected and if you go back and you write here one you will see that only one face is detected there you go and you can see how fast and accurate that is which is pretty amazing to see okay so what if you want to change the size of let's say the thickness of the circles or the thickness of the lines around it so what what if you want to do that well to do that you can write here some specifications so you can write here draw no not mp draw draw specs is equals to mp draw dot drawing specifications and there you can write that my thickness is equals to 1 and my circle radius is equals to 2. let's see because i'm changing right now i think initially it is 1 1 or 2 2 something like that so we are changing it a little bit so that we can see an effect so once we do that then we will go to our mp draw over here and we are going to write our specifications here so we will say that landmark landmark uh drawing spec i think it is the next one so we can directly write yeah so these are the next two parameters so we can directly write them so we can write here that it is draw specs and then again draw specs so let's run that and there you go so now you can see it has changed and um let's say i increase the radius like dramatically let's put it five so there you go so now you can see all these weird points and let's see let's increase this as well to five and there you go so now it is completely blocked so anyways you get the idea so you can put one one here one one with an hd video is fine you know it looks good but if you have a full hd video then one one is is way too subtle at least to me like i can't really see the points here so if you put maybe two and you put here two then it is more visible yeah something like that so anyways you can play around with this all day and you can see which one suits you the best so this is basically the the basic idea of how you can draw now that is all well and good but in reality when you are creating a project you need to use these points you need to know their positioning to actually use them in a project so how can you get the actual points now there are a total of 406 points so that's a lot but what we can do is we can at least look at them and we can maybe number them maybe you don't know which one is which so we can put the numbering over there to see which is the nose which are the starting of the lips starting off the eye edge of the eye and so on so how do you get these values so what we will do is now here we are getting into one face so face lms is the landmarks of one face now in order to now in order to get further deep and find out all the different points we are going to add another loop so we will write here for lm in face landmarks dot landmark we are going to get each of these landmarks and we are going to print it so we are going to write here print lm so if i run this now you will see these are the landmarks so you get the x position ui position and z position so this is the basic idea so now these landmarks we are going to first convert them into pixels so that we can uh use them right now they are normalized from zero to one so we are going to write here i h i w and i c i stands for image image height image width image channels is equals to image dot shape so when when we get the shape now we can multiply it with the normalized values to get the actual pixel values so what we are going to look at is the x and y if you want the z as well you can add that as well but i'm going to write x and y only so to get the x and y value all we have to do is we have to write l m dot x and then we have to multiply it with the width and for the height we will write lm dot y multiplied by eye height so that is the basic idea so now we have our values uh in terms of pixels and we can do whatever we want with them so let's first of all print them out so we will write here x and y and we can comment this part so let's run that and if we go back you can see these are the points that we are getting so if we want to get the id for it well we can put it in a list and we can check the index of that list but if you just want to look at it you can write here enumerate and here you can write id and what you can do is you can write here id so that will give you the id number and then it will tell you the actual value so here you can see it starts at or does it start okay it starts here at zero and then all the way it goes till 467. so we have a total of 468 values and each value has an x and a y point so a y x and a y number so here what we will do is we will put this in a list so what we can do or or let's let's keep it till here and then now what we can do is we can create this into a module and in the module we are going to put it in a list and we will return something so let's keep the basics still here that we are getting the numbers and everything and yeah that should be good so let's run it for two faces and let's run it with video number two there you go so if we go down here you will see a lot of values being generated that's good we could also add an enumerate here and we could write which face number are we talking about face number one or face number two but uh let's let's forget that and let's go ahead and do the what do you call module so the idea of the module is that once you create the module you don't have to write all of this initializations and all of this conversions again and again so all you need to do is you need to call that function or that method within our class and that will do the magic for you and it will return you the values and you will be happy to work with it so how do we do that we right click we go to new we create a python file we call it face mesh module module okay so what we will do is we will go ahead and copy all of our code and now we are going to convert this into a module so for the module the first thing we have to do is we have to uh write what to do if you are running the module by itself so we will write here if underscore underscore name is equals to underscore underscore main then we are going to run our main function and we will define our main function here main and we will write down our loop inside it so we will copy this part actually we will cut it and we will paste it here and then we will go here and we will cut this and we will paste it here and what else we will cut this and we will place it above the while so that gives us the initial part so if we were to comment all of this and if we were to run this it should run so let's try that there we go so this is like we are starting from the scratch um okay so then we are going to convert it into a class so here we are going to write class is equals to uh face mesh detector you can write a better name probably but we are going to use face mesh detector and then we are going to define our initial method uh for initialization and we are going to give in some parameters now these parameters will be the ones that are uh where is it this one so for the face mesh so whatever parameters we have here we are going to give in to our object so let me write down here so we have static image mode so we will write here static mode is equals to false so this will be by default false and then max faces is equals to 2 by default then minimum detection confidence is equal to 0.5 and then minimum track confidence is equals to 0.5 so we are going to write these and then we have to uh tell that these are the values of this instance so we will copy this twice then we will copy this again we will copy this again and we will copy this again so here we will put equals to equals to equals to an equals to and here we will write self self dot this so we will copy that self dot self dot self dot so if you are not familiar with this i would highly recommend that you check out uh object oriented programming uh the basics of object-oriented programming so then we are going to uncomment this and we are going to write self in front of each of these and then we are going to write self here as well and here in the max number of faces and all of this we need to write our new variables so here we have self.static mode self dot faces then minimum detection let's write it in a new line and then minimum tracking so that is the idea so that is good for initializations uh this if you want to make it a parameter here as well feel free to do that i think it should be fine without it okay so next we are going to write a function called or a method called find mesh face or faces i don't know find face mesh let's say face mesh and then we will write of course self and then why is it not okay the indentation is wrong it should be here okay so then we are going to write image and draw so we will have a option to uh draw or not draw so it will be a flag so we can uncomment this and we can go back up here and then we will see what is missing so this indentation is wrong so we need to go back okay so then we will just copy this self dot and again we will start putting the self dot everywhere and self.results uh self.mp draw whatever it's giving an error just put a self dots okay so that is good and wait what happened here i need to go back okay so that is good and now we should be able to see our results without going into the returning part we should be able to see the result but we didn't create an object or we didn't call the find face so it will not do anything so we need to do that we will write here detector detector is equals to face mesh detector and then here we are going to write here that our image is equals to detector dot face mesh or is it find find face mesh here find face mesh and we will give in our image and we will keep its true for the drawing part and here we are going to return our image so we will write here return image so that should be good and let's see if it works there we go so now it's working as an object but the last thing we have to do is we have to convert this so that we are getting our values in return so that's the main thing so and again the drawing part again is optional so we can directly put here if draw then do this so if i run this now it should draw if i go down here and i write here false false it should not draw it will work but it will not draw anything so that's good okay so what is next yeah so now we need to uncomment this and for every face we are going to go through every landmark and through every landmark we are going to convert it into x and y and then we need to store it so where do we store it we store it in let's say uh a variable called face so this will be a list so we will store it in list face dot append and we are going to append the x and y so this will be the x and y value now that is good for one face but we have multiple faces so what do we do we create another list we call it faces plural and this time around we append after the loop we append faces dot append face so basically when we are looking for the landmarks we append the landmarks in the faces and then once we have that face with all the landmarks then we append the faces so that we get the final result so that is the basic idea actually let's put it outside so it doesn't give random error that uh it has been used before declaration so yeah so in any case we are going to return faces so even if it's empty it doesn't matter we are going to return it so that's the idea and then in the image here i can write here faces and that should return the faces um what we can do we can print here so we can write here if the how can we write this uh the length of faces but if we write the length of faces it will be yeah it will be something okay yeah that should work if the length of faces is not equals to zero then we are going to print the length of faces let's say so let's see how many faces do we get okay we are printing a lot of things that's why we are not having a clear picture so yeah let's run it again there you go so now it is showing one face and why is it showing one face because we put we put maximum two faces why is it showing one that is weird max faces is two did we say something here are we running the module yeah we are running the module still saying one okay let's see why does it say one so it should be four oh this should be inside the loop my bad so now it should work there you go so now you have two faces being detected and if we go to video number one uh where is that with your number one then it will be one face so you can see here it's only one face so that is the idea and then what we can do is to check whether we are on the uh correct path or not we can write here faces at zero let's see what does it print so oh actually it's printing the length no no we yeah actually it's good to see that it has 468 points so that means we are getting all the points that is good but now let's print all the points it will be a long list there you go so now you can see these are all the different points that we have so this is quite good okay so one more thing we can do now if you're not familiar with which point is which number then what you can do is you can print the id number over here so you can write that cv2 actually we wrote it somewhere yeah why write it again if we have it already we can write here that cv2 dot put text and we are going to put the text of our id so let's just write here string id and where do we need to put it we need to put it at x and y so this is the x and y position uh 3 is way too big so we are going to put it as 1 1. uh let's see how that works out so this is going to print out the id number of each of the points oh boy okay so it's like a matrix okay so that is bad um maybe we need to look at a video that is more focused on the face doesn't have a lot of other stuff let's see maybe this one maybe this one will be better let's try that this number six there you go so now it's much better so still not that good we first of all let's let's change the maximum to uh max faces is equals to one so we only look at one face and then let's put this as 0.5 okay it's going to the kid i was hoping it will attach to the elderly person but no did not happen okay let's keep the maximum faces as 2 and let's make this even smaller 0.3 let's try that no that's not readable 0.5 is let's try that again okay so i can see one is here four five then it goes to one ninety five one fifty one nine eight so yeah it's a little bit harder to read still 1.7 yeah now it's a little bit better here at the edges you can see very clearly what are the points numbers so 21 54 103 67 and so on but in the middle area especially with the nose i think one i can see one here so one is the nose uh the center of the nose let's let's try another video hopefully we will get something better so number by the way you can read all of this in the paper so if you go to media pipe website they have a paper listed there and if you go in the paper you will find these you'll find more information on these points so yeah now it's hard to see with this maybe if you have just an image and you apply that apply this method on the image and then you scale it up to see the numbers maybe that will work so anyways this is the basic idea of how you can detect 468 points on a face and that running on a cpu all of this running only on a cpu so that is a pretty amazing task and the results are pretty good you can see let's try out different videos so we have video number one you can see it is pretty good video number two actually let's let's uh remove this part let's remove the id and let's keep it normal yeah so let's try it again wait what happened there uh oh yeah the draw is false so we need to remove that there you go that is pretty good you can see it is very smooth it's very smooth yeah then let's try number three there you go even when the faces are a bit far it is detecting quite well number four that is good okay when it goes to the side it disappears uh that is understandable uh then let's try number five this seems like a like a zoom meeting yeah could be used in that with the frame rate we are getting it could be used in a zoom meeting okay let's try number six there you go uh it's flickering a little bit here maybe because they're merging the faces at some point we are probably because of that then let's try number seven how many do we have eight yes okay that's good okay when she goes down then uh of course it will not detect but as soon as she gets back up uh the face is detected there you go the person is laughing and you can see that that is pretty good okay so this is it uh for today's video i hope you have learned something new if you like the video give it a thumbs up and don't forget to subscribe and share it with your friends and i will see you in the next one
Info
Channel: Murtaza's Workshop - Robotics and AI
Views: 89,118
Rating: 4.966186 out of 5
Keywords:
Id: V9bzew8A1tc
Channel Id: undefined
Length: 39min 33sec (2373 seconds)
Published: Sat May 01 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.