AI for Everyone LESSON 18: Mediapipe for Hand Detection and Pose Estimation

Captions
hello guys this is paul mcwhorter with toptechboy.com and we're here today with episode number 18 in our incredible new tutorial series where you're going to learn artificial intelligence or you're going to die trying what i'm going to need you to do is pour yourself a nice tall cup of ice cold coffee that would be straight up black coffee poured over ice no sugar no sweeteners none needed and as you're getting out your coffee as always i want to give a shout out to you guys who are helping me out over at patreon it is your support and your encouragement that keeps this great content coming you guys that are not helping out yet take a look down in the description there is a link over to my patreon account think about hopping on over there and hooking a brother up but enough of this shameless self-promotion let's jump in and let's talk about what i am going to teach you today and what i'm going to teach you is some really cool and some really exciting stuff what i'm going to show you is how to do hand detection and pose estimation and what do i mean by pose estimation not only do you find the hands but you find the location of every little knuckle and every little joint of your hand so that you can not only know where the hand is but what the hand is doing this opens up some really exciting possibilities you can start doing gesture based computing where your webcam is watching you and decides to do certain things on your computer based on what your hand is doing pretty neat pretty cool right now to be honest with you this is something that i could have taught you in lesson number one but if i would have done this in lesson number one you would have just been copying and pasting what i was doing you would have never understood it i spent the first 17 lessons that was 34 videos with the homework solutions 34 videos to get you to the point today that when i teach you this you will actually understand it you will understand python array
and data structures you will understand opencv you will understand how to treat an image from a camera as nothing more than a data set that you can analyze and you can manipulate and because we have laid the foundation you're going to be able to not only understand what i'm going to teach you today you're going to be able to take it out and do projects on your own do some really exciting stuff on your own but enough of this introductory banter let's jump in and let's get ready to learn some new stuff so what i will need you to do is open up the most excellent visual studio code and what we're going to do is we're going to come over here and i better get out of your way before you get very very angry with me we're going to come to our python working folder we are going to say add a new file and this is going to be opencv-28 i do believe dot p-y and the dot p-y is kind of important we're gonna hit enter and boom a fresh new python program just waiting to be written all right what we're going to do now is we don't want to start from scratch we want to start from our standard program that launches the webcam so i will need you to go to the most excellent www.toptechboy.com come down to this happy little search bar and what i need you to search on is faster launch and with a little luck we should have this come up faster launch of webcam and so that is what i indeed want and here it is we are going to come down here and we're going to grab this little bit of code by clicking on these two page icons that selects the text right mouse click copy you snagged it you come over here and you paste and hey guys i got a new camera and this camera runs really fast without competing with wirecast my studio software so now my camera stuff and my opencv stuff are working much quicker they're running much faster now because this camera isn't bogging everything down but there is one little thing about this camera that i am going to have to make an adjustment
first of all here that uh you know that if you have a webcam you're probably on camera zero or camera one i believe this new camera is camera 4 for me you don't have to worry about that just make it whatever works for you and then the other little thing is is that for this camera as much as i love it it does not take these cam set commands and so i need to resize my frame right after i read it and so i just need to do a frame is equal to cv2 dot resize why am i showing you this so if you see it in my code you know what it is for a frame and then what i want to go is i want to resize it as width and height like that so here i got to read it in big but then i immediately make it the size that i want which we could say here is 1280 by 720. let's run this thing just to make sure that we haven't broken anything in the first three minutes of the lesson so let's run it and see if this thing is going to work boom beautiful view of the top of my green screen in my ceiling and why did i put it up there so that i can give you hand gestures and they will all be appropriate hand gestures okay and so have a little sip of a copy very happy do you see how clear that is and you see how it runs really fast without without bogging down my studio software so i'm glad i've got that working better now okay so now we are ready to start doing uh hand detection and pose estimation all right we're ready to do hand detection and pose estimation so what we need to do is we need to install a library now you guys that have been following along in this class taking this class it's going to make a lot of sense and all you got to do is do what i do if you're one of these drive-by shooters that just came into the middle of the class and landed on this lesson what you have to know is is that i am operating on i am operating on python version 3.6.6 i'm in opencv version 4.5.3 and i'm about to install mediapipe now i'm going to do it in the virtual environment that we're working on in this class in 
our ai virtual environment we're going to install it in that virtual environment if you're just hopping in here from somewhere else you just need to make sure that you install this in whatever your working python environment is okay so i think that you can follow along if you want to understand what i'm doing and how i'm doing it look in my ai playlist my ai for everyone playlist i believe it's lesson number three i explain about virtual environments okay but what you know in these virtual environments is we are operating in the virtual environment you can see it down here the pi a i 3.6 this pi ai 3.6 is the virtual environment that we're working in well if we want to install a library in that virtual environment we better activate it so i'm going to come down here and let's see you probably cannot see that there i think you can see that so what i'm going to do is i'm in my python working folder here and then what i'm going to do is i'm going to activate that virtual environment so it is pi a i 3.6 slash scripts with an uppercase s and then an activate and just to show you uh right we got our python main working folder and then we have this pi ai 3.6 virtual environment and then what i am going to do is i am going to activate that and boom pi a ai 3.6 we are in our virtual environment and just to make sure i'm going to type python it should come back python 3.6.6 very good it did and so let's just see if i say import media pipe i get an error it's not found why because we hadn't installed it okay so now that is what we are going to do and so we are going to do our old friend pip install and we are going to install media pipe media pipe can you see that and now we're going to hit return and we're going to hope that it finds something collecting media pop so far so good we'd like to see the happy little things start dancing around in front of us here we get a little nervous when it just sits there appearing to not do anything but we will remain calm so far no error 
no problem yours is probably going to go a little faster than mine ah there it goes okay so it looks like we are still starting a download that my friend is a very good sign it's probably going to take about two minutes to download and install this yours might very well go faster because while i have a very very fast computer my internet connection leaves a little bit to be desired sometimes so it might take me a little bit longer than you but we will just sit here and visit for a few minutes while this thing downloads and installs and what this is is is this is the library that we will use inside of opencv to be able to find the hand and then to identify all the little locations on the hand where the tip of your pinky finger is where your thumb joins the palm all of that sort of stuff and boom we just got a successful install i always like to start just to make sure for good measure i'll type python and then i'll say import media pipe before i go on i want to make sure boom no errors and so we should be good to go there and then i don't need to sit inside of this virtual environment so i will deactivate it okay now what we're going to do is we're going to come up here to our program that launches the webcam and after importing our old friend cv2 after importing our old friends cv2 we're going to import our new friend media pipe and because i don't want to sit here all afternoon typing media pipe we will import media pipe as mp and so i can reference the media pipe library by simply referencing mp okay now there's going to be two objects that we need to create we need to create an object that does the hand detection and the hand analysis i'm going to call that object hands hands plural why because i might have two hands in the picture i might have more than one hands in the picture so this object is going to be associated with some number of hands and so hands is going to be mp for our media pipe it's going to reference that library and then in that library it is 
going to reference solutions and then little hands this little hands is completely different than this little hands this little hands on the left i could have called it a kitty litter box or whatever i wanted to but this hands with the little h that references very specific methods inside of the mp.solutions library okay and so we're going to set our object hands equal to mp dot solutions dot hands dot wait for it hands with an uppercase h do you think they could have made it a little more confusing probably not but there's a solutions dot hands with the lowercase h and then dot hands with the uppercase h okay now we've got to give it some parameters the first thing is we got to tell it whether we are dealing with a still image because if we're just going in and analyzing a still image it looks carefully it does everything in a very in-depth way because it's just a single image but if it's not a static image we need to tell it false it is not a static image well we are going to be working on the webcam so it's not going to be a static image so is static image no false it's not okay then what is the maximum number of hands it should look at well i don't mean to be parochial but i am a two-handed individual so i'm gonna look for two hands you guys out there that are maybe three-handed individuals you're gonna have to figure out how to handle that but i am doing this lesson for the two-handed individuals okay so i'm going to say i will look for two hands and the reason i'm not looking for four is not only do i not have four hands but i'm not concerned about like having a group of people and finding all the hands i'm thinking like i'm in front of the computer and i want to do some sort of hand gestures to make the computer do things so i'm thinking in terms of gesture so i'm only going to be looking for two hands and then you got to kind of tell it how confident do you
need to be in the detection well i'll say 0.5 and then how confident do you need to be in the tracking i'll say 0.5 i played around with those two numbers and it doesn't seem like it's really worth fooling with so i just use the default of 0.5 and 0.5 if anybody can tell me why when i hit a point it gives me a double point well of course it didn't do it that time but it seems like this auto fill for some reason is giving me two points when i type in one point and that is getting most annoying one of you smart guys out there tell me how to make that stop okay i've got this hands object that will look at a frame and find and analyze the hands now i might want to draw those hands or draw that data or plot that data so i'm going to create a draw object mp draw you can call this whatever you want i call it mp draw and then that is going to be mp.solutions just like above mp.solutions but then what i'm going to call it there is that double thing do you see that i swear that thing is giving me doubles okay mp.solutions.drawing underscore utils drawing utils okay you guys i hope can see that should i make it a little bigger okay maybe that'll help you see what i'm doing a little better okay so i'm going to make a drawing object from mp.solutions.drawing_utils so if i want to analyze a frame for hands i use the hands object if i want to annotate my frame with the data i use the mp draw object okay i think that should make pretty good sense and so you're going to probably be surprised how easy this is going to be but the first thing is i read the frame and i resize it okay that makes sense right you understand why i had to resize it you don't have to resize it now opencv when it grabs that frame it works in what blue green red bgr the rest of the known universe operates in what rgb including mediapipe so what do we better do we better create a new version of frame which is b g r and that is going to be equal
to cv2.cvtColor and then what do we want to change the color of we want to change the color of frame okay man this is really getting annoying we want to change the color of frame and how do we want to do it cv2.COLOR_BGR2RGB we want to go b g r to r g b like that okay so now we got a frame bgr or i shouldn't call that bgr should i that is frame rgb frame rgb is the rgb version of the frame that we just read in okay now what do we want to do we want to analyze that so we want to get some results you could call this pumpkin patch if you want to but i'm going to call it results because it's what's going to come back from this hands object that i created so what am i going to do i am going to call the hands object and then what i want to do is i want to process and what do i want to process frame rgb okay and so this then should go out and analyze it this won't show anything i just want to make sure i can run it without the thing dying okay so it launched the camera i can wave at the camera it's doing analysis but that's all it's doing it's just returning it into results and we haven't really looked at results well the first thing we need to do is say did we get a result because if we don't get a result it's just going to return none so is there a result and so like when i first ran it there was no hand in there so there would be no results well you don't want to go further if you don't have a result so you're going to say if results and what part of results it will be multi underscore hand underscore landmarks okay now i got to tell you results isn't like a simple array or something like that it is a very very complicated data structure and it's very very hard to understand and therefore what we're going to be spending time on today is trying to break that apart but the one thing is that those arrays of hand positions and the positions of all the parts of your hands
those come back in the results object and they are inside of results dot multi underscore hand underscore landmarks now you're going to have multiple hands you would have all the landmarks for this hand and you would have all the landmarks for this hand but to begin with we're just getting all of that data in results dot multi hand landmarks now if results dot multi hand landmarks is not equal to none so if it is equal to none there are no results you didn't find a hand don't do anything go get another frame okay but if it's not equal to none that means there's something there you found a hand or you found multiple hands so then you want to jump in and you want to start analyzing that now what you have to see in this results dot multi hand landmarks if there's more than one hand you're going to have more than one array of data so again this is going to be kind of like that concept of an array of arrays each hand is going to have an array so what do we want to do with that we want to step through it so i'm going to say for hand landmarks in what results dot multi hand landmarks i hope that's right okay now does that make sense this results dot multi hand landmarks could have multiple hands it could have multiple sets of landmarks and what do i mean by landmarks i guess i should stop and show you this is what it actually returns this is kind of a picture of what it returns it looks at your hand is that the right way i hope it's the right way it looks at the hand and it divides your hand into 21 different points the position of your wrist is point zero the position of where your thumb now i think it's actually the bottom part of your thumb that is inside of your palm that's point one and then where your thumb comes out of your palm that's two and then this joint is three and the tip is four so you kind of have five points from the wrist up to the tip of the thumb and then each finger sort of has four different points on it
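As a small reference for the picture just described, here is a minimal sketch that rebuilds the 21 landmark names in index order using plain Python. The names follow the MediaPipe hand landmark naming as far as I know it (WRIST, THUMB_CMC through THUMB_TIP, then MCP, PIP, DIP and TIP for each finger); the list name LANDMARK_NAMES is only an illustration and is not something the lesson defines.

```python
# Sketch only: rebuild the 21 hand landmark names in the order they are indexed.
# LANDMARK_NAMES is a made-up helper name, not part of the lesson's code.

# Index 0 is the wrist, indexes 1 to 4 run up the thumb.
LANDMARK_NAMES = ["WRIST", "THUMB_CMC", "THUMB_MCP", "THUMB_IP", "THUMB_TIP"]

# Each remaining finger contributes four points, from the knuckle out to the tip.
for finger in ["INDEX_FINGER", "MIDDLE_FINGER", "RING_FINGER", "PINKY"]:
    for joint in ["MCP", "PIP", "DIP", "TIP"]:
        LANDMARK_NAMES.append(finger + "_" + joint)

# Print the index next to each name, like the reference picture.
for index, name in enumerate(LANDMARK_NAMES):
    print(index, name)
```

With this list handy, a lookup such as LANDMARK_NAMES[20] tells you which point on the hand a given index refers to.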
and so if you take the wrist and then five times four you end up with 21 points to define sort of what your hand is doing and i hope that makes sense and they're sort of indexed in this way from 0 to 20 and 20 is your pinky tip okay and that is kind of how they're indexed now you don't have to worry about these names like thumb underscore tip or index underscore finger underscore mcp you've just got to kind of keep this data picture handy and then you have to know that it returns the data in this order so the first set of data is going to be for this position the second is going to be for here third fourth fifth and so forth so you got to know the data is in order and then if you have this picture you have no problems at all so now there could be multiple hands and so we're going to step through the multiple hands all right multiple hands would be results dot multi hand landmarks and then when we step through it one hand at a time that is going to be hand landmarks why is it plural because one hand has 21 data points in it so there are landmarks on the hand so the hand is singular now and landmarks is plural because there's 21 points if i would have done it let me come back over here i probably would have called this multi underscore hands plural that's what i would have called it because this has multiple hands and then the hand landmarks is one hand the 21 points associated with that one hand i don't mean to get stuck on this but man this is a really complicated data set i'm trying to kind of show you how we're going to bust into it to do something useful so we're going to step through the hands one hand at a time and that is what we are doing here okay so now what are we going to do as we step through the hands what do we want to do we want to draw them so we are going to mp draw dot what we want to
draw underscore land marks and then where do we want to draw them we want to draw them on frame we just use frame rgb to do the analysis but when we do the annotation when we do the drawing we want to put it on our original picture because that is where we are going to that is what we are going to display and so we want to draw it on frame and what do we wanted to put on it hand land marks and so it's the singular so this step is going to draw the hand now if you have multiple hands as you go through this loop it's going to draw this hand draw this hand draw this is going to put all the hands on there but we draw them one hand at a time so that's hand land marks okay does that make sense if this doesn't go back and watch it again and i know i'm repeating myself but i want you to understand this data structure okay so now here we draw the hand landmarks okay so now i think if we run this it might just work okay hold your breath good news it ran it didn't crash now if i put my hand up there what's going to happen boom look at that okay it put a point it put a point at each one of those right the wrist where the thumb joint the the thumb joint inside the palm okay where the palm where the thumb comes out of the palm the thumb joint and the thumb tip all the way out to that data point number 20 which is the tip the tip of the pinky right now let's look at this okay two hands boom we can analyze two hands now if i did have a third hand which i don't but if i did have a third hand and put it up there it wouldn't show it because i just said to show to show just two hands we could make it for more hands but i did it just for two okay boom so there we are let's do one other thing that's kind of neat here what we can also do is if we want it to be a little bit more obvious when we do the draw we can come in and we can add to it mp.solutions mp.solutions and can you guys see that mp.solutions you can almost okay there you certainly can mp.solutions okay and then mp.solution 
dot hands plural dot hand underscore connections all uppercase hand singular connections plural make sure you can see that yeah you can see that so we're going to add to the hand landmarks which were the little red dots we're going to add mp.solutions dot hands and then the hand connections okay and i think this is pretty neat boom look at that okay so you see now we've got like a little wire frame of our hands and that is really starting to annoy me let's see if i can okay there that gives me a little more space look at that so you see we have these little wire frame hands that is pretty cool okay so let's talk about the good news and let's talk about the bad news what's the good news the good news is in probably about 15 or 20 minutes i have you where you're finding the hands and you're doing pose estimation that is the good news what is the bad news the bad news is really all you have done is run one library that somebody else wrote and then take the output of that library and pass it to another library that someone else wrote and you're kind of like taking the output from a and feeding it to b and that's all you're doing and you really don't have a clue what you're doing and you really don't have a clue how this thing works and so yeah you could impress your friends by showing them your pose estimation and wow look at this program i wrote but then if you have all these ideas of things you could do you can't go do those things because you really don't understand the data structure because we caught the results data coming from our hands object we caught it and then we passed it to mp draw and we didn't even bust it open and the problem is the bad news is it is a very very complicated data structure that is very very poorly documented and it's kind of like this really delicious nut that is inside of this really really hard shell and it is hard to bust that shell open and get that juicy delicious little nut out
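Pulling the steps so far together, here is a minimal sketch of the wire frame program. It is not the exact code from the lesson: the function names hands_options and run_hand_wireframe, the window title and the camera index 0 are assumptions made for illustration. The options are passed by keyword because, as far as I know, the positional order of the Hands parameters has differed between MediaPipe releases. The imports sit inside the function so the options helper can be tried without a camera attached.

```python
# Sketch only: hand detection plus wire frame drawing, as described above.
# hands_options and run_hand_wireframe are made-up names for illustration.

def hands_options(max_hands=2, detection_conf=0.5, tracking_conf=0.5):
    """Keyword options for mp.solutions.hands.Hands, matching the lesson's choices."""
    return {
        "static_image_mode": False,            # live webcam, not a still image
        "max_num_hands": max_hands,            # look for at most two hands
        "min_detection_confidence": detection_conf,
        "min_tracking_confidence": tracking_conf,
    }


def run_hand_wireframe(camera_index=0, width=1280, height=720):
    """Show the webcam with hand landmarks and connections drawn; press q to quit."""
    import cv2
    import mediapipe as mp

    hands = mp.solutions.hands.Hands(**hands_options())
    mp_draw = mp.solutions.drawing_utils
    cam = cv2.VideoCapture(camera_index)  # camera_index 0 is an assumption
    while True:
        ok, frame = cam.read()
        if not ok:
            break
        frame = cv2.resize(frame, (width, height))
        # OpenCV delivers BGR, MediaPipe expects RGB.
        frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        results = hands.process(frame_rgb)
        # multi_hand_landmarks is None when no hand was found.
        if results.multi_hand_landmarks is not None:
            for hand_landmarks in results.multi_hand_landmarks:
                # Draw on the original BGR frame, since that is what gets displayed.
                mp_draw.draw_landmarks(
                    frame, hand_landmarks, mp.solutions.hands.HAND_CONNECTIONS
                )
        cv2.imshow("my WEBcam", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cam.release()
    cv2.destroyAllWindows()

# Call run_hand_wireframe() to start the webcam loop.
```

Leaving out the third argument to draw_landmarks gives just the dots, which is the first version shown in the lesson.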
that is what we are going to do now i'm going to show you how to get that delicious nut out of that hard ugly shell and if you watch me you will understand that and you will be able to go out and code on your own and you will be able to do things on your own and you will be able to do things like gesture based computing and all of that because i am not going to stop here just by having you do the cool demo i'm going to teach you how that data structure actually works does that sound pretty neat okay i hope it does time for a little swig of coffee so what we need to do is we need to see if we can just go in and try to kind of bust into that nut and see what's happening well what's the first thing that we got first thing that we got was results so what would be the obvious thing let's just see if we can print results like that what if we it can we look at that and do we see some arrays or some things that we could kind of figure out what's where so let's just print results we will come up here we will run it okay and the good news is it didn't crash but the bad news is you can see that what that results is it seems to be some sort of like class object and it's not like a simple thing of like an array of arrays or some simple data structure that you could then go in and start trying to look at that go in and grab the grab the data out of but at least we can kind of see that it was a class that that it returned well what we saw is we saw that we were able to step through results so there's got to be kind of like a chunk of something and a chunk of something and a chunk of something and we can step through those chunks and so if the whole thing didn't make sense can we just look at a chunk well what is the chunk it is print hand land marks that is what we get this is getting annoying what is this doing hand land marks like that print hand landmarks so what i want is instead of printing out the whole results what i want to print out is i want to print out the chunk the 
hand landmarks chunk and let's look at that and see if it starts looking more like data okay here all we're seeing is just that mediapipe solutions class but there's no hand and because there's no hand it would never get in that if statement so it would never print out data but let's give it a hand boom look at that seems like data is coming out so let's look at that data okay look at that so what it looks like is you've got some sort of data structure and probably what these are is these are for the 21 landmarks and this last one would probably be like number 20 which is the 21st really starting with zero there's 21 but this would be like the index of 20 or the 21st element and it's giving us what looks like the x position on the frame and the y position on the frame and it actually looks like it's giving a z maybe it's looking at how big that hand is and figuring out like how close or how far from the camera it is but we really don't care about z what we care about is x and y now if we look at these x and y values one of the things you'll see is they're between 0 and 1 and so probably for x 0 would be all the way to the left of the frame and one would be all the way to the right of the frame so it's probably a zero to one full scale on the frame and same thing for y zero would be the top and one would be the bottom of the frame and so it's sort of like normalized to one so if we're going to make it useful in a minute we're going to have to scale it back up from that zero to one range but you see the problem is we see all the data there but we got to kind of bust it down and what would we like to have we would like to have an array and what we would like to have in that array is 21 data points 21 tuples of the x value and the y
value of my wrist and then the x and y value of where my thumb moves inside my palm and then the x and y value of where my thumb comes out of my palm and then the x y value of the joint of my thumb and the x y value of the tip of my thumb in uh in tuples and so kind of just like if i was to to show you that what what i would sort of like is i would like to have something that is an array okay and inside the array are data points of like uh the the x0 comma y 0 as a tuple and then the next one would be the x1 comma y1 and then comma and then on down the line and then i would have a little data packet that would have all the information about my hand and that would correspond to this that if you look at like this position 4 here that position 4 has an x and y value of that point that end of my that end of my thumb i hope that makes sense so let's just jump back in here and then let me get rid of this and then let's see how we can try to break this down further to start pulling actual x and y values out of this that we can use well you can see that it seems to say this is landmark this is landmark this is landmark and so what i'm kind of hoping is i'm kind of hoping if i come here that i go all the way inside here i go all the way inside here and what i would like to be able to do is i would like to see now if this hand landmarks steps through the hands in the picture but the hand landmarks which is what we printed it has a whole bunch of data in it well i wonder and now i'm going to take this print out i'm wondering if i can step through if i can now step through the what step through the landmarks on a hand so here i'm stepping through the hands but now i want to see if i can step through for what am i going to call this i'm going to step through the land mark i'll call it singular i'm going to step through the landmark in results dot multi hand underscore let's see do i really want to do that no no no no okay landmark is going to step through hand hand land marks 
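Here is a minimal sketch of that target structure and of the nested stepping through hands and then landmarks. It runs without a camera by using stand-in objects that only mimic the attributes discussed here (multi_hand_landmarks, landmark, x, y, z); the helper names hand_to_tuples, all_hands and fake_hand are made up for illustration, and the stand-in values are not real detections.

```python
# Sketch only: turn a MediaPipe-style results object into arrays of (x, y) tuples.
# hand_to_tuples, all_hands and fake_hand are made-up names for illustration.
from types import SimpleNamespace


def hand_to_tuples(hand_landmarks):
    """Return [(x, y), ...] for one hand, in landmark order 0 to 20."""
    my_hand = []
    for landmark in hand_landmarks.landmark:
        my_hand.append((landmark.x, landmark.y))
    return my_hand


def all_hands(results):
    """Return one list of (x, y) tuples per detected hand, or [] when none was found."""
    my_hands = []
    if results.multi_hand_landmarks is not None:
        for hand_landmarks in results.multi_hand_landmarks:
            my_hands.append(hand_to_tuples(hand_landmarks))
    return my_hands


def fake_hand():
    """Stand-in for one detected hand: 21 points with normalized x, y and a z."""
    points = [SimpleNamespace(x=i / 20, y=1 - i / 20, z=0.0) for i in range(21)]
    return SimpleNamespace(landmark=points)


fake_results = SimpleNamespace(multi_hand_landmarks=[fake_hand()])
hands_xy = all_hands(fake_results)
print(len(hands_xy[0]))   # number of points for the first hand
print(hands_xy[0][4])     # the (x, y) tuple at index 4, the thumb tip position
```

In the real program the same two functions would be handed the object returned by the hands object for each frame instead of fake_results.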
okay so right i've got the hand landmarks as the one hand now i want to step through the individual landmark and hand landmarks dot land mark okay and that's telling it to take this whole array of all 21 landmarks and one by one step through landmark and so this l-a-n-d-m-a-r-k that is what i am putting here this one i could call it whatever i wanted to and i called it landmark upper l does that make sense so now that i have the data on the one hand i am going to step through the landmarks and then i need to somehow let's just see if we can print them okay so now what do i want to print i'm going to print an x y tuple and then look at this it seems like just like we could go dot landmark we could probably go dot landmark dot x and dot y and so what i could do here is landmark dot x comma landmark dot y why do i keep repeating myself so much because i want you to understand it and not just what i not just do what i'm doing land marks is all the data associated with one hand multi-hand landmarks is all the hands i go out and i grab them one at hand at a time in hand land marks and now all of those hand landmarks all of those hand land marks i want to step through them one at a time and then i want to grab the dot x and the dot y the dot x and the dot y and i want to print them out okay and then what i want to do after that i just want to print a blank line just so i can kind of see i just want to print like that outside the for loop and what's that going to do that's just going to it should print out 20 xy values associated or 21 xy values associated with all those points on my hand and then it should print a blank line so let's see if that'll work now initially it shouldn't do anything because it shouldn't see a hand all right and i need to get this all the way down here okay that looks good let me call the window back up okay so it's just saying that class that it's printing results and now let's see boom look at that okay so i quit and you can see that these are x 
y tuples. I should have 21 of them in a group, then another 21 in a group, and another 21 in a group. So you see what I'm doing: I'm going to the frame, I'm finding all the hands, then I step through the hands, and when I have a hand I step through all the points. Right now I'm not grabbing the points, I am just printing them. So what if I want to grab them? When I'm stepping through handLandMarks, I'm going to create myHand, and that's just going to be equal to an empty array. Once I'm in here I have found a hand, so I create an empty array called myHand, and now I want to append to it. What do I want to append? The individual landmarks, which are here. So I take myHand, I go dot append, and I append this tuple that I printed, the x comma y. I put that in like that. All right, now I'm going to take this print out, and after I fall out of this loop, after I've stepped through all those points, I'm just going to print myHand. Now I should have an array that has x, y tuples in it, so let's see if this will work. Okay, let me get that down. All right, I called this up, and now let's give it a hand. Boom, look at that! Do you see what this is? This is an array of tuples, x, y points. So myHand has all the data associated with this hand. All right, so that's good. Let's see now if we can do something on our own. I am interested in the tip of my pinky. So what do I do? I'm going to come in here, and first of all I'm
going to take out this canned routine, mpDraw, because that's cool, but you don't have any control over it. So I get rid of mpDraw, and at this point I've got this cool array of my whole hand. Well, what do I want to do? I'm going to do a cv2.circle. Where do I want to put the circle? On the frame. And where would the center be? I want it at my pinky. Where's your pinky? The tip of your pinky is element number 20. So when I come back over here, what would be the array? myHand, this nice huge array down here. And which element would it be? It would be element 20, which is the tip of my pinky. So I want to put a circle on the frame at the point myHand[20], which should be the tip of my pinky. Then a radius: let's give it a little bit bigger radius, let's say 25. And then the color, which is blue, green, red: let's make it blue and red, 255, 0, 255.
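The stepping-through described up to this point can be tried without a camera. What follows is a minimal sketch, not the code typed in the video: collect_landmarks and fake_hand are hypothetical names, the coordinates are made up, and SimpleNamespace objects stand in for the MediaPipe hand objects, which are assumed to expose .landmark, .x and .y the way the lesson uses them.

```python
from types import SimpleNamespace


def collect_landmarks(hand_landmarks):
    """Return one hand as a list of (x, y) tuples, in landmark order."""
    my_hand = []
    for landmark in hand_landmarks.landmark:
        my_hand.append((landmark.x, landmark.y))
    return my_hand


# Hypothetical stand-in for one detected hand: 21 landmarks with made-up
# normalized coordinates (fractions of the frame width and height).
fake_hand = SimpleNamespace(
    landmark=[SimpleNamespace(x=i / 32, y=i / 64) for i in range(21)]
)

my_hand = collect_landmarks(fake_hand)
print(len(my_hand))   # number of landmarks collected for the hand
print(my_hand[20])    # landmark 20, the pinky tip in the lesson
```

Note that the tuples here are still fractions of the frame, not pixel positions.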
So blue and red should make purple, if I am thinking right. And I am sorry, I'm going to have to make this smaller, otherwise that IntelliSense is going to drive us crazy. So now we've got the color, and now we need the thickness. Let's make it solid, which would be -1, I do believe. Now let's see if I'm thinking right: we should have the position of the tip of my pinky. What types of crazy mistakes do I make now? The fact it didn't crash should not necessarily make us feel good, because we haven't shown it a hand yet, so nothing could really happen. So let's see. Oh, and we did crash, in fact: circle, overload resolution. Ah, yeah, yeah. Okay, so what is the problem? The problem is that we are still in this goofy system of having x and y normalized to one, where the full frame would be one, and so we have to convert that to pixels. We should probably go ahead and do it here, where I do the append of LandMark.x and LandMark.y. I need to multiply the x by what? Width. And I need to multiply the y by what? Height. Now, if we did just this it's going to crash, because cv2.circle wants integers, so I need to make this an int, and then I need to make this an int, so that it's whole numbers in there. Then let's check our parentheses: this parenthesis closes the append, this parenthesis closes the tuple, and this parenthesis closes the int. So I have one, two, three open and I've got one, two, three closed. I think that should work, so let's try that. Okay, so it's waiting for a hand, and now I give it a hand. Boom, pinky detected! Who's your huckleberry? Look at that, the little pinky is detected. But now I want you to think about something: what if I put another hand in here? Hey, it did work. For each hand it finds the pinky. Look at that, that is pretty darn cool, isn't it? Okay, now I'll
show you one other thing that I feel like we should do with the data structure, because right now it's sort of like: analyzed a hand, drew the circle, analyzed the hand, drew the circle. I do feel like we need one data set that has all the data for both hands. I'm trying to make a little nugget of code that would generate a simple-to-understand data structure that has all the data for all the hands in one place. So what I would like is myHands, plural, and that would be an array of what? An array of hands: this would be the first hand and this would be the second hand. So we would have an array whose first array is the first hand and whose second array is the second hand. If you only had one hand it would look like that, and if you had no hands it would look like that. But let's say that's the first hand and that's the second hand. Inside of each you would have tuples: x0 comma y0, which would be that joint right inside your palm, then comma, another tuple, x1 comma y1, which would be the next joint, then comma, dot dot dot, and that was all for this hand. Then you would have the same thing for this hand over here. Ah, undo, Control-C. Okay, over here inside of this second array you would have another set of x, y points for your other hand. Now, how are we going to do that? After we create myHand here, so once the whole myHand thing is done, I'm going to say myHands, plural, dot append. And what do I want to append to it? myHand. So once I create that array for the hand, I append it to the hands, plural. And now I've got to set up myHands, so when you grab a frame you better
go ahead and make that an empty array, and so every time through, after it looks at the frame, it will give you this data structure. Okay, so the only thing I'm asking here is, is that myHand right? Yeah: I create myHand as I step through here, and then when I leave the inner for loop I can print myHand, singular, or I can append it to myHands, plural. So I'm going to take this print out, and now I'm going to just print myHands here. This time, if I show it both hands, you should get two arrays inside of there. That might be kind of hard to see, so I'll also print a blank line. Okay, so there it's saying nothing. Now I'm getting an array that has one array in it; if you look at the data scrolling by, you can see that there is one array inside of the array. And now with two hands you can see that I'm getting two arrays inside of that array, and both pinkies are detected. Man, is this just me or is this something just absolutely super cool? So with this you can see that you could get anything that you wanted. Like, I could come up here and say, let's do the whole darn pinky. So let's come over here, and what we would want is 17, 18, 19, 20.
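The myHands structure built up above, an array of hands where each hand is an array of pixel tuples, could be sketched roughly as follows. This is an illustration under stated assumptions rather than the code from the video: parse_hands and fake_hand are hypothetical names, the 640 by 360 frame size and all coordinates are made up, and the input is assumed to behave like MediaPipe's results.multi_hand_landmarks, which is None when no hand is found.

```python
from types import SimpleNamespace


def parse_hands(multi_hand_landmarks, width, height):
    """Return a list of hands; each hand is a list of (x, y) pixel tuples."""
    my_hands = []  # starts empty for every frame
    if multi_hand_landmarks is None:  # no hands detected in this frame
        return my_hands
    for hand_landmarks in multi_hand_landmarks:
        my_hand = []
        for landmark in hand_landmarks.landmark:
            # Scale the 0..1 values to pixels; int() because cv2.circle
            # only accepts integer coordinates.
            my_hand.append((int(landmark.x * width), int(landmark.y * height)))
        my_hands.append(my_hand)
    return my_hands


def fake_hand(x):
    """Hypothetical stand-in for one MediaPipe hand: 21 made-up landmarks."""
    return SimpleNamespace(
        landmark=[SimpleNamespace(x=x, y=i / 32) for i in range(21)]
    )


two_hands = parse_hands([fake_hand(0.25), fake_hand(0.75)], width=640, height=360)
no_hands = parse_hands(None, width=640, height=360)

print(len(two_hands), len(two_hands[0]))   # hands found, landmarks per hand
print(two_hands[0][20], two_hands[1][20])  # pinky tip of each hand, in pixels
print(no_hands)                            # empty list when nothing is detected
```

Returning an empty list for the no-hand case means code that loops over the result never has to special-case None.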
So what we're going to do is 17, 18, 19, and I already had 20. Let's make this take the 17th point, the 18th point, the 19th point and the 20th point. Now that is going to be focused on my pinky. Boom, look at that, look at that beautiful pinky! I think when we're doing that many, the circles are a little bit large. Okay, that's a lot better. And let's see if it'll do, yeah, we've got both hands working. All right guys. Oh, let's see, you are not seeing that, sorry. Let's come over here. Okay, there it is. Hopefully you were not in the dark for too long, but I'll come back and show you just in case you didn't see the code. I just took this cv2.circle, copied it, and pasted it, one, two, three, four, and then I say: take points 17, 18, 19 and 20 and draw circles on them, because those are the ones associated with my pinky. I apologize that you were in the dark there for just a minute, but that should be pretty clear. So now, all of a sudden, you think: hmm, I could calculate the distance between the two pinkies and do something based on that; I could affect things based on what my hands are doing. Or I could ask, what does a thumbs up look like with these points? How would you measure a thumbs up versus a thumbs down? Maybe you could walk through pictures and give each one a thumbs up or thumbs down as you're walking through, and then you could have a folder of winners and losers or something like that. You see, there's just all types of things that you can do now. All right, I could give you a cool homework like that, but I feel like the homework that I should give you is a little bit more practical. What we have done here, with these simple things, is we have busted apart that tough nut of results, and now we've got the actual data that we can use for any point that we're interested in.
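The two ideas mentioned here, picking out points 17 through 20 and measuring the distance between the two pinky tips, can be sketched on plain lists. This is only an illustration: pinky_points and pinky_tip_distance are hypothetical helper names, the pixel values are made up, and each hand is assumed to already be a list of 21 (x, y) pixel tuples as described in the lesson.

```python
import math

PINKY = (17, 18, 19, 20)  # pinky landmarks, ending at the tip (20)


def pinky_points(my_hand):
    """Return the four pinky landmarks of one hand."""
    return [my_hand[i] for i in PINKY]


def pinky_tip_distance(my_hands):
    """Pixel distance between the pinky tips of the first two hands, or None."""
    if len(my_hands) < 2:
        return None
    return math.dist(my_hands[0][20], my_hands[1][20])


# Made-up pixel data standing in for two parsed hands.
left = [(100 + i, 200) for i in range(21)]
right = [(400 + i, 600) for i in range(21)]
my_hands = [left, right]

print(pinky_points(left))
print(pinky_tip_distance(my_hands))
print(pinky_tip_distance([left]))  # None: only one hand in view

# With OpenCV available, each pinky point could be drawn in a loop instead of
# four pasted calls, e.g. cv2.circle(frame, point, 15, (255, 0, 255), -1);
# the radius 15 is a made-up value.
```

A loop over the index tuple does the same job as the four pasted cv2.circle calls and makes it easy to switch to another finger by changing the indices.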
But this is a little cumbersome, to start with this program every time, and it's going to get a little bit confusing. So what I'm going to suggest, and this is your homework for next time, is that you take this and create a class, or you could create a function, where you pass it the frame and what it returns to you is this simple data structure that I created. The data structure really should be like this: myHands, plural, the array of arrays of hands. It returns that to you, and once you have that you can go in and grab whichever point you want. You want to offload all of this parsing of the data to either a function or a class. I will probably do it with a class, but you can do it with a class or with a function. Get this to where we take all this and offload it to a helper, and then as we move forward we're always just going to use that helper, and we don't have to sit and stew over these data structures anymore. Does that sound reasonable, guys? I hope you're having as much fun with this as I am. I really thought that worked well last week, where we did the os.walk to walk through images; I thought that was really fun, and I think this is just really super fun. I start seeing all types of things that we can do using gestures, which I think would be a whole lot of fun. Okay, this has been a long lesson. How many of you guys have made it to the end of the lesson? Well, a couple of things. First of all, if you are successful on this homework, leave a comment down below: "I am legend." And if you're not, leave a comment: "I folded up like a cheap lawn chair." That's not the worst outcome. Trying and
failing is not nearly as bad as sitting around and watching silly cat videos. If you try and fail, that is not really a failure, because it's called learning. The bad thing is people who don't try. So don't be embarrassed if you don't get it to work, but let me know how many of you guys were able to get it to work. Plus you have a bonus, just by what? Making it to the end of this video. If you make it to the end of the video, you can leave the secret word for today, which is gargantuan birds. These birds, I encountered them unexpectedly this morning. These are monster birds: they're like three feet tall, their beak is like this long, and what they did was they got up on my front porch and broke seven windows in my house with those gargantuan noggins. They were using them like hammers, just going along on my veranda breaking the windows in the house. We finally got them chased off. They were kind of aggressive; we got the dogs going after them and they flew up in a tree. Then we went about our business, and they came back and started breaking more windows in the house. So we have a problem with these gargantuan birds, and I'm not sure what we're going to do about it, because they were just walking around knocking out windows like their head was a hammer, and I don't know why. So anyway, the secret word for today is gargantuan bird, and let us know if you were successful in the homework. Hey, if you guys enjoyed this lesson, make sure to give us a thumbs up; that helps us with YouTube. If you have not subscribed to the channel already, make sure to hit that subscribe button, and when you do, ring that bell so you will get notification of future lessons. Okay guys, this has been Paul McWhorter with toptechboy.com. I will talk to you guys later.
Info
Channel: Paul McWhorter
Views: 1,971
Id: xHK-wv2JG18
Length: 59min 36sec (3576 seconds)
Published: Tue Nov 02 2021