Making a Following A.I. Drone so I can Vlog Hands-free

Video Statistics and Information

Captions
Yo guys, it's that time again. Now, if you remember, not too long ago IBM sent me a drone and challenged me to put AI into it with a deadline, and I was unable to complete it. Now, we may have lost the battle, but we didn't lose the war, cadets. [inaudible] So I promised you guys a part two, and this week I'm gonna deliver that to you. And get hyped, because I've been thinking of an exciting new approach to try that I think is gonna be miles better than the last implementation.

With the last implementation, we trained an image classifier to detect when I'm too far away, too close, too far left or right, top, bottom, et cetera, which admittedly was a foolish way to try and train an AI on how to move. Well, at least with the uniform data that I was feeding it for training. I honestly believe that this approach could be good practice for a production-ready solution, but it would have to be a lot more involved: maybe using machine learning libraries like PoseNet or YOLO for human and object detection, embedded recurrent neural networks to be able to predict motion behaviors, facial detection for framing, and the whole works. All around, a more involved algorithm. But unfortunately I didn't have much time in the last go-round to think about, let alone involve, all of this, and I'm just not interested in spending the next few months building such a solution, especially given how many of you guys said that this already exists.

So we're gonna implement something a whole lot simpler, but it should be enough to get the job done. What is this mysterious technique, you're curious? Well, instead of training an image classifier and just hoping that it makes sense of the pixels on what to do, we're gonna use a facial recognition algorithm that gives us a bounding box where the face is, then program our drone to keep this box as centered as possible while at a certain distance. And I'm feeling pretty good about this solution. All through Christmas break I couldn't get my mind off it.
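As a rough illustration of the "get a bounding box where the face is" step, here is a minimal sketch. It assumes OpenCV's bundled frontal-face Haar cascade (OpenCV is the library the video turns to next); the function names `detect_faces` and `largest_face`, the `scaleFactor` and `minNeighbors` values, and the idea of following the largest box are illustrative assumptions, not taken from the video's code.

```python
def detect_faces(frame_bgr):
    """Return a list of (x, y, w, h) face boxes found in a BGR image."""
    # Imported lazily so the helper below can be used without OpenCV installed.
    import cv2

    # Assumption: the stock frontal-face cascade that ships with opencv-python.
    # Loading it on every call is wasteful; a real loop would load it once.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    boxes = cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
    return [tuple(int(v) for v in box) for box in boxes]


def largest_face(boxes):
    """Pick the box with the biggest area, or None when nothing was detected."""
    if not boxes:
        return None
    return max(boxes, key=lambda b: b[2] * b[3])
```

Choosing the largest box is just one way to settle on a single face when the detector returns several (or returns false positives); the video does not say how it handles that case.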
So, without delaying any longer, let us begin. Thanks to part one, we already got the Tello to output the video stream, so the next step is to, um... Well, when trying to complete any project, I think it's very important to keep your goal in mind so you don't get lost or confused along the way, especially when it comes to machine learning projects. So let's make what I like to call a ladder, in which we start with the destination and work backwards. Here is what I came up with.

Ultimately, what we want to do is launch the Tello and have it track any face. But in order to do that, we have to hook up some functionality to the Tello output controls. But before that, we need to create a function that maps bounding boxes to the center of the screen. But before that, we need to simply get a facial bounding box returned from the camera feed. And that's all there is to this ladder: a pretty short and hopefully easy-to-climb ladder. Now that we have it, let's start climbing.

First things first, we need to implement the facial recognition algorithm that returns to us a bounding box of the faces that it detects. We'll be using OpenCV for that, because this is a library that has facial recognition built in, and I mean, you can't beat a sampler platter, am I right? So I plugged in OpenCV's facial recognition algorithm with only a few lines and got it returning a bounding box of my face. Perfect. I mean, well, it's not perfect. Sometimes it marks random objects as faces when they clearly aren't, it pretty much doesn't even work with profile shots, and sometimes it can't even detect mug shots. I don't know if it's because the tech just isn't up to snuff yet, or if the algorithm has a bias for lighter-skinned folks or not. Listen, either way, it works good enough for me to not want to search for an alternative, so we're gonna roll with it.

And just in case you're wondering how this stuff works on the lower level: with OpenCV you can use what's called a cascade classifier, which you can train
using a few hundred samples of a particular object, so, like, say, a face or a car, what have you. Then once it's trained, you can apply it to images to identify what it was trained on. Which opens up a new option for us: we could train our own cascade to get more control over the algorithm. But ain't nobody got time for that.

Now that we're getting our bounding box returned from the camera feed, next we create our function that will map to our Tello output, which is gonna be a pretty hands-on process. But first things first: we need to get the center of the screen and the center of the bounding box so we can use them for the Tello output. Getting the center of the screen is really simple algebra. We just need to take the dimensions of the video resolution and divide the height and width in half, and boom, center of attention.

Now, getting the center of the bounding box is also a relatively simple equation. So we have this simple 10 by 10 image, and these are vertices 1, 2, 3, and 4. In order to get the horizontal center, we simply need to get the average of either vertex 1 plus 2 or 3 plus 4, because they are both connected vertices along the X, or horizontal, axis, meaning they share the same X values; and vertex 1 plus 3 or 2 plus 4 for the vertical center, because they're both connected vertices along the Y, or vertical, axis, meaning they share the same Y values. That gives us the x and y coordinates of the center of the bounding box. And for an even quicker shortcut, you can just take the average of one diagonal corner with the other, and that will quickly get you the midpoint between them. Ba-da-ba.

Now that we have the center-of-screen vector, which is a constant, meaning it'll always be the same no matter what, and we have our center-of-bounding-box vector, we can use all this data to calculate a new vector: our distance vector. But OK, wait. Before we do this, I should probably explain vectors a bit, otherwise it might get a bit confusing from here. Though I promise it's a really
simple concept, so let's go through this thoroughly. A vector is just a container of data. Given the fact you're watching my video, you probably interact with vectors every single day. For example, this video's resolution is a vector of 1920 pixels by 1080 pixels. So in today's case, it's just a container of a couple of numbers. A vector by itself has no real value; it is up to us as developers to give it its value. And what we want to do is turn these seemingly valueless vectors into spatial coordinates to use in our actual real-world space.

So we have our center-of-screen constant vector, which is 480 by 360, half of the Tello's resolution, and our bounding box vector, which is variable, so we'll just say it's tx by ty. Now let's make our lives a lot easier and rename these vectors to reflect their purpose. We will now call the center-of-screen vector the true vector, because it's the center of the screen, and that is the true location of the drone as far as we know. And the bounding box vector will be called the target vector, because the bounding box is the target that we want the true vector to meet. OK, all pretty simple, right?

Next we need to calculate a new vector that we'll call the distance vector, which will simply be the distance from target to true. We simply subtract the target vector from our true vector, and that gives us, well, the distance between the two.

Hey guys, I made a little oopsy-daisy on this part, and I just want to clarify for all you guys interested in the math. This equation will still work the exact same so long as we understand what it's outputting, and you'll see that a bit later. But I meant to design the equation to be target vector minus true vector, because that will return a vector with the actual distance and direction. Currently it only returns the accurate distance and the inverse direction. Not the biggest deal, but let's get on with the video.

And in order to center the face on screen, the distance vector has to be as close as
possible to a (0, 0) vector. Give that a test and... ah, OK, OK. Now we have a slight problem. You see, our drone isn't locked in 2D space. It's not stuck in one spot like this webcam we tested it on. This thing can fly in 3D space, and so we have to figure out some way to give it a z-axis. Hmm. I think I got it.

Lucky for us, this default mugshot face cascade that we are using does detect faces of different sizes, so we simply need to get a better understanding of how that works, then use it to our advantage. After a little experimenting, it appears that it returns bounding boxes with a few different fixed sizes, so we just have to pick which bounding box size is the distance that we want to target, which turned out to be 304. I don't know, it just looked like the right framing. Now that we have that, we can add it as the Z value to our target vector. Then, to get the Z value for the true vector, we just need to take the area of the bounding box, which is just the length of any side times itself, or squared, and that will give us the Z value for the true vector. So now if we test it again, telling our distance vector to try to be a (0, 0, 0) vector... boom goes the dynamite. OK, sweet. So far so good.

But before we turn over controls to the AI, we have to do something a bit more tedious. You see, I originally downloaded this Tello script from the developer Damià Fuentes's GitHub, and the library that he used to display the Tello video feed was Pygame. But I don't really know how to use Pygame yet, and by looking at it, it looks a lot more complicated than it needs to be right now. So we need to convert this entire script to use OpenCV instead. Which means first getting the Tello to display the camera feed using OpenCV. Check. Ignore the blue coloration, though. Then we need to add just a simple face detection algorithm and make sure everything still works and is good and all that stuff. Check. Now we can add the full algorithm that we wrote using the webcam. Check. And we're done.
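The vector arithmetic described so far (screen center, bounding-box center, and a distance vector with a z component for size) can be sketched roughly as follows. This is a sketch under stated assumptions, not the video's actual code: `Vec3`, `box_center`, and `distance_vector` are hypothetical names, and the 960 x 720 frame size is inferred from the "480 by 360, half of the Tello's resolution" remark. Two deliberate deviations from the narration are also assumptions: the box width is used as the size measure instead of the area, so it is in the same units as the desired size of 304, and the desired size sits on the true vector so that a positive z means the face is too small, which matches the sign rules given for the controls.

```python
from typing import NamedTuple


class Vec3(NamedTuple):
    x: float
    y: float
    z: float

    def __sub__(self, other):
        # Component-wise subtraction: self - other.
        return Vec3(self.x - other.x, self.y - other.y, self.z - other.z)


FRAME_W, FRAME_H = 960, 720  # assumed frame size; half of it is 480 x 360
DESIRED_FACE_SIZE = 304      # box size the video picked for the framing


def box_center(x, y, w, h):
    """Midpoint of two diagonal corners: (x, y) and (x + w, y + h)."""
    return ((x + (x + w)) / 2, (y + (y + h)) / 2)


def distance_vector(box):
    """True vector minus target vector, as in the video's equation."""
    x, y, w, h = box
    cx, cy = box_center(x, y, w, h)
    v_true = Vec3(FRAME_W / 2, FRAME_H / 2, DESIRED_FACE_SIZE)
    v_target = Vec3(cx, cy, w)  # assumption: width stands in for face size
    return v_true - v_target
```

With these conventions, a face box that is centered and 304 pixels wide gives a zero vector, while a small box in the upper left of the frame gives positive x, y, and z components.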
I know it didn't seem tedious, but that's only because with editing I can skip it. You get it. And now, gents, boys and girls, bags and baguettes, it is time. It is time to hook up the controls.

So let's slow down. Right off the bat, what do we know? Well, we designed it so that vDistance.x (that's the distance vector's x value; got it? Got it. OK, good.) will return a positive number if vTarget.x is left of vTrue.x, and a negative one if it's right of vTrue.x. Now again, remember, we're trying to get our vDistance to a vector of (0, 0, 0). So hooking up these controls to turn right if vDistance.x is negative and to turn left if it's positive will allow the Tello to try and center the face on the x-axis.

The other two axes are pretty much the same. vDistance.y will return a positive number if vTarget.y is above vTrue.y, so fly up to bring vDistance.y to 0, and vice versa if under. And vDistance.z will return a positive number if the bounding box is smaller than we specified, so fly forward to make the bounding box bigger, or to bring vDistance.z to 0, both the same, and vice versa if the bounding box is too big.

I quickly hooked up these controls, then excitedly took it for a test. Check this out. [Music] Oh. Oh. Whoops. Uh-huh. It turns out I hooked up the wrong controls for when it sees a face. I meant to tell it to pan left if the face is left, but instead told it to fly left. Take two, check this out. Stop, stop, stop, 'cause it's gonna spin out forever. My face, look, I'm right here. I'm right here. OK, OK.

So the drone kept rotating forever, and I couldn't figure out why. But after a good 10 to 15 minutes, I figured out that I hooked up the controls to vTrue instead of the distance vector that we worked so hard to make. Excitement clouds the logic of even the best of developers. So I fixed that stuff, and take three. Check this out. Oh, oh yes! Look at it, it's keeping me centered! I mean, it's a little weird, but it's working. Ladies and gents, boys and girls, here we
go. I will be able to stand up on this flight. Watch this, watch this. Behold, behold. I'm over here, I'm over here. Look at me, look at me. No. OK, all right, OK. Can I stand up? Are you that flexible? [inaudible] You can stand with me. Oh, let me go. Come on. Oh, hold on, hold on. OK, all right, all right, look at this. We're off to a really good start. I can sit back down, I can sit back down. Come on, come on, come on. Oh yeah, oh yeah.

And then I finally decided to hook up the z-axis. Come on, come on, come on, work that magic. Where you going? Where you going? I'm over here, over here. Don't run into this, don't run into this, don't run into... OK, OK, OK, that's close enough. Back up, back up, back up, back up. Oh no, oh no. Land! Land! OK, that was scary. Hey, and they say Skynet isn't going to happen. I hand controls over to the little bugger, and the first thing he wants to test out is attacking its master. Tello, I want you to [inaudible] at once. [Music]

Anywho, I wanted to fix some of the bugs I was facing. For example, I didn't give the drone enough of a safety zone, so if vDistance wasn't exactly on (0, 0, 0), which is impossible for a drone to stay on with our current technology, it would never stop trying to align itself, which is why you would see the swinging motion. Then I forgot to make the safe zone encompass both negative and positive, which is what gave it this weird behavior. Excitement clouds the logic of even the best of developers. And with that, we were able to get these results.

Yo guys, I am happy to report that we have done it. Check this out. I gotta make sure it doesn't lose my face. Don't fall, don't fall. Look at me, come on. All right, there it is. It's got my face tracked. Look at that precision, look at that precision. It is directly on my face. I can stand up, I can stand up. Come with me, come on, come on. There you go, there you go, there you go. Look at that precision. I can walk over here, I can walk over here. Look at that, look at that, look at the
streamers, look at it, look at this. Follow me. Oh my goodness gracious. Come on, come on. Oh my goodness. You see me? Do you see me? Look at it, look at it. This is autonomy right here. Come on, come on. Oh ho, my gosh, would you look at that, would you look at that. Guys, gals, and goofs, we've done did it. Give yourself a slap on the forehead.

I decided to add an offset to vTarget.y so that we can get better framing without so much headroom. It was just too much. Then I decided to test it out in the long hallway. [Music] Then a visitor stopped by, and I decided to use him as a guinea pig. So it's a bit sensitive. It can only detect the front of your face, but, like, if you walk back, it'll start to walk with you. [Music] And if you walk forward, it'll do the same. [Music] Huh. [Music] And then, like, if you duck down, it should still... duck down a little slower... it loses... oh, it still got you. Well, it worked out pretty good. Big shout-out to Brandon. And yeah, it definitely works better on lighter-skinned faces.

Now, at this point I was incredibly happy. I mean, we climbed all the way to the top of our ladder and succeeded. But there was this deep itch that wanted to try this out in the real world, despite all the things that could go wrong. And seeing as I just needed a computer to connect to the drone, I got everything working on my laptop, coded in some failsafes, and then said, what the hell. What good is a robot cameraman if you can never use him?

Track, track, track, track my face, little drone. Track my face, little drone. Over here, little drone. [Music] [Music]

All right guys, I hope that you guys enjoyed this week's episode. I don't know about you, but I think this war has been won. Here's a
salute to all you cadets out there, and your Medal of Honor. The link to the open source is in the description. Don't forget to buy a Tello drone, though.

And real quick: when I told IBM I was gonna do a part two, they asked me if I would kindly promote their next initiative, and seeing as this video wouldn't be possible without them, I graciously agreed. But heads up, this is not a sponsored video. It's more of a favor for some folks that I like. But I looked into their next initiative, and it turns out to be something really cool for you as well. IBM is calling all developers to join their 2019 Call for Code challenge to compete to win $200,000 if you have the winning solution. From free drones to $200,000, IBM sure knows how to party. One example for the challenge is from Puerto Rican developer Pedro Cruz, who is working on a disaster relief drone system powered by machine learning. The drones read symbols from the skies and relay important information back to relief workers. I personally think it's a really cool and inspiring video, so a link to that is in the description. But if you want more details or to enter the challenge yourself, you can register by heading to IBM dot biz slash car 2019 (that's all caps, and it's case sensitive). My best wishes to all you guys who enter, because it sounds like a lot of fun. If I have the time... and I mean, $200,000 is some serious coin.

But other than that, it's time to go. Please do me a huge favor: subscribe to my channel if you haven't already, and hit that bell icon to be notified when I upload the next video. Also, you can follow me on Twitter and Instagram to see, like, behind the scenes and whatnot. It's a fun time, I promise. And lastly, if you're interested in supporting my work beyond YouTube, please consider becoming a patron on Patreon. For patrons, I just launched this little Patreon project called Haxbot Garden, in which all Patreon supporters create and customize little haxbots for me to add to my garden. So if you're already a Patreon supporter,
please check out the link in the description so that you can create and send me your haxbot. Please, I'm trying to fill this garden. And shout-out to these top haxbot patrons for your support. All right, I'm all done here. I hope to see you all next week, but whatever the case may be, remember to always feed your curiosity. [Music] [Music]
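Looking back at the control rules and the safe-zone fix described in the video, the mapping from distance vector to movement can be sketched like this. It is only a sketch: the names `axis_command` and `controls`, the safe-zone sizes, the speed value, and the convention that positive yaw means "turn right" are all assumptions for illustration, and no real drone library is called.

```python
def axis_command(distance, safe_zone, speed):
    """Return 0 inside the safe zone, otherwise +speed or -speed."""
    # abs() makes the safe zone cover both the negative and positive side,
    # which is the bug the video mentions having to fix.
    if abs(distance) <= safe_zone:
        return 0
    return speed if distance > 0 else -speed


def controls(v_distance, safe_zone=(100, 55, 20), speed=30):
    """Map a (dx, dy, dz) distance vector to hypothetical drone commands."""
    dx, dy, dz = v_distance
    return {
        # Positive dx means the face is left of center, so turn left.
        # Assumption: negative yaw turns left, positive yaw turns right.
        "yaw": -axis_command(dx, safe_zone[0], speed),
        # Positive dy means the face is above center, so fly up.
        "up_down": axis_command(dy, safe_zone[1], speed),
        # Positive dz means the face box is too small, so fly forward.
        "forward_back": axis_command(dz, safe_zone[2], speed),
    }
```

Without the safe zone, any tiny non-zero distance would trigger a correction on every frame, which would produce the kind of back-and-forth swinging the video describes.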
Info
Channel: Jabrils
Views: 321,784
Keywords: Jabrils, f(x), computer science, software developer, software engineering, AI, artificial intelligence, AI drone, AI in drone, AI in drones, AI robot, dji tello, watson AI, ibm watson, robot overlords, drone AI, drone swarm, michael reeves, haha michael if you see this, super f(x)
Id: esw88_gKOpA
Length: 21min 43sec (1303 seconds)
Published: Wed Feb 20 2019