AI Invents New Bowling Techniques

Video Statistics and Information

Captions
Today we're going to do something a little different. In the last video we built the Spider-Man AI. It was agile, it was floppy, and it had a pretty decent amount of brain cells, but most importantly, it used this algorithm called PPO. This algorithm is beautiful, and it'd be a shame to only use it once on this channel, so I'm going to use it again. Hey, don't worry, I'm not going to explain it again; you can go watch the Spider-Man video if you want to see that. All we're really doing in this video is having some fun with it.

So picture this: you're in a bowling alley with an old fish tank and a questionable menu, and you're wondering how you're going to knock over all 90 of these pins. "You only have to knock over the 10 in the middle," says Soggy Waffles. Oh, that's easy. But then you remember: "I forgot how to walk, or do anything for that matter." Alright, this is a big problem, but luckily you have me now, so let's go through this together.

First up, let's check your body. So you're a ragdoll, and you have 12 joints and 13 bones. That's a little less than usual, but it should be fine; all the essentials are there. You also appear to have round feet. They look extremely painful, but maybe it helps. Alright, what about the measurements? Well, you're six feet tall and about 85 kilos, and all of your body parts have the correct weight, so theoretically you should flop like a normal person. Okay, what about your muscles? I can see you have an abnormal amount of neck strength, so let me fix that real quick. Other than that, everything looks good. Except it's not good, because that means the problem is mental, and as far as I know, Unity doesn't have a brain scan option. So there's really nothing to be done here except rebuild the coordination from the ground up. So let's get to it.

Since we're using a reinforcement learning algorithm, we're gonna have to define a reward function. This will give the AI incentive to behave the way we want it to. Now, I've done a bit of bowling in my life, and I've never managed to score points
without keeping the ball in the lane, so we should only give out rewards if the AI manages to keep the ball within a specific range. We could also punish it for throwing the ball outside of that range, but it's 2023 and someone might try and cancel me for that, so let's just not.

Another good idea would be to reward the AI for getting the ball to travel down the alley. Every frame, we could look at the speed of the ball going forward and give out a reward proportional to that. This will make the reward we get for a throw scale with the distance the ball travels forward, which is pretty good. An even better idea would be to add an exponent to the speed reward. This means that a ball that is thrown faster will get more total rewards than a slower one. It also makes the negative rewards, or punishments, imaginary, which may just save my ass in the future. And just for good measure, let's give the AI a reward proportional to its head's y coordinate, to encourage it to stay as high as possible. The four components we just described should be sufficient enough to guide the AI towards hitting the pins.

So what now? Well, we still have to define the interface for the AI, so let's do that. Basically, for every joint we will tell the AI its position, velocity, angular velocity, and the angle it's pointing at, and we will give it control over the angle it tries to point towards. There may also be more than one degree of freedom, so in some cases there will be two or three angles to control. And just for good measure, we'll even let the AI decide when it releases the ball, because that'll probably help. That about sums it up for the interface and reward system. We've now completed about five percent of our job, and it's now time to do the other 95: waiting for something to happen. So let's kick back for a minute or two and see what this AI comes up with.

[Music]

Interesting results there. In its first training session it basically just prioritized standing up rather than actually bowling, so in
other words, a complete failure, and I was forced to restart it. In the second session it actually managed to make decent progress and knock down a few pins. You can see it's learned how to cast a spell like a witch to get the ball to go straight. I was actually going to try this technique in real life, but apparently the ritual requires the victim to have a seizure, so I was once again forced to restart training. In the final session we can see another truly amazing technique: using the elasticity in your spine and legs to launch the ball three lanes to your left while you rearrange your face on the wooden floor. This technique is absolutely brilliant, and I'm surprised the pros aren't already using it.

So in summary, Two-Step, Jazz Hands, and Rubber Band didn't really have much to offer here. While they were all able to increase the rewards they received, none of them learned how to bowl consistently. They each got stuck in something called local optima: rather than maximizing the overall objective of bowling fast and straight, each one learned how to maximize a single characteristic of the reward function. This is what makes reinforcement learning so tricky. The more complex the objective gets, the more likely the AI is to settle on something sub-optimal.

To solve this, we need to make some tweaks to our original reward function. First, we'll reduce the reward for staying upright. This should discourage another Two-Step from emerging. Next, we will need to punish the ball for moving horizontally. This should help the AI achieve greater accuracy by guiding it towards a straight throw more aggressively. Finally, we will need to put a cap on the exponential speed reward. This term can grow out of control really quickly, and Rubber Band was abusing this by simply flinging the ball as high as it could regardless of accuracy. These three tweaks should enable the AI to move towards a much better solution to our problem. So what comes next? Well, you've guessed it: we wait. See you in a bit.

[Music]
The training session was quite short, but it was very effective. The new reward system worked like a charm, and it has produced an AI that not only bowls straight but is capable of getting strikes too. Self-preservation is still a pipe dream, but hey, this is about bowling, not injury prevention.

Now, I could easily end this video here. You know: like, subscribe, see you next time, bye, my only fans. But I'm going to do a little extra. You see, while the bowler is quite efficient, it has no knowledge of the pins, so it can't actually aim. On top of that, real bowlers have control over other stuff such as spin, so we'll need to add these things in. This is problematic. We just trained a neural network to bowl straight, and now we need more inputs and outputs, so we can't even reuse all that work we just did. This sucks.

So instead of doing that, I'm just going to wing it and perform open brain surgery. In theory, I could stick the extra input and output neurons into the network and just stitch in all the extra weights. This will alter the behavior of the network, but if I make the new weights weak enough, then the original signal should stay intact. Then, with a little retraining, the AI should be able to incorporate the new inputs and outputs and learn how to bowl better. And just for good measure, we can give out extra rewards for knocking the pins over, because I definitely didn't forget to include that until now. Now, that all sounds good in theory, but does it work? Well, there's only one way to find out.

[Music]
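
The reward terms described in the captions (lane gating, an exponentiated forward-speed term with a cap, a reduced upright bonus, a sideways penalty, and a later bonus for pins) could be combined roughly as in the Python sketch below. This is only an illustration, not the code used in the video: the names `RewardConfig` and `frame_reward`, every numeric value, the choice to clamp backward speed to zero, and the choice to gate only the speed term by the lane check are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class RewardConfig:
    # Every value here is an illustrative guess, not a number from the video.
    lane_half_width: float = 0.5   # how far from the lane centre still counts as "in range"
    speed_exponent: float = 2.0    # exponent applied to forward ball speed
    speed_cap: float = 10.0        # upper bound on the speed term (the "cap" tweak)
    upright_weight: float = 0.1    # reduced weight on the head's y coordinate
    sideways_penalty: float = 1.0  # weight on horizontal ball speed
    pin_bonus: float = 1.0         # reward per pin knocked over this frame


def frame_reward(ball_x, forward_speed, sideways_speed, head_y,
                 pins_knocked_this_frame, cfg):
    """Return a hypothetical per-frame reward built from the described terms."""
    # Small bonus for keeping the head high.
    reward = cfg.upright_weight * head_y

    # Speed reward is only paid while the ball stays inside the lane range.
    if abs(ball_x) <= cfg.lane_half_width:
        # Assumption: backward motion earns nothing instead of a penalty.
        forward = max(forward_speed, 0.0)
        reward += min(forward ** cfg.speed_exponent, cfg.speed_cap)

    # Penalise sideways ball motion to push towards straight throws.
    reward -= cfg.sideways_penalty * abs(sideways_speed)

    # Extra reward for pins, added late in the video.
    reward += cfg.pin_bonus * pins_knocked_this_frame
    return reward
```

With the default values, a ball rolling straight down the lane at a forward speed of 2 with the head at height 1.5 earns about 4.15 per frame, while the same pose with the ball outside the lane range earns only the 0.15 upright bonus.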
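
The "open brain surgery" idea from the captions, adding input and output neurons with weak new weights so the trained behaviour survives, can be sketched in plain Python as follows. This is a toy under stated assumptions rather than the video's implementation: the network here is a small fully connected net with tanh on every layer, and the helper names `forward` and `widen`, the uniform initialisation, and the default `scale` are all hypothetical.

```python
import math
import random


def forward(weights, biases, inputs):
    """Run a tiny fully connected network with tanh on every layer."""
    activations = list(inputs)
    for layer_w, layer_b in zip(weights, biases):
        activations = [
            math.tanh(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(layer_w, layer_b)
        ]
    return activations


def widen(weights, biases, extra_inputs, extra_outputs, scale=1e-3, seed=0):
    """Return copies of the parameters with extra input and output neurons.

    New weights are drawn uniformly from [-scale, scale], so with a small
    scale the original outputs are only slightly perturbed.
    """
    rng = random.Random(seed)
    new_w = [[row[:] for row in layer] for layer in weights]
    new_b = [layer[:] for layer in biases]

    # Extra inputs: append one weak column per new input to the first layer.
    for row in new_w[0]:
        row.extend(rng.uniform(-scale, scale) for _ in range(extra_inputs))

    # Extra outputs: append one weak row per new output to the last layer.
    fan_in = len(new_w[-1][0])
    for _ in range(extra_outputs):
        new_w[-1].append([rng.uniform(-scale, scale) for _ in range(fan_in)])
        new_b[-1].append(0.0)
    return new_w, new_b
```

Feeding the widened network the old observations plus any values for the new inputs should give nearly the same values on the original outputs, which is the property the captions rely on before retraining.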
Info
Channel: b2studios
Views: 3,280,855
Keywords: wii sports, wii sports ai, ten pin bowling, ten pin bowling ai, ai bowling, ai ten pin bowling, bowling ai, reinforcement learning, bolwing ai, bowling aoi, b2studios bowling, b2ustiods, b2suitiods
Id: EWjUY_3ubf4
Length: 11min 33sec (693 seconds)
Published: Thu May 11 2023