AI Learns to DESTROY old CPUs | Mario Kart Wii

Video Statistics and Information

In this video, we're going to be teaching our very own AI to play Mario Kart Wii. But to really give our AI a challenge, it's not just going to be racing alone; instead, we're going to be putting it up against the game's very own CPUs for a true showdown of AI power. Today's arena for this battle is going to be the track Ghost Valley 2, which I definitely chose at random and not just because it happens to be one of my favorite tracks.

This AI is going to be completely self-taught. That means it's going to start off knowing absolutely nothing about Mario Kart, the track, or even the concept of driving. We're just going to let the AI play thousands and thousands of games, and we're going to hope that eventually it's going to learn how to drive.

To give the AI some guidance during this learning, however, we're going to give our AI some rewards as it plays the game, so every time it does something good we can give it a reward to encourage that behavior. On screen now you can see me playing Mario Kart, and on the right you can see the rewards I get for driving around. The rewards are calculated with a mixture of how fast I'm going and how far through the track I am, encouraging me to finish the race as fast as possible.

So that's how rewards are given, but how does an AI actually use these rewards to guide its behavior? Well, what the AI is really trying to learn is, from any given position in the game, how much reward it can receive by taking different actions from that position. For example, in this position, turning left or going straight is likely to give us very little reward because we would likely crash, meaning we wouldn't be going very quickly and we wouldn't make it very far around the track. On the other hand, if we turned right, we're likely to keep our speed higher and get further, so we'd get a lot more reward. When our AI is playing thousands of games, all it's really trying to do is get better and better at predicting how much reward it's going to get by using the different actions in different positions, so it can choose the action that will give it the most reward.

While the AI is training, you'll be able to see this little bar chart. This shows, for each of the actions the AI can take, what its current reward prediction is for that action. In this case I gave the AI just five different actions: hard left, soft left, wheelieing going straight forward, soft right, and hard right. You'll probably notice that the action predictions move around a lot while the AI is playing. This is because for every different position the AI is in, it's going to have a different action prediction, and it's going to think it can get different amounts of reward for each of the different actions. But also, as the AI is constantly learning, it's constantly updating what rewards it thinks it's going to get, so that makes them move around even more.

So now that we've got that out of the way, let's let our AI do some training, and we'll check in every now and then to see how it's getting on and how much reward it thinks it's going to be getting.

After 30 minutes of training, the AI hasn't learned a whole lot yet but is occasionally able to make it around the first corner. From the reward predictions we can see that the AI has begun to learn when it's about to die, as you see the predictions drastically drop. But although it's figured this out, it hasn't really figured out how to avoid the dying part yet.

Skipping forward a while, by four hours of training the AI is beginning to show glimmers of hope, but it's still frequently just dying right at the start. Although it's figured out that it needs to turn right, it still seems to have trouble timing its drift and also seems to get bashed around by the CPUs quite a bit, so our AI isn't quite proving its superiority just yet.

After eight hours, the AI is getting more confident and is starting to get about halfway around the map pretty consistently. At this point, using the ramp to do the shortcut is pretty challenging, with the AI messing it up about every which way you can imagine. But since the shortcut saves so much time on this track, the AI keeps trying it, since it gives a very large bonus to its reward if it's successful.

[Music]

At 12 hours, the AI is looking much more confident, with its reward predictions at the beginning of the track being easily twice what they were when it first started training. The AI is now rarely dying at the beginning and is usually just struggling with the second half. For a while, the AI also went through a short period of actually not taking the shortcut, because it was so much more confident with the second half of the track. Even though it was still struggling a little bit, it just valued the security of staying alive, since it knew it could get a lot of reward just by attempting the second half. With all of this put together, the AI was eventually, finally, able to complete its first lap.

[Music]

After 16 hours, the AI starts finishing laps pretty consistently, and the only thing preventing it from finishing an entire race is just becoming more consistent and patching up some rarer gaps in its knowledge. Now that the AI completes laps, though, we can really see it starting to actually optimize its driving, not just caring about staying alive but trying to go faster, such as taking corners tighter and getting mini-turbos for that extra reward.

[Music]

After a full day of training, it's really easy to see how much better the AI has gotten, and it's even able to complete its first race, taking home the first-place position by an absolute landslide. Aside from finishing the race, though, I was particularly happy to see that the AI actually started wheelieing. Before this point, on the straight sections of the map, the AI just kind of jumped around instead of wheelieing, which loses a lot of speed and time. But it appears that after about a day of training it has fixed that problem and is really able to get some extra speed.

[Music]

So here we are at 48 hours of training. Now, I don't usually leave the AIs running for quite this long, but I had a particularly busy week and I kind of just forgot to turn it off, so here we are. In fact, I actually left it running for a total of 80 hours, which is really overkill, to say the least. But by 48 hours the AI was pretty damn good, and it was definitely proving its superiority over the Mario Kart Wii CPUs.

On screen now you can see the average reward that the AI received throughout the 80 hours of training. Things started off with pretty rapid improvement, but towards the end of training things did begin to plateau a little bit. Anyway, from here on out I'm going to show some of the best clips as the AI headed towards the full 80 hours of training.

[Music]

Thank you all so much for watching. I hope this video made your day just a little bit better, and be sure to check out some of my other videos if you liked this one. I've done lots of other stuff, like teaching an AI to play laser hockey and Super Mario Bros, so be sure to check those out. But anyway, I hope to see you in the next one.
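
The reward and action-prediction ideas described in the transcript can be sketched roughly as below. This is only an illustration under stated assumptions, not the code used in the video: the video does not name its algorithm, so the update rule here is a simple Q-learning-style step chosen for illustration, and the function names, weights, learning rate, discount, and epsilon exploration are all hypothetical. Only the five action names and the "mix of speed and track progress" reward come from the transcript.

```python
import random

# The five actions named in the transcript.
ACTIONS = ["hard_left", "soft_left", "wheelie_straight", "soft_right", "hard_right"]


def step_reward(speed, progress_delta, speed_weight=0.5, progress_weight=0.5):
    """Hypothetical reward: a weighted mix of speed and track progress.

    Both inputs are assumed to be normalised to [0, 1]. The weights are
    placeholders, not values taken from the video.
    """
    return speed_weight * speed + progress_weight * progress_delta


def choose_action(predictions, epsilon=0.0, rng=random):
    """Pick the action with the highest predicted reward.

    With probability epsilon a random action is taken instead. This
    exploration scheme is an assumption; the video does not describe one.
    """
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda action: predictions[action])


def update_prediction(predictions, action, reward, next_predictions,
                      learning_rate=0.1, discount=0.99):
    """Nudge one prediction toward reward + discounted best next prediction.

    This is a tabular Q-learning-style update, used here only to show why
    the bar chart values keep moving while the AI learns.
    """
    target = reward + discount * max(next_predictions.values())
    predictions[action] += learning_rate * (target - predictions[action])
    return predictions[action]


if __name__ == "__main__":
    # Made-up predictions for the corner example: left and straight look
    # bad (likely crash), right looks good.
    corner = {"hard_left": 0.1, "soft_left": 0.2, "wheelie_straight": 0.15,
              "soft_right": 0.6, "hard_right": 0.9}
    print(choose_action(corner))
```

In the corner example from the transcript, the left and straight actions carry low predictions and the right turns carry high ones, so the greedy choice is a right turn. A real agent for a game like this would replace the dictionary with a neural network that maps the game state to the five predictions.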
Channel: AI Tango
Views: 1,356,322
Keywords: mario kart, AI, artificial intelligence, Deep Learning, Reinforcement Learning, AI Learns To, AI Plays
Id: VIwGxOdXGfw
Length: 9min 53sec (593 seconds)
Published: Sat Aug 19 2023