So Someone's Teaching An AI How To Nuzlocke...

Video Statistics and Information

Captions
[Music] All right, whose Nuzlocke am I tearing to shreds today? The answer is actually nobody's. This footage was of an AI attempting to play through an Emerald Kaizo hardcore Nuzlocke. It's a project created by KP Labs, a Pokémon enthusiast who was getting interested in AI and decided to take this on as his first AI-related project. Obviously this thing isn't quite ready to take my job yet, but in this run it won 12 trainer battles, and that's something. The trainer battles in Emerald Kaizo, one of the hardest ROM hacks out there, are designed to give you trouble from the very start; even some of the best runners in the world have lost runs before the fight that took out the AI here.

So what actually is this AI, and how does it work? Currently the AI is just a battle simulator; team building is a whole other can of worms. For now, random encounters for each attempt are simulated automatically by the software, and save states are loaded before each mandatory battle of the ROM hack. Once in the battle, that's where the AI actually goes to work. Eventually KP Labs wants to integrate more advanced team building, and there's a world where overworld movement can be incorporated as well. In late 2023, Peter Whidden showed off an AI that was capable of playing Pokémon Red up through the Mt. Moon cave, but not before it spent a lot of time bumping into walls and walking around in circles. For now, because Nuzlocke rules add so much complexity, and because even the easier battles in Emerald Kaizo are leagues harder than Pokémon Red's, KP Labs is keeping the AI focused on battling.

"AI" can mean a lot of different things, and this is not the kind that's been screwing up your Google search results recently, so let's be a little more specific about what KP Labs is actually doing here. AIs have been playing games for a long time now. IBM's chess-playing AI Deep Blue became a global sensation after winning a match against the world champion Garry Kasparov in 1997. Deep Blue didn't really learn, though, and it didn't make up new and brilliant moves; it's more that it had a really big memory and incredible processor speed, allowing it to take databases full of games and positions and look through them fast enough to pull up the best known move for any given situation. Experts call this a table lookup approach.

That approach isn't going to work for a Nuzlocke, for one huge reason: randomness. Any given Pokémon turn is already loaded with possible choices that create branching paths. The player can pick one of four moves or switch to any of five teammates, but within those four moves are accuracy checks, critical hit checks, damage rolls, status checks, item checks... it gets pretty crazy, and that's before even acknowledging individual Pokémon variations like IVs and abilities. The Third Build, a computer scientist who tackled this problem with competitive Pokémon battles in 2022, did the calculation: turn one alone is more complicated than a full game of chess by itself.
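To get a feel for how fast that branching blows up, here's a back-of-the-envelope sketch. The numbers are deliberately simplified (Gen 3 has more RNG layers than this, and switches don't roll damage, so treating every choice identically overcounts a little); it's an illustration, not The Third Build's actual calculation:

```python
# Rough count of the outcome branches a single Pokémon turn can produce.
moves_per_side = 4        # each player picks one of four moves...
switches_per_side = 5     # ...or switches to one of five teammates
choices_per_side = moves_per_side + switches_per_side   # 9 options each

damage_rolls = 16         # Gen 3 damage is scaled by a random 85-100% roll
crit_outcomes = 2         # critical hit or not
accuracy_outcomes = 2     # hit or miss

# RNG branches hanging off a single chosen move:
branches_per_move = damage_rolls * crit_outcomes * accuracy_outcomes  # 64

# Both players act every turn, so the counts multiply across sides:
turn_branches = (choices_per_side * branches_per_move) ** 2
print(f"{turn_branches:,} branches from one turn")  # 331,776
```

Multiply branching like that across every turn of a battle, then across IVs and abilities, and a precomputed table of best moves stops being an option.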
All of which is still not as complicated as finding the right streaming service in your country for that one show you've been trying to watch. Luckily, the sponsor of this video, NordVPN, has got you covered for that. Nord lets you switch between hundreds of servers in dozens of countries and unlock the full library of any streaming service by overcoming geo-locking. It's also been really helpful for me personally in booking flights, hotels, and rental cars, because prices of these things are often dynamically calculated based on your location; switching my location with NordVPN has allowed me to save real, actual money on trips that way. It also means that I won't have to worry about network admins on public Wi-Fi networks seeing what websites I'm visiting when I'm on the go. Nord has a 30-day money-back guarantee and 24/7 customer support, so there's no reason not to try it right now. Thanks to Nord for sponsoring this video.

Pokémon is not the first game to present this problem to AI developers. Compared to both Pokémon and chess, backgammon is a simple game, but applying the table lookup approach that took down Kasparov didn't really work for backgammon. Gerald Tesauro built two of the best early backgammon-playing AIs in the 1980s and '90s: Neurogammon and TD-Gammon. Neurogammon was an award-winning program but peaked at a level that Tesauro himself described as intermediate. He wrote in 1995: "Programming a computer to play high-level backgammon has been found to be a rather difficult undertaking. In certain simplified endgame situations, it is possible to design a program that plays perfectly via table lookup. However, such an approach is not feasible for the full game, due to the enormous number of possible states (estimated at over 10^20). Furthermore, the brute-force methodology of deep searches, which has worked so well in games such as chess, checkers and Othello, is not feasible due to the high branching ratio resulting from the probabilistic dice rolls." See, AIs that use the table lookup method have a dirty little secret: they're incapable of creativity. If there's no table to look up, they're useless. Tesauro instead went with a different approach, the same approach KP Labs is using to try and beat Emerald Kaizo: reinforcement learning.

Reinforcement learning models start from absolute zero. Rather than using something like a database to calculate an optimal play, these models learn from experience. For Pokémon, that looks like giving the AI its possible choices, the moves and switches, as actions it can take. The AI's behavior is then positively or negatively reinforced based on the reward structure given to it by the programmer: after every action it takes, it updates itself based on the reward received. This means that if you watch the AI play early in the training process, it's going to look really, really stupid. Before the AI has received enough rewards to influence its decision-making, all it can do is make its choices at random, and since most Pokémon don't have full move sets early in the game, chances are that random selection lands on a switch, leading to runs where it just keeps switching until everything dies. But eventually it'll start learning. After getting positive rewards for stuff like taking KOs and winning battles, and negative rewards for losing Pokémon, it'll eventually figure out such radical concepts as "doing damage is important" and "letting your Pokémon get hit too much is bad." And maybe eventually it'll even play around the crit.
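In code, that loop is simpler than it sounds. Here's a minimal tabular Q-learning sketch of the general technique; to be clear, this is not KP Labs' implementation, and the action names, state keys, and hyperparameters are placeholders:

```python
import random
from collections import defaultdict

# One action per move slot plus one per benched teammate.
ACTIONS = ["move_1", "move_2", "move_3", "move_4",
           "switch_1", "switch_2", "switch_3", "switch_4", "switch_5"]

q_table = defaultdict(float)       # value estimate for each (state, action)
alpha, gamma, epsilon = 0.1, 0.99, 0.2

def choose_action(state, legal_actions):
    # Epsilon-greedy: explore sometimes, otherwise pick the best-valued
    # action, breaking ties randomly. Early on everything is tied at zero,
    # so the untrained agent behaves almost uniformly at random.
    if random.random() < epsilon:
        return random.choice(legal_actions)
    best = max(q_table[(state, a)] for a in legal_actions)
    return random.choice([a for a in legal_actions
                          if q_table[(state, a)] == best])

def update(state, action, reward, next_state, next_legal_actions):
    # Positive rewards (KOs, wins) and negative rewards (losing a Pokémon)
    # nudge the value of the action that was just taken.
    best_next = max(q_table[(next_state, a)] for a in next_legal_actions)
    q_table[(state, action)] += alpha * (
        reward + gamma * best_next - q_table[(state, action)]
    )
```

Early on, every value in the table is zero, so the agent's choices are close to uniformly random across nine options, five of which are switches; that's the switch-spamming phase described above.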
Tweaking the reward function has been one of the big challenges for KP Labs. "It will end up learning to play for the rewards that you set, you know, sort of inevitably, so you do need to be careful. There was an early, early version of this from a few months ago where I was messing around with it, and what it would do is it would stall the battle as much as possible in order to keep earning points. Like, it knew it had a one-shot kill, but it would just keep switching, using a move, chipping some damage, keep switching. There's a science to it." When something like that happens, or if it seemed like the AI was hitting a wall and KP Labs wanted to give it more context, it would have to be reset, losing everything it had learned and starting all over again from spamming random moves and switches. But that's all part of the process. "I was keeping an eye on the graphs, and it looked like it was stagnating. Whether or not that was just due inherently to the conclusions that it had formed, I'm not sure." Most of machine learning like this is throwing stuff at the wall until something sticks.
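A common fix for that kind of reward farming is to shape the reward so that dragging a battle out has a cost. KP Labs hasn't published the actual function, so the values and the events dictionary below are purely illustrative:

```python
# Illustrative reward shaping, not KP Labs' actual function. If per-event
# rewards let the agent farm points by stalling, a small per-turn penalty
# makes dragging the battle out a net loss.
REWARD_KO = 1.0           # knocking out an opposing Pokémon
REWARD_WIN = 5.0          # winning the battle outright
PENALTY_FAINT = -2.0      # losing one of your own (it's a Nuzlocke, after all)
PENALTY_PER_TURN = -0.02  # gentle pressure to end battles quickly

def reward_for_turn(events, battle_over, battle_won):
    # `events` is a hypothetical summary of what happened this turn,
    # e.g. {"enemy_faints": 1, "own_faints": 0}.
    r = PENALTY_PER_TURN
    r += REWARD_KO * events.get("enemy_faints", 0)
    r += PENALTY_FAINT * events.get("own_faints", 0)
    if battle_over:
        r += REWARD_WIN if battle_won else -REWARD_WIN
    return r
```

With a per-turn cost like this, the stalling loop the early version discovered bleeds points instead of accumulating them.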
Watching its current PB, you can see that a few things have stuck. Against the bug types in the early game, the AI will reliably switch in the Flying-type Farfetch'd and, more often, the Fire-type Slugma. Once they had the sweep going, the AI was usually happy to leave them in and just keep clicking the super effective damaging move until the opponent died. But it still makes its share of mistakes. The biggest in the PB was letting the starter Mudkip die to a Quick Attack in the very first fight. Right now the AI doesn't have info like the opponent's move set, although that's certainly something that could be added as an input as the system is improved. But even with that information, the AI would still need to experience the mistake, and receive the negative reward, probably many times over, before it could update itself and adapt. It also doesn't necessarily know why things are happening. Against the Magma Grunt that nearly ended this run, we see some fascinating behavior from the AI. With the paralyzed Slugma facing off against the Koffing, the Koffing is in the red and will die to another Flame Wheel. The AI goes for it, but gets fully paralyzed. It tries to click Rock Throw, but full paralysis happens again. Then it switches out to the utterly useless Poochyena, which just mashes Sand Attack. Instead of recognizing that the odds were in its favor if it just kept attacking, it panicked and switched.

OK, we're anthropomorphizing the computer a little bit here, but can you see the similarities to how new players approach Pokémon? Unlike the AI in its newborn state, humans usually come in with enough understanding of how video games work that their default behavior is spamming attacks instead of switches. But we've all seen, or been, the player who reaches a fight where just mashing attacks won't work, and they panic, frantically trying other options until one works, or, more likely, they die. Right now, even at its best, the AI is still more or less in that stage. How much better can it really become?

A funny thing happened with the early backgammon AI attempts: AIs that tried to mimic the way humans evaluated board states always seemed to come up short against human opponents. Because the game has so many possibilities, players have to come up with heuristic rules that are loosely defined enough to adapt to the various RNG situations the game can present, and computers, famously, are not great with "loosely defined." And then there was the problem that commonly accepted optimal strategies were constantly being proven wrong. Tesauro concluded: "In view of this, programmers are not exactly on firm ground in accepting current expert opinions at face value." Very diplomatic. It seems like this was what excited him the most about reinforcement learning: the model it produced wouldn't be tainted by the biases and errors of the humans other models had attempted to mimic. His next model, TD-Gammon, would train itself from nothing by playing games against itself and adjusting to a reward function. Like KP Labs' Pokémon trainer, TD-Gammon essentially played randomly at first, but after thousands of games Tesauro reported seeing basic plays like hitting the opponent and playing safe, and more sophisticated strategies started emerging after tens of thousands of games.
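The mechanism behind that self-improvement was temporal-difference learning: after each self-play game, the evaluation of every position gets nudged toward the evaluation of the position that followed it. Here's a toy sketch of the idea, not Tesauro's code; a linear model stands in for his small neural network, and the 198-dimensional encoding simply matches the input size he describes:

```python
import numpy as np

# Toy sketch of TD-Gammon's core loop: self-play plus TD(lambda) updates
# to a learned evaluation function.
rng = np.random.default_rng(0)
weights = rng.normal(scale=0.01, size=198)

def value(features):
    # Estimated probability that the side to move goes on to win.
    return 1.0 / (1.0 + np.exp(-weights @ features))

def td_update(trajectory, final_reward, alpha=0.1, lam=0.7):
    """After one self-play game, pull each position's evaluation toward
    the evaluation of the position that followed it. `trajectory` is the
    list of feature vectors seen during the game; `final_reward` is 1.0
    for a win and 0.0 for a loss."""
    global weights
    trace = np.zeros_like(weights)      # eligibility trace over past positions
    for t in range(len(trajectory) - 1):
        s, s_next = trajectory[t], trajectory[t + 1]
        v, v_next = value(s), value(s_next)
        grad = v * (1.0 - v) * s        # gradient of sigmoid(w @ s) w.r.t. w
        trace = lam * trace + grad
        weights += alpha * (v_next - v) * trace
    # Terminal step: pull the final position toward the actual outcome.
    s_last = trajectory[-1]
    v = value(s_last)
    trace = lam * trace + v * (1.0 - v) * s_last
    weights += alpha * (final_reward - v) * trace
```

The only supervision in that loop is the final result of each game; everything TD-Gammon knew about positional judgment emerged from that signal alone.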
How good did it get? At the time, the best commercial backgammon programs would lose to master-level human players by about two-thirds of a point per game. In 1991, TD-Gammon 1.0 was tested in 51 games against a trio of masters, losing by 13 combined points over those 51 games: just 0.25 points per game. A year later, with another million and a half training games under its belt, it got another chance, taking on one of those masters in a 40-game series. The master, Bill Robertie, won by exactly one point, as close to dead even as you can get. What was so remarkable wasn't that a computer was playing a game at an equivalent level to the world's best, but how it was doing it. Tesauro wrote that "Robertie thinks that, at least in a few cases, the program has come up with a genuinely novel strategy that actually improves on the way top humans usually play." Another top backgammon player and analyst, Kit Woolsey, went even further: "Instead of a dumb machine which can calculate things much faster than humans, such as the chess-playing computers, you have built a smart machine which learns from experience, pretty much the same way humans do." TD-Gammon actually had an immediate effect on tournament backgammon games, thanks to an article Robertie wrote in a backgammon magazine about his experience playing against it. He included an analysis where the AI suggested moving the back checkers in certain situations, as opposed to the front ones. Soon, the decades-old traditional play of moving the front checkers had all but disappeared from tournament play, replaced by the play favored by TD-Gammon.

This, to me, is the most exciting part of the KP Labs project. The idea of a program that can beat a Nuzlocke is one thing; the idea of a program that can learn how to beat a Nuzlocke is another thing entirely, one that has so many possibilities to teach us more about this hobby I've devoted so much of my life to. I take pride in my ability with this game, the way I can combine my knowledge with intuition to come up with some of the most insane cooks you'll ever see. I think I can find lines that nobody else can. But like anybody, I have my biases and my blind spots, and unlike a computer, I can only take on so many runs. My 150 attempts at Emerald Kaizo took me months and months; the KP Labs AI could do 150 attempts in a matter of days. Even if the AI doesn't uncover new or better strategies like TD-Gammon did, the fact that it plays through the game so fast could open up possibilities for Nuzlocke experiments that we rarely get to see. Unless you're playing Crystal Kaizo Plus, which has Emy's fantastic calculator and all the data that comes with it, most Nuzlocke players don't have substantial answers to questions like "which encounter is actually best on this route?" For a game like Emerald Kaizo or Run and Bun, we have pretty good ideas based on how people have won in the past, but gathering the sample sizes to scientifically test them is pretty much impossible. An AI that could play the game at a high level could give us some real data to help answer these questions.

One of the coolest aspects of these kinds of learning models is that you can actually see them adapting in real time. There was one particular proud moment where the AI learned how sleep worked. The first time it hit Sing, it immediately clicked Sing again, because it was rewarded for hitting Sing the first time, but it eventually learned to stop clicking status moves multiple times in a row. Seeing the AI learn to play around the sleep mechanic was one of the moments that really showed the potential of this project. It's a small step, but imagine what could happen with more training and more refinement of the reward function. The model has come from nothing and learned how to win Pokémon battles; it's beaten more trainers in Emerald Kaizo than some human players. There's still a long time before it's competing for the title of best Nuzlocker in the world or anything, but I think there are some pretty sick possibilities that could emerge from KP Labs' project, and I can't wait to see where it goes from here.
Info
Channel: pChal
Views: 117,797
Keywords: twitch, games, pokemon, gaming, pokemon challenges, challenges, pokemon challenge, pokemonchallenges, pchal, nuzlocke, pro nuzlocker, jaiden animations, smallant, ludwig
Id: uJoB6Oa5Ewk
Length: 12min 24sec (744 seconds)
Published: Thu May 30 2024