DeepMind’s AlphaStar: A Grandmaster Level StarCraft 2 AI!

Video Statistics and Information

Reddit Comments

Two Minute Papers is a great educational channel; I suggest anyone who is interested in AI check it out, even if the videos tend to be only about five minutes long.

👍 5 · u/PwnBuddy · Dec 03 2019 · replies

Thanks for posting. I wanted to do the same, but you beat me to it. :)

In case you haven't seen it.
My main takeaway was the effort DeepMind had to put in to have the AI react properly to various strategies such as cannon rushes. Apparently, deep reinforcement learning systems up to now would converge on a single strategy and do that over and over again. Such a system can be easily exploited. But AlphaStar can handle this!

I won't spoil more. Just watch the video.

👍 3 · u/p_b_omta · Dec 03 2019 · replies
Captions
Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. The paper that we are going to cover today is, in my view, one of the more important things that has happened in AI research lately. In the last few years, we have seen DeepMind's AI defeat the best Go players in the world, and after OpenAI's venture into the game of DOTA 2, DeepMind embarked on a journey to defeat pro players in StarCraft 2, a real-time strategy game. This is a game that requires a great deal of mechanical skill and split-second decision making, and it involves imperfect information, as we only see what our units can see. A nightmare situation for any AI.

The previous version of AlphaStar that we covered in this series was able to beat at least mid-grandmaster level players, which is truly remarkable, but, as with every project of this complexity, there were limitations and caveats. When our earlier video came out, the paper was still pending; now it has finally appeared, so my sleepless nights have officially ended, at least for this work, and we can look into some more results.

One of the limitations of the earlier version was that DeepMind needed to further tune some of the parameters and rules to make sure that the AI and the players play on an even footing. For instance, the camera movement and the number of actions the AI can make per minute have been limited further and are now more human-like. TLO, a professional StarCraft 2 player, noted that this time around it indeed felt very much like playing another human player.

The second limitation was that the AI was only able to play Protoss, which is one of the three races available in the game. This new version can now play all three races, and here you see its MMR ratings, a number that describes the skill level of the AI, and, for non-experts, win percentages for each individual race. As you can see, it is still the best with Protoss; however, all three races are well over the 99% winrate mark. Absolutely amazing.

In this version, there is also more emphasis on self-play, and the goal is to create a learning algorithm that learns to play really well by playing against previous versions of itself millions and millions of times. This is, again, one of those curious cases where the agents train against themselves in a simulated world, and then, when the final AI was deployed on the official game servers, it played against human players for the very first time. I promise to tell you about the results in a moment, but for now, please note that relying more on self-play is extremely difficult. Let me explain why.

Self-play agents have the well-known drawback of forgetting, which means that as they improve, they might forget how to win against a previous version of themselves. Since StarCraft 2 is designed so that every unit and strategy has an antidote, we have a rock-paper-scissors kind of situation where the agent plays rock all the time because it has encountered a lot of scissors lately. Then, when a lot of paper appears, it starts playing scissors more often and completely forgets about the olden times when rock was all the rage. On and on this circle goes, without any real learning or progress. This doesn't just lead to suboptimal results; it leads to disastrously bad learning, if any learning at all. But it gets even worse. This situation opens up the possibility for an exploiter to take advantage of this information and easily beat these agents.
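To make this cycling failure mode concrete, here is a minimal toy sketch (our own illustration, not code from the paper): in a non-transitive game like rock-paper-scissors, a self-play agent that only best-responds to the latest version of itself simply goes around in circles.

```python
# Toy illustration (not DeepMind's code): naive self-play in rock-paper-scissors.
# Each new "generation" simply best-responds to the previous generation,
# so training cycles forever instead of making real progress.

BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}

def best_response(opponent_move: str) -> str:
    """Return the move that beats what the opponent currently favours."""
    return BEATS[opponent_move]

strategy = "rock"                       # generation 0 plays rock every game
history = [strategy]
for generation in range(8):
    strategy = best_response(strategy)  # train only against the latest self
    history.append(strategy)

print(" -> ".join(history))
# rock -> paper -> scissors -> rock -> paper -> scissors -> ... (a cycle)
```

The point of the sketch is only that a population which trains exclusively against its most recent member can forget counters it already knew; the league-style training described next is designed to break exactly this cycle.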
In concrete StarCraft terms, such an exploit could be trying to defeat the AlphaStar AI early by rushing it with workers and warping in photon cannons into its base. This strategy is also known as a cannon rush, and as you can see here, with the red agent performing it, it can quickly defeat the unsuspecting blue opponent. So, how do we defend against such exploits? DeepMind used a clever idea here: turn the whole thing around and use these exploits to their advantage. How? Well, they proposed a novel self-play method where they additionally insert these exploiter AIs to expose the main AI's flaws and create an overall more knowledgeable and robust agent. So, how did it go? Well, as a result, you can see how the green agent has learned to adapt by pulling its worker line and successfully defending the cannon rush of the red AI. This is proper machine learning progress happening right before our eyes. Glorious!

This is just one example of using exploiters to create a better main AI, but the training process continually creates newer and newer kinds of exploiters; for instance, you will see in a moment that it later came up with a nasty strategy that includes attacking the main base with cloaked units. One of the coolest parts of the work, in my opinion, is that this kind of exploitation is a general concept that will surely prove useful for completely different test domains as well.

We noted earlier that it finally started playing humans for the first time on the official servers. So, how did that go? In my opinion, given the difficulty and the vast search space we have in StarCraft 2, creating a self-learning AI that has the skills of an amateur player is already incredible. But that's not what happened. Hold on to your papers, because it quickly reached grandmaster level with all three races and ranked above 99.8% of the officially ranked human players. Bravo, DeepMind. Stunning work.

Later, it also played Serral, a decorated world champion Zerg player and one of the most dominant players of our time. I will not spoil the results, especially given there were limitations, as Serral wasn't playing on his own equipment, but I will note that Artosis, a well-known and beloved StarCraft player and commentator, analyzed these matches and said "The results are so impressive and I really feel like we can learn a lot from it. I would be surprised if a non-human entity could get this good and there was nothing to learn". His commentary is excellent and is tailored towards people who don't know anything about the game. He'll often pause the game and slowly explain what is going on. In these matches, I loved the fact that it so often makes plays that we consider to be very poor, and somehow, overall, it still plays outrageously well. It has unit compositions that nobody in their right mind would play. It is kind of like a drunken kung fu master, but in StarCraft 2. Love it. But no more spoilers - I think you should really watch these matches and, of course, I put a link to his analysis videos in the video description.

Even though both this video and the paper appear to be laser-focused on playing StarCraft 2, it is of utmost importance to note that this is still just a testbed to demonstrate the learning capabilities of this AI. As amazing as it sounds, DeepMind wasn't looking to spend millions and millions of dollars on research just to play video games.
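As a rough sketch of the league-with-exploiters idea described above (our own toy simplification with invented names and numbers, not the actual AlphaStar training code): exploiters specialise in beating the current main agent, and the main agent in turn trains against the exploiters and against frozen past snapshots of itself, so the weaknesses they uncover get patched instead of forgotten.

```python
# Toy sketch of league training with exploiters (invented names, not AlphaStar code).
import random

class Agent:
    def __init__(self, name: str, skill: float = 0.0):
        self.name = name
        self.skill = skill

    def beats(self, opponent: "Agent") -> bool:
        """Crude stand-in for a match: higher skill wins more often (Elo-style)."""
        p_win = 1.0 / (1.0 + 10 ** ((opponent.skill - self.skill) / 4.0))
        return random.random() < p_win

    def learn_from(self, opponent: "Agent") -> None:
        """Crude stand-in for reinforcement learning: losses teach the most."""
        self.skill += 0.10 if opponent.beats(self) else 0.02

main_agent = Agent("main")
exploiters = [Agent("cannon_rush_exploiter"), Agent("cloak_exploiter")]
past_snapshots = [Agent("main_snapshot_0")]   # frozen copies the main agent must keep beating

for step in range(1, 1001):
    for exploiter in exploiters:              # exploiters only hunt the current main agent
        exploiter.learn_from(main_agent)
    opponent = random.choice(exploiters + past_snapshots)
    main_agent.learn_from(opponent)           # main agent trains against the whole league
    if step % 200 == 0:                       # periodically freeze a snapshot into the league
        past_snapshots.append(Agent(f"main_snapshot_{step}", skill=main_agent.skill))

print(f"final main agent skill: {main_agent.skill:.2f}")
```

In the real system the league is richer (main agents, main exploiters and league exploiters) and the "learning" is deep reinforcement learning over actual StarCraft 2 games, but the structure the video highlights is the same: agents whose only job is to expose the main agent's flaws.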
The building blocks of AlphaStar are meant to be reasonably general, which means that parts of this AI can be reused for other things, for instance, Demis Hassabis mentioned weather prediction and climate modeling as examples. If you take only one thought from this video, let it be this one. There is really so much to talk about, so make sure to head over to the video description, watch the matches and check out the paper as well. The evaluation section is as detailed as it can possibly get. What a time to be alive! Thanks for watching and for your generous support, and I'll see you next time!
Info
Channel: Two Minute Papers
Views: 256,620
Keywords: two minute papers, deep learning, ai, alphastar, deepmind alphastar, alphago, deepmind alphago, starcraft 2, starcraft ii, starcraft ii ai, starcraft 2 ai
Id: jtlrWblOyP4
Length: 8min 33sec (513 seconds)
Published: Tue Dec 03 2019