Dear Fellow Scholars, this is Two Minute Papers
with Dr. Károly Zsolnai-Fehér. Today, it is almost taken for granted that
neural network-based learning algorithms are capable of identifying objects in images,
or even writing full, coherent sentences about them, but fewer people know that there is
also parallel research on trying to break these systems. For instance, some of these image detectors
can be fooled by adding a little noise to the image, and in some specialized cases,
we can even perform something that is called the one pixel attack. Let’s have a look at some examples. Changing just this one pixel can make a classifier
think that this ship is a car, or that this horse is a frog, and amusingly, be quite confident
about its guess. Note that the choice of this pixel and its color is by no means random; it requires solving a mathematical optimization problem to find out exactly how to perform this.
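The pixel position and color that do the trick come out of a black-box optimization, typically solved with differential evolution in the original one pixel attack paper. Here is a minimal sketch of that idea; the `model.predict` interface, the 0-255 color range, and all names are assumptions for illustration, not the paper's actual code.

```python
import numpy as np
from scipy.optimize import differential_evolution

def one_pixel_attack(model, image, true_class, max_iter=100):
    """Search for a single (x, y, r, g, b) change that tanks the
    classifier's confidence in the true class. Sketch only."""
    h, w, _ = image.shape

    def perturb(params):
        # params = (x, y, r, g, b): pixel coordinates and replacement color
        x, y, r, g, b = params
        candidate = image.copy()
        candidate[int(y), int(x)] = (r, g, b)
        return candidate

    def objective(params):
        # Lower is better for the attacker: confidence in the true class
        probs = model.predict(perturb(params)[np.newaxis])[0]
        return probs[true_class]

    # Assumed 0-255 pixel values; adjust the bounds for normalized inputs
    bounds = [(0, w - 1), (0, h - 1), (0, 255), (0, 255), (0, 255)]
    result = differential_evolution(objective, bounds, maxiter=max_iter)
    return perturb(result.x)
```

Differential evolution is a natural fit here because it needs no gradients: the attacker only queries the classifier's output probabilities, which is why attacks of this kind also work in black-box settings.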
Trying to build better image detectors while other researchers are trying to break them is not the only arms race we're experiencing in machine learning research. For instance, a few years ago, DeepMind introduced
an incredible learning algorithm that looked at the screen, much like a human would, but
was able to reach superhuman levels in playing a few Atari games. It was a spectacular milestone in AI research. They have also just published a follow-up paper on this that we'll cover very soon, so make sure to subscribe and hit the bell icon so you don't miss it when it appears in the near future. Interestingly, while these learning algorithms
are being improved at a staggering pace, there is a parallel subfield where researchers endeavor
to break these learning systems by slightly changing the information they are presented
with. Let’s have a look at OpenAI’s example. Their first method adds a tiny bit of noise to a large portion of the video input, where the difference is barely perceptible, but it forces the learning algorithm to choose a different action than it would have chosen otherwise.
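To give a rough idea of how such a barely perceptible perturbation can be computed, here is a minimal fast-gradient-sign-style sketch against a policy network, assuming a PyTorch `policy` module that maps a batched observation to action logits; the names are illustrative, not OpenAI's actual code.

```python
import torch
import torch.nn.functional as F

def perturb_observation(policy, obs, epsilon=0.01):
    """One signed-gradient step that nudges the policy away from the
    action it would otherwise take. Sketch only."""
    obs = obs.clone().detach().requires_grad_(True)
    logits = policy(obs)            # shape: (batch, n_actions)
    action = logits.argmax(dim=-1)  # the action it would have chosen
    # Ascend the loss of the chosen action to make it less likely
    loss = F.cross_entropy(logits, action)
    loss.backward()
    # A small epsilon keeps the change barely perceptible
    return (obs + epsilon * obs.grad.sign()).detach()
```

Because the perturbation follows the sign of the gradient, a tiny epsilon spread over many pixels is often enough to flip the chosen action while staying nearly invisible.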
In the other one, a different modification was used that has a smaller footprint but is more visible. For instance, in Pong, adding a tiny fake
ball to the game can coerce the learner into going down when it was originally planning
to go up. It is important to emphasize that the researchers
did not do this by hand. The algorithm picks up game-specific knowledge by itself and finds out how to fool the other AI using it. Both attacks perform remarkably well. However, it is not always true that we can
just change these images or the playing environment however we desire to fool these algorithms. So, with this, an even more interesting question
arises. Is it possible to just enter the game as a
player, and perform interesting stunts that can reliably win against these AIs? And with this, we have arrived at the subject
of today’s paper. This is the “You Shall Not Pass” game,
where the red agent is trying to hold back the blue character and not let it cross the
line. Here you see two regular AIs duking it out: sometimes the red wins, sometimes the blue is able to get through. Nothing too crazy here. This is the reference case, which is somewhat well balanced. And now, hold on to your papers, because this
adversarial agent that this new paper proposes does this. You may think this was some kind of glitch, and I put the incorrect footage here by accident. No, this is not an error; you can believe your eyes: it basically collapses and does absolutely nothing. This can’t be a useful strategy, can it? Well, look at that! It still wins the majority of the time. This is very confusing. How can that be? Let’s have a closer look. This red agent is normally a somewhat competent
player; as you can see here, it can punch the blue victim and make it fall. We now replace this red player with the adversarial agent, which collapses, and it almost feels like it hypnotizes the blue agent into falling as well. And now, squeeze your papers, because the
normal red opponent’s win rate was 47%, and this collapsing chap wins 86% of the time. It not only wins, but it wins much, much more
reliably than a competent AI. What is this wizardry? The answer is that the adversary induces off-distribution
activations. To understand exactly what that means, let’s
have a look at this chart. This tells us how likely it is that the actions
of the AI against different opponents are normal. As you see, when this agent named Zoo plays
against itself, the bars are in the positive region, meaning that normal things are happening. Things go as expected. However, that’s not the case for the blue lines, which show the agent’s actions when playing against this adversarial agent; here, the blue victim’s actions are not normal in the slightest. So, the adversarial agent is really doing nothing, but it is doing nothing in a way that reprograms its opponent to make mistakes and behave almost like a completely randomly acting agent! This paper is absolute insanity. I love it!
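For a concrete sense of how a chart like this can be produced, here is a minimal sketch that fits a density model to the victim's activations recorded during normal play and then scores other episodes against it. The Gaussian mixture and all names are assumptions for illustration, in the spirit of the paper's diagnostic.

```python
from sklearn.mixture import GaussianMixture

def off_distribution_scores(acts_normal, acts_adversary, n_components=10):
    """acts_*: (n_timesteps, n_units) arrays of the victim's activations.
    Returns mean log-likelihoods under a model of normal play."""
    density = GaussianMixture(n_components=n_components).fit(acts_normal)
    # High score: activations look like normal play.
    # Very negative score: off-distribution, as against the adversary.
    return density.score(acts_normal), density.score(acts_adversary)
```

Under this kind of model, activations recorded against the adversary score far below those from regular opponents, which is exactly the gap the bars in the chart visualize.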
And if you look here, you see that the more the blue curve improves, the better this scheme works for a given game. For instance, it does really well on Kick and Defend, fairly well on Sumo Humans, and there is something about the Sumo Ants game that prevents this interesting kind of hypnosis from happening. I’d love to see a follow-up paper that can
pull this off a little more reliably. What a
time to be alive! Thanks for watching and for your generous
support, and I'll see you next time!