Dear Fellow Scholars, this is Two Minute Papers
with Dr. Károly Zsolnai-Fehér. Today we are going to see an AI learn boxing
and even mimic gorillas during this process. Now, in an earlier work, we saw a few examples
of AI agents playing two-player sports. For instance, this is the “You Shall Not Pass”
game, where the red agent is trying to hold back the blue character and not let it cross
the line. Here you see two regular AIs duking it out:
sometimes the red wins, sometimes the blue is able to get through. Nothing too crazy here. Until…this happens. Look. What is happening? It seems that this agent started to do nothing…and
still won. Not only that, but it suddenly started winning
almost all the games. How is this even possible? Well, what the agent did is perhaps the AI
equivalent of hypnotizing the opponent, if you will. The more rigorous term for this is that it
induces off-distribution activations in its opponent. This adversarial agent is really doing nothing,
but that’s not enough - it is doing nothing in a way that reprograms its opponent to make
mistakes and behave close to a completely randomly acting agent! Now, this new paper showcases AI agents that
can learn boxing. The AI is asked to control these joint-actuated
characters which are embedded in a physics simulation. Well, that is quite a challenge. Look, for
quite a while, even after 130 million steps of training, it cannot even hold it together. And, yes…these folks collapse. But this is not the good kind of hypnotic
adversarial collapsing. I am afraid this is just passing out without
any particular benefits. That was quite a bit of training, and all
this for nearly nothing. Right? Well, maybe…let’s see what they did after
200 million training steps. Look! They can not only hold it together, but they
have a little footwork going on, and can circle each other and try to take the middle of the
ring. Improvements. Good. But this is not dancing practice, this is
boxing. I would really like to see some boxing today
and it doesn’t seem to happen. Until we wait for a little longer…which
is 250 million training steps. Now, is this boxing? Not quite, this is more like two drunkards
trying to duke it out, where neither of them knows how to throw a real punch…but! Their gloves are starting to touch the opponent,
and they start getting rewards for it. What does that mean for an intelligent agent? Well, it means that over time, it will learn
to do that a little better. And hold on to your papers and see what they
do after 420 million steps. Oh wow! Look at that! I am seeing some punches, and not only that,
but I also see some body and head movement to evade the punches, very cool. And if we keep going for longer, whoa! These guys can fight! They now learned to perform feints, jabs,
and have some proper knockout power too. And if you have been holding on to your papers,
now, squeeze that paper, because all they looked at before starting the training was
90 seconds of motion capture data. This is a general framework that also works
for fencing. Look! The agents learned to lunge, deflect, evade
attacks, and more. Absolutely amazing. What a time to be alive! So, this was approximately a billion training
steps, right? So how long did that take to compute? It took approximately a week. And, you know what’s coming. Of course, we invoke the First Law Of Papers,
which says that research is a process. Do not look at where we are, look at where
we will be two more papers down the line. And two more papers down the line, I bet this
will be possible in a matter of hours. This is the part with the gorillas. It is also interesting that even though there
were plenty of reasons to, the researchers didn’t quit after 130 million steps. They just kept on going and eventually succeeded, even in the presence of not-so-trivial
training curves, where the blocking of the other player can worsen the performance and
it is often not easy to tell where we are. That is a great life lesson right there. Thanks for watching and for your generous
support, and I'll see you next time!