Dear Fellow Scholars, this is Two Minute
Papers with Dr. Károly Zsolnai-Fehér. After an incredible day of Apple showcasing the
culmination of many, many amazing research papers, and also light transport algorithms applied to
sound waves, today we are going to have a look at NVIDIA’s new invention that can play Minecraft,
but not in the way you think. It truly is a sight to behold as it is able to gather resources for
itself, go and mine, catch a fish. These things I expected. However, what I didn’t expect is
that it can even engage in more complex tasks like building a base, or hunting this adorable
little piggy that kind of breaks my heart. Now, it can do 3 more complex things, including
fighting, which I will show you in a moment. But first, let’s have a look at the paper. As soon
as I started reading it, I was like, “What? Are you sure?” You see, this
paper claims to use large language models to play Minecraft. How is that even possible?
Large language models are typically adept at answering text-based questions with text-based
answers. So how does it play a computer game? It doesn’t just play the game, but it plays it
incredibly well. How? Well, hold on to your papers Fellow Scholars, because it is able to build
a curriculum by itself, for itself. You see, it gets some text information about the world
it is in, for instance, the items that it has, what time it is, and who is nearby. And based
on this information, it starts reasoning by itself. That is incredible. Look, for instance, if
we have a wooden pickaxe and some stones, then it is a good time to upgrade to a stone pickaxe. Or,
if it’s nighttime and there is a zombie nearby, it is time to grab a sword and shield. Now, these
are simple statements, but the key is that this little AI makes these deductions by itself. We
don’t need to program these by hand, and thus, if it were dropped into a completely different
game, it would likely do quite well there too. But wait a minute, this is still just
text. How does it become gameplay? Well, after creating this curriculum, the AI writes
computer code to control the game and achieve its goals. If this piece of code works, it observes
what it achieves, and then stores the code away in a skill library. These will be the building
blocks for new, more complex skills later. So a text-based chat assistant that
can play a video game. That is very impressive. So, how well can it play the game? With a little help in the form of human
feedback, it can do three amazing things. One, it can build a house. And I have to
say, that is some proper craftsmanship there, or proper craftsAIship. Lovely
house. Good job, little AI! Two, it can also build a nether
portal. This allows it to travel to a different dimension with
unique terrain and resources. And, get this, it can even fight an Enderman.
This creature makes short work of an unsuspecting player. However, the AI came well prepared with
high-quality equipment, and bam, we are done. Now, these all sound good, but we are Fellow
Scholars here, so we are looking for a detailed comparison of this technique against previous AIs.
Is this any better? Oh my goodness. Are you seeing what I am seeing? It can explore so much more than
its predecessors, for instance, much more than the fan favorite AutoGPT, a version of ChatGPT that
can prompt itself and work on its own. And it gets better! Wow! Look at that. I can’t believe
it! It is more than 15 times faster than AutoGPT. AutoGPT came out barely a few weeks ago, and
it has already been improved upon by 15x. Wow. Just look at that! What took AutoGPT 75
iterations likely takes the new technique maybe 5 iterations, likely even fewer. AutoGPT plateaued at iron tools,
where the new technique is barely getting warmed up. Incredible improvement in just a couple of
months. These text-based large language models can not only write code to play video games,
but they can even plan their journeys. And just imagine what they will be capable of just two more
papers down the line. What a time to be alive! Thanks for watching and for your generous
support, and I'll see you next time!
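
For Fellow Scholars who would like to see the loop from earlier in code form, here is a minimal sketch, and it is not the paper's implementation. It only illustrates the three steps described in the video: the AI looks at a text-like description of the world, picks its next task, writes a piece of code for it, and stores the code in a skill library if it worked. Everything named here (WorldState, propose_task, write_skill, verify, run) is hypothetical, and the two hand-written rules merely stand in for deductions that the real system gets from a large language model. The nighttime rule is also simplified to a sword only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class WorldState:
    """Text-like observation: items held, time of day, who is nearby."""
    inventory: Dict[str, int] = field(default_factory=dict)
    time_of_day: str = "day"
    nearby: List[str] = field(default_factory=list)


# A skill is a piece of code that acts on the world.
Skill = Callable[[WorldState], None]


def propose_task(state: WorldState) -> Optional[str]:
    """Hypothetical stand-in for the curriculum step (an LLM in the paper)."""
    inv = state.inventory
    if (state.time_of_day == "night" and "zombie" in state.nearby
            and inv.get("sword", 0) == 0):
        return "craft_sword"
    if (inv.get("wooden_pickaxe", 0) > 0 and inv.get("stone", 0) >= 3
            and inv.get("stone_pickaxe", 0) == 0):
        return "craft_stone_pickaxe"
    return None  # nothing left to do under these toy rules


def write_skill(task: str) -> Skill:
    """Hypothetical stand-in for the model writing control code for a task."""
    def craft_sword(state: WorldState) -> None:
        state.inventory["sword"] = state.inventory.get("sword", 0) + 1

    def craft_stone_pickaxe(state: WorldState) -> None:
        state.inventory["stone"] -= 3
        state.inventory["stone_pickaxe"] = (
            state.inventory.get("stone_pickaxe", 0) + 1
        )

    return {
        "craft_sword": craft_sword,
        "craft_stone_pickaxe": craft_stone_pickaxe,
    }[task]


def verify(task: str, state: WorldState) -> bool:
    """Check what the code achieved: is the crafted item in the inventory?"""
    item = task[len("craft_"):]
    return state.inventory.get(item, 0) > 0


def run(state: WorldState, skill_library: Dict[str, Skill],
        max_steps: int = 10) -> List[str]:
    """Propose a task, get code for it, run it, keep it if it worked."""
    completed: List[str] = []
    for _ in range(max_steps):
        task = propose_task(state)
        if task is None:
            break
        # Reuse a stored skill when one exists, otherwise write a new one.
        skill = skill_library.get(task) or write_skill(task)
        skill(state)
        if verify(task, state):
            skill_library[task] = skill  # only working code is stored
            completed.append(task)
    return completed
```

Calling run on a nighttime state with a zombie nearby, a wooden pickaxe, and three stones would first craft the sword and then upgrade the pickaxe, leaving both skills in the library for later reuse.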