The Game That Learns

Video Statistics and Information

Reddit Comments

If anyone is interested: https://play.google.com/store/apps/details?id=com.alki.HexaponAi

I can also link the source code.

👍 1 👤 u/Alki881 📅 May 21 2019 🗫 replies

Did you use Shrek too?

👍 1 👤 u/gonz_hect 📅 May 21 2019 🗫 replies
Captions
Vsauce! Kevin here, and I’ve built a computer capable of explaining how you get smarter. Out of these matchboxes, some colorful beads and... Shrek.
Real quick, before we get into the game we’re about to play, I wanna tell you about the game that I’ve been playing. I partnered with Raid: Shadow Legends on this video and if you follow me on Instagram you know that I’m a huge fan of RPGs. Well, Raid is the most immersive champion-collecting experience you’ll get on a smartphone. It has a deep story, detailed graphics, giant boss fights, and hundreds of champions to collect and customize. And you can play it free. So to check it out, download Raid using only my link in the description to get 50,000 silver immediately and a free epic champion courtesy of the dev team. Thanks again to them for supporting Vsauce2. Go check out their game; it's amazing how far mobile gaming has come.
Now let’s get back to the inner workings of our game. Okay. 24 matchboxes, all filled with beads, and covered in potential moves for the game we’re about to play and… this is our computer. Now, Shrek comes along and... wait. How is THIS a computer? Aren’t computers, just like, electronic machines that run software? What IS a computer?
Well, the earliest computer was YOU. Or… your ancestors. They used calculating machines like the abacus to input information, which output a result, but we were the ones computing. The human operators of early calculating machines were literally called “computers.”
Okay, back to our matchboxes. Once we introduce a game board, Shrek and the gang, this matchbox and bead setup processes our input, gives us an output and not only that… it also learns. This is not just a computer, this is an artificial intelligence machine capable of matching wits with the brightest minds humanity has to offer. At a game called Hexapawn.
Here’s how. Hexapawn is based on chess -- each player has 3 pawns on a board with just 9 squares. The pieces move like chess pawns, too.
They can go forward one space if that space is unoccupied; if it is occupied by the opponent, then they can’t go forward. Sorry, Donkey. You can, however, move diagonally, but only to take an opponent’s piece. There are three ways to win: Get a pawn to the other side of the board. Take all of your opponent’s pieces. Or leave your opponent without a possible move, like a checkmate in chess.
Our setup works like this: I’ve got 24 matchboxes here, and each one corresponds to the position of pieces on the board during that round. I’ve got my Team Kevin pawns vs. the computer’s Team Shrek. And do you know what that means? That means that we’ve officially turned Hexapawn into: Shreksapawn.
Alright. Let’s play. The human, that's me, always goes first. Wait. Why? Because recreational mathematician Martin Gardner said so. He actually created Hexapawn and its rules as a simplified version of a 304-matchbox computer called MENACE. 15 years after helping the British break the Nazis' codes in World War II, Donald Michie invented MENACE to learn how to master Tic-Tac-Toe. And now 59 years later, I’m on YouTube playing Shreksapawn.
Since I go first, my moves occur in only the odd-numbered rounds. 1, 3, 5 and 7. Therefore, the matchboxes are grouped by possible Team Shrek moves in rounds 2, 4, and 6. One of us is guaranteed to win before Round 8. So, Team Shrek has no Round 8 moves.
Each box contains one colored bead for each potential move on that board position. So like this first box has a green, a blue, and a purple. And I've cut a hole at the bottom of the box that will only allow one bead at a time to fall out. So I’ll just shake the box and let one bead out. And it's purple. That means if my pawn was here and it was Team Shrek's move, Team Shrek would make the purple arrow move. Like this. If a blue bead had fallen out, then Team Shrek would've made the blue arrow move. And if it was a green bead then Team Shrek would've made the green arrow move. And taken my pawn.
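The movement and winning rules described above can be sketched in Python. This is only an illustration and nothing like it appears in the video: the board encoding ("S" for Team Shrek, "K" for Team Kevin, "." for an empty square) and the names legal_moves, apply_move, and winner are assumptions made for the example.

```python
# Board: three strings, top row first. Team Shrek ("S") starts at the top and
# moves down; Team Kevin ("K") starts at the bottom and moves up.
# This encoding is an assumption for illustration.
START = ("SSS", "...", "KKK")


def legal_moves(board, player):
    """Return ((row, col), (new_row, new_col)) pairs the rules allow."""
    step = 1 if player == "S" else -1
    enemy = "K" if player == "S" else "S"
    moves = []
    for r in range(3):
        for c in range(3):
            if board[r][c] != player:
                continue
            nr = r + step
            if not 0 <= nr < 3:
                continue
            if board[nr][c] == ".":        # straight ahead, only onto an empty square
                moves.append(((r, c), (nr, c)))
            for nc in (c - 1, c + 1):      # diagonal, only to take an opposing pawn
                if 0 <= nc < 3 and board[nr][nc] == enemy:
                    moves.append(((r, c), (nr, nc)))
    return moves


def apply_move(board, move):
    """Return the board that results from making the given move."""
    (r, c), (nr, nc) = move
    rows = [list(row) for row in board]
    rows[nr][nc] = rows[r][c]
    rows[r][c] = "."
    return tuple("".join(row) for row in rows)


def winner(board, to_move):
    """Return "S" or "K" if the game is over, otherwise None."""
    if "K" in board[0]:
        return "K"                         # a Kevin pawn reached the far side
    if "S" in board[2]:
        return "S"                         # a Shrek pawn reached the far side
    if not legal_moves(board, to_move):
        # No pawns left, or every pawn is blocked: the side to move loses.
        return "K" if to_move == "S" else "S"
    return None
```

Under this sketch, a fresh matchbox for a position would hold one bead per entry of legal_moves(board, "S"). After Team Kevin advances an outer pawn from the starting position, that list has three entries, one of which is a capture, the same count as the three beads described for the first box.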
Okay, so that's how Team Shrek will move. Team Kevin will move however I want Team Kevin to move because I'm Kevin and then we’ll play back and forth until there’s a winner.
Alright, Round 1: Fight! I decide to move Lord Farquaad forward. For Round 2 I now use this box to determine Team Shrek's move. So we'll give it a shake. Woah! Let's try that again. And it's the green move. So Donkey moves forward. Now it's my turn and I decide that, look, I can just take Princess Fiona when I move diagonally and win the game. That's it.
Now here’s the important part. When Team Shrek makes a losing move, I remove that bead from the box. That way the computer can’t make the same bad move the next time that this situation comes up. By removing its losing beads, the computer learns to play better. When Team Shrek does win, then instead of removing the bead I'll just put the bead back in the box.
Okay, I’m gonna play a bunch of rounds now and I’ll keep track of wins over here, with a K when I win and I'll write an S for a Shrek win. Here we go.
Okay, I’ve played 14 games. I started off winning a lot more than I was losing… and then things changed. Out of the last 7 games, Team Shrek has won 6 of them. The computer is clearly getting better at the game… but is it really learning? I mean, I’m just taking beads out of matchboxes. How is that learning? What is learning?
At the most basic level, learning is acquiring new knowledge or a new skill, or modifying an existing behavior. Every time I take a bead out of a matchbox, the computer loses a behavior that leads it to an outcome of failure. That increases the probability that the computer’s move each round leads it to success -- which in our case, is winning Shreksapawn. After a sufficient number of games, the computer will evolve to play perfectly. My Team Shrek computer may not be thinking on its own, but it is learning. And it can also learn in a different way. Removing beads is basically a form of learning by punishment.
When Team Shrek makes a bad move, I’m punishing the computer for being wrong. I don’t have to worry about the computer feeling bad about losing, these matchboxes aren’t gonna get frustrated and quit playing and run away crying and slam the door in my face and tell me I’m not their real dad.
But what happens if instead of punishing my computer, I reward it? Instead of just putting the good play bead back in the box when the computer wins, I could add another bead of the same color that made the winning move. That would reduce the probability of a losing bead appearing by increasing the probability of a matchbox generating a winning bead. The computer would still eventually reach perfect play because I'll still remove the losing beads, but it will take longer because it's winning more often. If it could feel, it would probably feel better about winning more often along its longer journey toward perfection. So the fastest way to perfect play is by punishing the computer’s mistakes. But the way to win as many games as possible along the way is to reward its victories.
To improve at Hexapawn, our matchbox computer actually uses a type of genetic algorithm. It’s a way to solve problems and learn based on natural selection. Based on the process that drives biological evolution.
The beads of learning in your life may be refined by punishment. Put your hand on a hot stove once, and learn that, “Ow! That’s painful.” So you remove the touch-hot-stove-bead from your brain. They may also be augmented by rewards. “My parents bought me ice cream for getting an A on my exam.” Add another get-good-grades-bead to your matchbox head computer.
Hexapawn is an obscure, academic game from over 50 years ago, and you can make a matchbox computer that learns to win every time. But by allowing this matchbox computer full of colored beads to learn, the player who’s learning a bit more about learning is… you. And as always, thanks for watching.
If you wanna make your own matchbox, oh I lost a bead, matchbox computer, download my template for free over at Twitter.com/VsauceTwo. That's at Vsauce T, W, O. If you wanna watch more Vsauce2 videos, just uh click over here, and if you aren't subscribed to Vsauce2 then maybe you should uh, put a, "subscribe to Vsauce2" bead in your brain. Wow. That was weirdly creepy.
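The bead mechanics from the captions (shake out one bead at random, remove it after a loss, optionally add a duplicate after a win) could be sketched roughly as follows. The class name MatchboxPlayer, its method names, and the handling of an empty box are all assumptions for illustration; the video does not say what happens when a box runs out of beads.

```python
import random


class MatchboxPlayer:
    """Hypothetical sketch of the matchbox-and-bead learner."""

    def __init__(self, rng=None):
        self.boxes = {}      # position -> list of beads (duplicates allowed)
        self.history = []    # (position, bead) pairs drawn in the current game
        self.rng = rng or random.Random()

    def choose(self, position, moves):
        """Shake the box for this position and return the bead that falls out."""
        # A box is created with one bead per possible move the first time
        # its position comes up.
        box = self.boxes.setdefault(position, list(moves))
        if not box:
            return None      # empty box: treated as resigning (an assumption)
        bead = self.rng.choice(box)
        self.history.append((position, bead))
        return bead

    def finish(self, won, reward=False):
        """Apply punishment or reward to the last bead drawn this game."""
        if self.history:
            position, bead = self.history[-1]
            if not won:
                self.boxes[position].remove(bead)    # punishment: drop the losing bead
            elif reward:
                self.boxes[position].append(bead)    # reward: add a same-colored bead
            # A win without reward leaves the box unchanged, like putting
            # the bead back.
        self.history.clear()
```

For example, calling choose("first box", ["green", "blue", "purple"]) and then finish(won=False) leaves two beads in that box, so the losing move can no longer be drawn from it. With reward=True, a winning bead is duplicated instead, which raises the chance of drawing it again without removing any other option.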
Info
Views: 1,532,581
Rating: 4.9040418 out of 5
Keywords: vsauce, vsauce2, vsause, vsause2, vsauce 2, mind blow, missing dollar riddle, what is a paradox, vsauce2 paradox, math games, math games for kids, hexapawn, chess game play, artificial intelligence, history of computers, matchbox computer, genetic algorithm, genetic algorithm in artificial intelligence, the game you can never win, the game you can always win, game you win by losing, birds in a truck riddle, parrondo’s paradox, demonetization game, can being stupid make you smart
Id: sw7UAZNgGg8
Length: 12min 1sec (721 seconds)
Published: Mon Mar 18 2019