Today I want to share with you a neat way
to solve the towers of Hanoi puzzle just by counting in a different number system, and surprisingly this stuff relates to
finding a curve that fills Sierpinski's triangle. I learned about this from a former CS lecturer of mine; his name is Keith Schwarz. And I've got to say, this man is one of
the best educators that I've ever met. I actually recorded a bit of the conversation
where he showed me this stuff, so you guys can hear some of
what he described directly. It's weird, I'm not normally the sort of person
who likes little puzzles and games, but I just love looking at the analysis
of puzzles and games, and I love just looking at mathematical patterns and asking: where does that come from? In case you're unfamiliar, let's just lay down what the towers
of Hanoi puzzle actually is. So you have a collection of three pegs,
and you have these discs of descending size. You think of these discs as having a hole in the middle, so that you can fit them onto a peg. The set I have pictured here has five discs,
which I'll label 0, 1, 2, 3, 4, but in principle you could have as
many discs as you want. So they all start out stacked up from
biggest to smallest on one spindle, and the goal is to move the entire
tower from one spindle to another. The rule is you can only move
one disc at a time, and you can't move a bigger disc
on top of a smaller disc. For example, your first move must involve moving disc 0, since any other disc has stuff on top of it that needs to get out of the way before it can move. After that, you can move disc 1, but it has to go on whatever peg doesn't currently have disc 0, since otherwise you'd be putting a bigger disc on a smaller one, which isn't allowed. If you've never seen this before, I highly encourage you to pause and pull out some books of varying sizes and try it out for yourself, just kind of get a feel for what the puzzle is: if it's hard, why it's hard, if it's not, why it's not, that kind of stuff.
Now, Keith showed me something truly surprising about this puzzle, which is that you can solve it just by counting up in binary and associating the rhythm of that counting with a certain rhythm of disc movements. For anyone unfamiliar with binary, I'm going to take a moment to do a quick overview here first. Actually, even if you are familiar with binary, I want to explain it with a focus on the rhythm of counting, which you may or may not have thought about before.
Any description of binary typically starts off with an introspection about our usual way to represent numbers, what we call base-10, since we use ten separate digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. The rhythm of counting begins by walking through all ten of these digits. Then, having run out of new digits, you express the next number, ten, with two digits: 10. You say that the 1 is in the tens place, since it's meant to encapsulate the group of 10 that you've already counted up to so far, while freeing the ones place to reset to zero. The rhythm of counting repeats like this: counting up nine, rolling over to the tens place, counting up nine more, rolling over to the tens place, etc., until, after repeating that process nine times, you roll over to a hundreds place, a digit that keeps track of how many groups of 100 you've hit, freeing up the other two digits to reset to zero.
In this way, the rhythm of counting is kind of self-similar: even if you zoom out to a larger scale, the process looks like doing something, rolling over, doing that same thing, rolling over, and repeating nine times before an even larger rollover.
In binary, also known as base-2, you limit yourself to two digits, 0 and 1, commonly called 'bits', which is short for binary digits. The result is that when you're counting, you have to roll over all the time. After counting 0, 1 you've already run out of bits, so you need to roll over to a two's place, writing '10' and resisting every urge in your base-10-trained brain to read this as ten, and instead understand it to mean one group of 2 plus 0. Then, increment up to 11, which represents three, and already you have to roll over again, and since there's a one in that two's place, that has to roll over as well, giving you 100, which represents one group of four plus 0 groups of two plus 0. In the same way that digits in base-10 represent powers of 10, bits in base-2 represent different powers of 2. So instead of talking about a ten's place, a hundred's place, a thousand's place, things like that, you talk about a two's place, a four's place and an eight's place.
The rhythm of counting is now a lot faster, but that almost makes it more noticeable: Flip the last, roll over once. Flip the last, roll over twice. Flip the last, roll over once. Flip the last, roll over three times. Again, there's a certain self-similarity to this pattern: at every scale the process is to do something, roll over, then do that same thing again. At the small scale, say counting up to three, which is 11 in binary, this means flip the last bit, roll over to the two's, then flip the last bit. At a larger scale, like counting up to fifteen, which is 1111 in binary, the process is to let the last three bits count up to seven, roll over to the eight's place, then let the last three bits count up again.
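To make the rhythm described so far concrete, here is a minimal Python sketch that prints how many bits change at each step of counting. The helper name `bits_flipped` is just an illustrative choice, and the XOR trick is one convenient way to count the changed bits, not something taken from the video.

```python
def bits_flipped(k: int) -> int:
    """Number of bits that change when counting from k - 1 up to k."""
    # XOR marks exactly the bits that differ; when counting up by one,
    # those are always the lowest bits, so bit_length() counts them.
    return ((k - 1) ^ k).bit_length()


# "Flip the last" is 1 bit; "roll over once" is 2 bits, and so on.
for k in range(1, 9):
    print(f"{k:04b}  flipped {bits_flipped(k)} bit(s)")
```

Read down the right-hand column and you get 1, 2, 1, 3, 1, 2, 1, 4, which is the same flip, roll over once, flip, roll over twice pattern.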
Counting up to 255, which is eight successive ones, this looks like letting the last seven bits count up till they're full, rolling over to the 128's place, then letting the last seven bits count up again.
Alright, so with that mini introduction, the surprising fact that Keith showed me is that we can use this rhythm to solve the towers of Hanoi. You start by counting from zero. Whenever you're only flipping that last bit from a 0 to a 1, move disc 0 one peg to the right. If it was already on the rightmost peg, you just loop it back to the first peg. If in your binary counting, you roll over once to the two's place, meaning you flip the last two bits, you move disc number 1. "Where do you move it?" you might ask. Well, you have no choice. You can't put it on top of disc 0 and there's only one other peg, so you move it where you're forced to move it. So after this, counting up to 11, that involves just flipping the last bit, so you move disc 0 again. Then, when your binary counting rolls over twice to the four's place, move disc number 2, and the pattern continues like this: flip the last, move disc 0, flip the last two, move disc 1, flip the last, move disc 0. And here we're gonna have to roll over three times to the eight's place, and that corresponds to moving disc number 3.
There's something magical about it, like when I first saw this, like, this can't work. I don't know how this works, I don't know why this works. Now I know, but it's just magical when you see it, and I remember putting together an animation for this when I was teaching this, and just like... you know, I know how this works, I know all the things in it, it's still fun to just sit and just like, you know...
-Watch it play out? -Oh yeah.
I mean, it's not even clear at first that this is always going to give legal moves. For example, how do you know that every time you're rolling over to the eight's place, disc 3 is necessarily going to be freed up to move?
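One way to get some confidence that the moves are always legal, at least for small towers, is to simulate the counting procedure and check every move along the way. The following is a sketch under a few assumptions of mine: pegs are numbered 0, 1, 2 from left to right, each peg is a Python list with its top disc last, and `solve_by_counting` is a made-up name. It checks cases; it is not a proof.

```python
def solve_by_counting(n: int):
    """Play out the binary-counting procedure for n discs.

    Returns (moves, pegs): moves is a list of (disc, from_peg, to_peg),
    pegs is the final state. Raises if a required disc is not free.
    """
    pegs = [list(range(n - 1, -1, -1)), [], []]  # peg 0: biggest at bottom
    moves = []
    for k in range(1, 2 ** n):
        # Number of rollovers when counting from k - 1 to k = disc to move.
        disc = ((k - 1) ^ k).bit_length() - 1
        # The disc must be on top of some peg; next() raises otherwise.
        src = next(i for i, peg in enumerate(pegs) if peg and peg[-1] == disc)
        if disc == 0:
            dst = (src + 1) % 3  # one peg to the right, wrapping around
        else:
            # Forced move: the only other peg that can legally take it.
            dst = next(i for i in range(3)
                       if i != src and (not pegs[i] or pegs[i][-1] > disc))
        assert not pegs[dst] or pegs[dst][-1] > disc, "illegal move"
        pegs[dst].append(pegs[src].pop())
        moves.append((disc, src, dst))
    return moves, pegs


moves, pegs = solve_by_counting(4)
print(len(moves), pegs)
```

If a disc were ever buried when its turn came, or had nowhere legal to go, one of the `next(...)` calls would raise instead of returning.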
At the same time, the solution just immediately raises these questions like: where does this come from, why does this work, and is there a better way of doing this than having to do 2^n - 1 steps? It turns out not only does this solve towers of Hanoi, but it does it in the most efficient way possible. Understanding why this works and how it works and what the heck is going on comes down to a certain perspective on the puzzle, what CS folk might call a recursive perspective.
Disc 3 is thinking, okay, 2, 1 and 0, you have to get off of me, I can't really function under this much weight and pressure. And so just from disc 3's perspective, if you want to figure out how disc 3 is going to get over here, somehow, I don't care how, discs 2, 1 and 0 have to get to spindle B. That's the only way it can move. If any of these are on top of 3, it can't move; if any of these are at spindle C, it can't move there. So somehow we have to get 2, 1 and 0 off. Having done that, then we can move disc 3 over there. And then disc 3 says, I'm set, you never need to move me again, everyone else just figure out how to get here. And in a sense you now have a smaller version of the same problem: now you've got discs 0, 1 and 2 sitting on spindle B, we gotta get them to C.
So the idea is that if I just focus on one disc and I think about what I'm going to have to do to get this disc to work, I can turn my bigger problem into something slightly smaller. And then how do I solve that? Well, it's exactly the same thing: disc 2 is going to say, disc 1 and disc 0, you need to, you know, it's not you, it's me, I just need some space, get off. They need to move somewhere, then disc 2 can move to where it needs to go, then disc 1 and 0 can do this. But the interesting point is that every single disc pretty much has the exact same strategy: they all say, everybody above me, get off, then I'm going to move, ok everyone come back on.
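The strategy each disc follows translates almost word for word into a short recursive function. This is a minimal sketch rather than the code Keith had in mind: the function name `hanoi` and the default spindle labels are assumptions (the conversation only names spindles B and C, so "A" for the starting spindle is my guess).

```python
def hanoi(n: int, src: str = "A", dst: str = "C", spare: str = "B"):
    """Moves (disc, from, to) that carry discs 0..n-1 from src to dst."""
    if n == 0:
        return []  # nothing to move
    return (
        hanoi(n - 1, src, spare, dst)    # everybody above me, get off
        + [(n - 1, src, dst)]            # then I'm going to move
        + hanoi(n - 1, spare, dst, src)  # ok, everyone come back on
    )


for disc, a, b in hanoi(3):
    print(f"move disc {disc} from {a} to {b}")
```

Each call handles exactly one disc and hands everything above it to a smaller copy of the same problem.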
When you know that insight you can code up something that will solve towers of Hanoi in I think five or six lines of code, which probably has the highest ratio of intellectual investment to lines of code ever. And if you think about it for a bit, it becomes clear that this has to be the most efficient solution. At every step you're only doing what is forced upon you. You have to get discs 0 through 2 off before you can move disc 3, and you have to move disc 3, and then you have to move discs 0 through 2 back onto it. There's just not any room for inefficiency from this perspective.
So why does counting in binary capture this algorithm? Well, what's going on here is that this pattern of solving a subproblem, moving a big disc, then solving a subproblem again, is perfectly paralleled by the pattern of binary counting: count up some amount, roll over, count up to that same amount again. And this towers of Hanoi algorithm and binary counting are both self-similar processes, in the sense that if you zoom out and count up to a larger power of 2, or solve towers of Hanoi with more discs, they both still have that same structure: subproblem, do a thing, subproblem. For example, at a pretty small scale, solving towers of Hanoi for two discs, move disc 0, move disc 1, move disc 0, is reflected by counting up to three in binary: flip the last bit, roll over once, flip the last bit. At a slightly larger scale, solving towers of Hanoi for three discs looks like: doing whatever it takes to solve two discs, move disc number 2, then do whatever it takes to solve two discs again. Analogously, counting up to 111 in binary involves counting up to three, rolling over all three bits, then counting up three more. At all scales, both processes have this same breakdown.
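The parallel can be checked directly for small cases. In the sketch below, `disc_order` and `rollovers` are illustrative names of my own: the first lists which disc moves at each step of the recursive solution, the second lists how many times binary counting rolls over at each step. If the two processes really share the same breakdown, the lists should agree.

```python
def disc_order(n: int) -> list[int]:
    """Disc moved at each step of the recursive solution for n discs."""
    if n == 0:
        return []
    smaller = disc_order(n - 1)
    return smaller + [n - 1] + smaller  # subproblem, big disc, subproblem


def rollovers(n: int) -> list[int]:
    """Rollovers at each step when counting from 1 to 2**n - 1 in binary."""
    return [((k - 1) ^ k).bit_length() - 1 for k in range(1, 2 ** n)]


for n in range(1, 11):
    assert disc_order(n) == rollovers(n)
    assert len(disc_order(n)) == 2 ** n - 1  # total number of moves

print(disc_order(3))
```

This only compares which disc moves at each step; it says nothing about which peg the disc lands on.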
So in a sense, the reason that this binary solution works, or at least an explanation, I feel like there's no one explanation, but I think the most natural one is that the pattern you would use to generate these binary numbers has exactly the same structure as the pattern you would use for towers of Hanoi, which is why if you look at the bits flipping, you're effectively reversing this process. You're saying what process generated these, like if I were trying to understand how these bits were flipped to give me this thing, you're effectively reverse engineering the recursive algorithm for towers of Hanoi, which is why it works out.
That's pretty cool, right? But it actually gets cooler. I haven't even gotten to how this relates to Sierpinski's triangle, and that's exactly what I'm going to do in the following video, part 2.
Many thanks to everybody who is supporting these videos on Patreon. I just finished the first chapter of Essence of Calculus, and I'm working on the second one right now, and Patreon supporters are getting early access to these videos before I publish the full series in a few months. This video and the next one are also supported by Desmos.
Just a quick reminder to everyone: if you love these videos as much as I do, you can support /u/3blue1brown on patreon here.
Another awesome video!
Part 2
This reminded me of the relation between solving the Towers of Hanoi and Hamiltonian paths that I read in /u/standupmaths's book.
Keith Schwarz (the guy he mentions in the video) was the teacher whose class convinced me to become a math major. I've honestly never seen anybody so excited to share math with the world.
Just found out he makes all his animations in python and built his own animation engine in python... Impressive. https://github.com/3b1b
Chapter 1 of Concrete Mathematics solves the Tower of Hanoi and Josephus problems by introducing some techniques for manipulating recurrence relations. There are some generalizations for the Tower of Hanoi especially in the chapter exercises. It's a fun read requiring no extensive background. Can't watch this video right now but I assume this is relevant... Somehow.
There's more ways to link binary numbers to Sierpinski's triangle. One simple algorithm goes like this: if you take integer coordinates of a 2D plane and do a binary AND function, if the result is 0, then the point is a part of the Sierpinski triangle. Here's a simple JS code example.
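The linked JS example isn't reproduced here, but the rule described is short enough to sketch. Here is a rough Python version, assuming non-negative integer coordinates and a text grid where `#` marks the points kept; the function name `sierpinski_rows` and the grid size are arbitrary choices of mine.

```python
def sierpinski_rows(size: int = 16) -> list[str]:
    """One text row per y; '#' marks points (x, y) where x AND y is 0."""
    return [
        "".join("#" if x & y == 0 else " " for x in range(size))
        for y in range(size)
    ]


print("\n".join(sierpinski_rows()))
```

With a power-of-two size, the printed grid shows the triangle with its right angle in the top-left corner.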
Are there more examples like this?
I love his videos, I recently finished my mechanical engineering degree and love finding these maths videos. If anyone knows of a website or books I can find to continue this math journey I am all ears
Does group theory have anything to say about this? I sense that there is a set of operations in both cases, and I sense that there is a sort of isomorphism between the operations.