The algorithmic trick that solves Rubik’s Cubes and breaks ciphers

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
how fast can you solve the rubik's cube felix zendex here managed to solve this one in just 4.73 seconds setting the world record at that time i can't compete with that but maybe my computer could if we redefine fast a little bit instead of time let's measure the number of moves it takes to solve a cube and let's find an algorithm that will help us solve any cube the fastest thereby in a way beating felix index you'll see that the algorithm will lead us to a surprisingly general technique that can also be used for completely different things such as breaking ciphers so stick with us so our problem is as follows imagine i give you a rubik's cube and ask you to solve it in the lowest number of moves possible okay first we actually need to define what we mean by move for simplicity let's say we never rotate the cube as a whole so the middle squares always stay in the same place then a move only rotates one side of the cube the top bottom left right front or back each of the sides can be rotated to three new positions so that gives us 18 possible moves in total from the scramble that felix was given you can solve the cube in 18 moves of course felix's solution was longer and used 43 moves that's actually quite wasteful in fact we know every cube can be solved in at most 20 moves it took people until 2010 to prove this more than 30 years after the invention of the rubik's cube the proof was done using a computer program you can find out more on this webpage with the bombastic headline of god's number is 20. all right so now let's try to design an algorithm to find the shortest solution for a scramble here's a useful way to view this problem the possible configurations of the cube form a vast network with two nodes being connected if there is a move from one to the other in computer science we call such networks graphs and their connections edges graphs tend to pop up often since they can represent very diverse concepts such as road networks friendships in a social graph or in our case an abstract network of configurations here you're only seeing a tiny part of the neighborhood of the solved cube remember every cube has 18 neighbors in the full graph okay that's nice and all but how do we connect all this to the problem we were originally trying to solve well solving a specific configuration just amounts to finding the shortest path in the graph from the scrambled configuration to the solved one and how do we find this shortest path the first solution that comes to mind is to start from the scrambled configuration and run the so breath first search algorithm simply put this algorithm explores the configurations ordered by their distance from the scrambled cube it keeps blindly searching like this until it encounters the solved cube problem solved then right well there is a small issue a quick wikipedia search reveals that the number of possible rubik's cube configurations is about 43 quintillion that's about 10 to the 20. so in the worst case our algorithm will have to explore about 10 to the 20 configurations until it finds a solution since a regular computer can explore about 10 to the six configurations per second this would take millions of years assume we're not patient enough to wait a million years and we'd like the solution in a few days at most in that time we could explore about 10 to the 10 cube configurations so sadly the simple solution doesn't cut it let's take a step back and think about what else we could do feel free to pause the video as well in our graph we have no clear sense of direction it's hard to tell which move will get us closer to the solution because of that we have to rely on breadth first search to blindly explore the neighborhood of our cube let's have a closer look at what actually happens when we do that starting from the solved cube breadth first search first explores these 18 configurations that we can reach with one move okay now let's see where two moves can get us ooh now there's already 243 configurations also notice that some of them can be reached in more than one way so what happens after three moves we'd love to show you but there's already more than three thousand configurations that's way too many to fit on the screen so clearly the number of configurations grows very quickly it looks like every step multiplies it by about 10 a bit more in the beginning and a bit less at the end but it makes sense to assume that the number of configurations reachable in n-steps is about 10 to the n that's actually a very special property not all graphs have this exponentially fast growth let's illustrate that by comparing with a road network graph consider this network of crossroads in manhattan here the number of explored nodes only grows quadratically after n steps we explore exactly 2n squared plus 2n plus 1 nodes which is roughly the same as 2n squared in 5 steps that makes 61 nodes whereas for the rubik's cube it's hundreds of thousands in this regard the rubik's cube graph is very similar to friendship graphs by that i mean graphs where the nodes are people and two nodes are connected if the corresponding people are friends example at the beginning of the video we showed you a clip from a speedcubing competition there may have been about 50 people there and the friendship graph of this competition could have looked something like this now recall the excited gentleman from the clip well it turns out if you start from him and only take three steps you visit all 50 people in the network in more human terms if you take anybody at the competition like felix here the excited guy knows him through at most two intermediate friends there's a famous generalization of this phenomenon called six degrees of separation it says that you can reach almost any person on earth through just six intermediate friends obviously you can't really check that but researchers have verified that on facebook you can reach pretty much anybody through just four intermediate friends if you think about it it actually makes sense since the average person has around 200 facebook friends the number of people reached will be multiplied by a hundred ish in every step okay that was a long detour let's get back to actually solving the rubik's cube if you remember we found that the problem was that we need first search to go to distance 20 but it would take a million years to do that if we want the program to finish in our lifetime we can only afford to go to distance 10. that only gets us halfway to the solve cube and the other half is much more computationally expensive we could also try searching from the solved cube but since the situation is symmetrical we run into the same issue well then is all hope lost not quite although going from either side doesn't cut it we can go from both that's because these two balls must intersect that means there's a configuration in the middle that we know how to reach from both sides if that were not the case the two cubes would have to be more than 20 moves apart and we know that's impossible great so let's turn that insight into an algorithm instead of searching from one side and trying to reach the other we'll search from both sides at once and stop once we meet in the middle when that happens there is a configuration that we know how to reach both from the solved cube and the scrambled one so we can simply combine these two partial paths to get a full solution it also must be the shortest one we'll leave the proof of that up to you this trick is called meet in the middle for obvious reasons and it's a lot better than what we had before since we have to explore up to distance 10 from either side we see about 2 times 10 to the 10 configurations whereas the original search needed to explore up to 10 to the 20. now we could run this algorithm for other graphs not just the rubik's cube graph but for it to work it's really crucial that the number of explored nodes grows exponentially in the number of steps that ensures that if we walk half the distance we only see a tiny fraction of the nodes if we were to run meat in the middle in the manhattan graph it wouldn't help us at all because it might happen that we end up exploring all the nodes anyways so far we've only talked about how fast the algorithm runs but there is another important aspect if we want to actually code this algorithm memory storing the necessary 10 billion states takes about 80 gigabytes so to actually run the algorithm we had to use a computer from our university's cluster that has this kind of memory it found the solution in about four hours which is pretty quick check out our code in the description it has a few more optimizations that we didn't talk about here there's also a bit about state-of-the-art algorithms for solving the rubik's cube they exploit some more specific properties of the cube graph which allows them to be even faster as i've mentioned the meet and the middle trick can also be used to solve other problems let me quickly mention another application that doesn't involve graphs of any kind in the 1970s people came up with a cipher called des which stands for data encryption standard it was the standard cipher that everyone was using at that time this cipher is a so-called symmetric cipher and these generally work as follows first you come up with a secret key that's just a binary string of 56 bits the cipher itself consists of two functions there's the encryption function which takes your super secret plain text and the encryption key and it produces a cipher text which to the naked eye just looks like a jumbled sequence of characters and the decryption function reverts this process so it takes the ciphertext and the same secret key and it gives you back the original plain text importantly this only holds when you use the same key for encryption and decryption otherwise what you get back from decryption is just complete garbage today the death cipher is no longer considered secure to see why let's join the dark side and try to break it imagine the scenario of the so-called known plane text attack you the attacker have somehow obtained both an original plain text and its encrypted counterpart now you want to figure out what the secret key is once you have the key you can use it to decrypt all other messages encrypted with the same key this uses a 56-bit key and that's not a lot these days you can just try all 2 to the 56 keys that's 10 to the 17 which is quite a lot but it's doable if you're the military for each key you just check whether encrypting the plaintext with that key gives you back the corresponding ciphertext if it does you have your secret key to defend against this people came up with a new cipher called triple s it's simply the old test but applied three times in a row with three different keys which gives a key of combined length of 168 bits okay so we have this and then there's triple this but whatever happened to double this i mean we could just apply the cipher two times first to get an intermediate encrypted message and then encrypting it once again with a different key that would give us a key of combined length of 112 bits that's 10 to the 34 possible keys and that's way too much for the same brute force approach to work but surprisingly double this is not at all safer than the ordinary this again let's say you know a plain text and its corresponding ciphertext first bruteforce all 2 to the 56 keys and save all possible intermediate encrypted texts to do this you will need thousands of terabytes but remember you are the military then as with the rubik's cube start working from the other side use the decryption function on the ciphertext with all possible keys whenever you decrypt the message try to find it in the huge database that you computed in the first step and again iterate over all two to the 56 possible keys until you find the match when that happens you found the two sub keys used in the cipher so that's why double this is not a thing you could try applying this to triple this as well but the number of sub keys is just way too large so i hope you can see the common pattern here in the case of the cube graph the naive algorithm would have to explore about 10 to the 20 configurations in the worst case and we managed to reduce that to about 10 to the 10 at the cost of roughly the same number of bytes of memory and for double this instead of brute forcing all 2 to the 112 keys we found a clever solution that reduced this number to only 2 to the 56 again using about the same amount of memory you might see the common pattern here the meat in the middle trick is useful when you want to search a large state space with let's say n states if everything goes well you can reduce the time needed from n to the much better square root of n but since there is no free lunch we have to pay for this speed up and we pay for it by memory the memory consumed is now also proportional to square root of n whereas for the naive algorithms we would only need a very small amount of memory and there you have it that was the powerful and general meat in the middle trick that can be used both for solving the rubik's cube and for breaking ciphers if you know other cool applications of this trick let us know in the comments if you don't also let us know in the comments and of course remember to like the video and subscribe for more algorithmic goodies till next time
Info
Channel: polylog
Views: 2,194,997
Rating: undefined out of 5
Keywords:
Id: wL3uWO-KLUE
Channel Id: undefined
Length: 14min 17sec (857 seconds)
Published: Sat Feb 05 2022
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.