Hamming codes and error correction

Reddit Comments

LETS GOOO 3BLUE1BROWN

👍︎︎ 30 👤︎︎ u/LightIsLogical 📅︎︎ Sep 05 2020 🗫︎ replies

It's also neat seeing the Walsh-Hadamard functions fall out of the parity checks (at least in GF(2))

👍︎︎ 8 👤︎︎ u/[deleted] 📅︎︎ Sep 04 2020 🗫︎ replies

6 or 14

👍︎︎ 2 👤︎︎ u/kenny122371 📅︎︎ Sep 04 2020 🗫︎ replies

This video was great!

👍︎︎ 2 👤︎︎ u/MrBeltero 📅︎︎ Sep 05 2020 🗫︎ replies

That was the shortest 20 minutes of my life

👍︎︎ 2 👤︎︎ u/Tillerino 📅︎︎ Sep 05 2020 🗫︎ replies

Strong post title

👍︎︎ 1 👤︎︎ u/[deleted] 📅︎︎ Sep 05 2020 🗫︎ replies

0101 1001 0110 1111

Yo

👍︎︎ 1 👤︎︎ u/CuriousConstant 📅︎︎ Sep 06 2020 🗫︎ replies
Captions
Have you ever wondered how it's possible to scratch a CD or DVD and still have it play back whatever it's storing? The scratch really does affect the 1's and 0's on the disc, so it reads off different data from what was stored, but unless it's really scratched up, the bits it reads are decoded into precisely the same file that was encoded onto it, a bit-for-bit copy, despite all those errors. There's a whole pile of mathematical cleverness that allows us to store data, and just as importantly to transmit it, in a way that's resilient to errors.

Well, actually, it doesn't take too much cleverness to come up with a way to do this. Any file, whether it's a video, or sound, or text, code, an image, whatever, is ultimately stored as a sequence of 1's and 0's, and a simple strategy to correct any bit that gets flipped would be to store three copies of each bit. Then the machine reading the file could compare these three copies and always take the best two out of three whenever there's a discrepancy. But that means using two thirds of your space for redundancy. And even then, for all that space given up, there's no strong guarantee about what happens if more than 1 bit gets flipped.

The much more interesting question is how to make it so that errors can be corrected while giving up as little space as possible. For example, using the method you'll learn about in this video, you could store your data in 256-bit blocks, where each block uses nine bits (nine!) to act as a kind of redundancy, and the other 247 bits are free to carry whatever meaningful message or data you want. And it will still be the case that if a bit gets flipped, just by looking at this block and nothing more, a machine will be able to identify that there was an error and precisely where it was, so that it knows how to correct it. Honestly, that feels like magic. And for this particular scheme, if two bits get flipped, the machine will at least be able to detect that there were two errors, though it won't know how to fix them. We'll talk later about how this scales for blocks of a different size.

Methods that let you correct errors like this are known, reasonably enough, as "error correction codes." For the better part of the last century, this field has been a really rich source of surprisingly deep math that gets incorporated into devices we all use every day. The goal here is to give you a very thorough understanding of one of the earliest examples, known as a Hamming code. And by the way, the way I'm thinking about the structure of this video is less about explaining it as directly as possible, and more a matter of prompting you to invent it yourself, with a little gentle guidance here and there. So when you feel like you see where it's going at some point, take that moment to pause and actively predict what the scheme is going to be before I tell you.

Also, if you want that understanding to get down to the hardware level, Ben Eater has made a video in conjunction with this one showing how to actually implement Hamming codes on breadboards, which is extremely satisfying. Now, you should know, Hamming codes are not as widely used as more modern codes, like Reed-Solomon codes. But there is a certain magic to the contrast between just how impossible the task feels at the start and how utterly reasonable it is once you learn about Hamming.
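To make that baseline concrete before moving on, here is a minimal sketch of the naive three-copies scheme in Python (the function names are just for illustration):

```python
# The naive "three copies" scheme: every bit is stored three times,
# and the reader takes the best two out of three for each group.

def encode_repetition(bits):
    """Triple every bit: [1, 0] -> [1, 1, 1, 0, 0, 0]."""
    return [b for bit in bits for b in (bit, bit, bit)]

def decode_repetition(stored):
    """Majority-vote each group of three copies."""
    return [int(sum(stored[i:i + 3]) >= 2) for i in range(0, len(stored), 3)]

message = [1, 0, 1, 1]
stored = encode_repetition(message)
stored[4] ^= 1                                 # flip one stored bit
assert decode_repetition(stored) == message    # the vote corrects it
```

Two thirds of the stored bits are pure redundancy, and any two flips landing in the same group of three still defeat the scheme.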
The basic principle of error correction is that in a vast space of all possible messages, only some subset are going to be considered valid messages. As an analogy, think of correctly spelled words vs incorrectly spelled words. Whenever a valid message gets altered, the receiver is responsible for correcting what they see back to the nearest valid neighbor, as you might do with a typo. Coming up with a concrete algorithm to efficiently categorize messages like this, though, takes a certain cleverness.

The story begins in the 1940s, when a young Richard Hamming was working for Bell Labs. Some of his work involved using a very big, very expensive punch-card computer that he had only limited access to, and the programs he kept putting through it kept failing, because every now and then a bit would get misread. Frustration being the crucible for invention, he got so fed up that he invented error correction codes.

There are many different ways to frame Hamming codes, but as a first pass, we're going to go through them the way that Hamming himself thought about them. Let's use an example that's simple, but not too simple: a block of 16 bits. We'll number the positions of these bits from 0 up to 15. The actual data we want to store is only going to make up 12 of these bits, while 4 of the positions will be reserved as a kind of redundancy. The word "redundant" here doesn't simply mean copy; after all, those four bits don't give us enough room to blindly copy the data. Instead, they'll need to be a much more nuanced and clever kind of redundancy, not adding any new information, but adding resilience.

You might expect these four special bits to come nicely packaged together, maybe at the end or something like that, but as you'll see, having them sit in positions which are powers of 2 allows for something that's really elegant by the end. It also might give you a little hint about how this scales for larger blocks. Also, technically it ends up being only 11 bits of data; you'll find there's a mild nuance for what goes on with position 0, but don't worry about that for now.

Like any error correction algorithm, this will involve two players: a sender, who is responsible for setting these four special bits, and a receiver, who is responsible for performing some checks and correcting any errors. Of course, the words "sender" and "receiver" really refer to machines or software that's doing all these checks, and the idea of a message is meant really broadly, to include things like storage. After all, storing data is the same thing as sending a message, just from the past to the future instead of from one place to another.

So that's the setup, but before we can dive in, we need to talk about a related idea which was fresh on Hamming's mind at the time of his discovery, a method which lets you detect any single-bit error, but not correct it, known in the business as a "parity check." For a parity check, we separate out only a single bit that the sender is responsible for tuning, and the rest are free to carry a message. The only job of this special bit is to make sure the total number of 1's in the message is an even number. For example, right now that total number of 1's is 7. That's odd, so the sender needs to flip that special bit to a 1, making the count even. But if the block had already started off with an even number of 1's, then this special bit would have been kept at a 0. This is pretty simple, deceptively simple, but it's an incredibly elegant way to distill the idea of change anywhere in a message down to a single bit of information.
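As a sketch of the sender's side of a parity check (the function name is hypothetical):

```python
# One special bit is tuned so the total number of 1's comes out even.

def set_parity_bit(message_bits):
    """Prepend a parity bit chosen so the block has an even count of 1's."""
    parity = sum(message_bits) % 2     # 1 exactly when the count of 1's is odd
    return [parity] + message_bits

block = set_parity_bit([1, 0, 1, 1, 0, 1, 1, 1, 1])   # seven 1's: an odd count
assert block[0] == 1                  # so the parity bit is flipped to 1
assert sum(block) % 2 == 0            # and the full block now has even parity
```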
Notice, if any bit of this message gets flipped, either from 0 to 1 or from 1 to 0, it changes the total count of 1's from even to odd. So if you're the receiver, and you look at this message and see an odd number of 1's, you can know for sure that some error occurred, even though you might have no idea where it was.

In the jargon, whether a group of bits has an even or odd number of 1's is known as its "parity." You could also use numbers and say that the parity is 0 or 1, which is typically more helpful once you start doing math with the idea. And this special bit that the sender uses to control the parity is called a "parity bit." Actually, we should be clear: if the receiver sees an odd parity, it doesn't necessarily mean there was just 1 error. There might have been 3 errors, or 5, or any other odd number, but they can know for sure that it wasn't 0. On the other hand, if there had been 2 errors, or any even number of errors, the final count of 1's would still be even, so the receiver can't have full confidence that an even count necessarily means the message is error-free.

You might complain that a method which gets messed up by only two bit flips is pretty weak, and you would be absolutely right! Keep in mind, though, there is no method for error detection, or correction, that could give you 100% confidence that the message you receive is the one the sender intended. After all, enough random noise can always change one valid message into another valid message just by pure chance. Instead, the goal is to come up with a scheme that's robust up to a certain maximum number of errors, or maybe one that reduces the probability of a false positive like this.

Parity checks on their own are weak, but by distilling the idea of change across a full message down to a single bit, they give us a powerful building block for more sophisticated schemes. For example, as Hamming was searching for a way to identify where an error happened, not just that it happened, his key insight was that if you apply some parity checks, not to the full message, but to certain carefully selected subsets, you can ask a more refined series of questions that pin down the location of any single-bit error. The overall feeling is a bit like playing a game of 20 questions, asking yes-or-no queries that chop the space of possibilities in half.

For example, let's say we do a parity check on just these 8 bits, all the odd-numbered positions. Then if an error is detected, it gives the receiver a little more information about where specifically the error is: namely, that it's in an odd position. If no error is detected among those 8 bits, it either means there's no error at all, or it sits somewhere in the even positions. You might think that limiting a parity check to half the bits makes it less effective, but when it's done in conjunction with other well-chosen parity checks, it counterintuitively gives us something more powerful.
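As a sketch of the receiver's side, the whole-block check and this first refined question might look like the following:

```python
# The same parity idea, applied once to the whole block and once to
# only the odd-numbered positions.

def parity(bits):
    """0 if the group holds an even number of 1's, 1 if odd."""
    return sum(bits) % 2

received = [0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1]
received[5] ^= 1                            # simulate one flip in transit

if parity(received) == 1:
    print("some error occurred somewhere")       # detection only
if parity(received[1::2]) == 1:                  # positions 1, 3, 5, ...
    print("...and it sits in an odd position")   # assuming at most one flip
```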
To actually set up that first parity check, remember, it requires earmarking some special bit that has control over the parity of the full group; here, let's just choose position number 1. For the example shown, the parity of these 8 bits is currently odd, so the sender is responsible for toggling that parity bit, and now it's even. This is only one out of four parity checks we'll do. The second check is going to be among the 8 bits on the right half of the grid, at least as we've drawn it here, and this time we might use position number 2 as a parity bit. These 8 bits already have an even parity, so the sender can feel good leaving bit number 2 unchanged. Then, on the other end, if the receiver checks the parity of this group and finds that it's odd, they'll know that the error is somewhere among these 8 bits on the right. Otherwise, it means either there's no error, or the error is somewhere on the left half. Or, I guess, there could have been two errors, but for right now we're going to assume that there's at most one error in the entire block; things break down completely for more than that.

Before looking at the next two checks, take a moment to think about what these first two allow us to do when you consider them together. Say you detect an error among the odd columns and among the right half: it necessarily means the error is somewhere in the last column. If there was no error among the odd columns, but there was one in the right half, that tells you it's in the second-to-last column. Likewise, if there is an error among the odd columns but not the right half, you know it's somewhere in the second column. And if neither of those two parity checks detects anything, the only place an error could be is the leftmost column, though it also might simply mean there's no error at all. Which is all a rather belabored way to say that two parity checks let us pin down the column.

From here, you can probably guess what follows. We do basically the same thing, but for the rows. There's going to be a parity check on the odd rows, using position 4 as a parity bit. In this example, that group already has even parity, so bit 4 would be set to 0. And finally, there's a parity check on the bottom two rows, using position 8 as a parity bit. In this case, it looks like the sender needs to turn bit 8 on in order to give that group an even parity. Just as the first two checks let us pin down the column, these next two let you pin down the row.

As an example, imagine that during transmission there's an error at, say, position 3. This affects the first parity group, and it also affects the second parity group, so the receiver knows that there's an error somewhere in the rightmost column. But it doesn't affect the third group, and it doesn't affect the fourth group, and that lets the receiver pinpoint the error up to the first row, which necessarily means position 3, so they can fix the error. You might enjoy taking a moment to convince yourself that the answers to these four questions really will always let you pin down a specific location, no matter where the error turns out to be. In fact, the astute among you might even notice a connection between these questions and binary counting. And if you do, again, let me emphasize: pause and try for yourself to draw the connection before I spoil it.

If you're wondering what happens if a parity bit itself gets affected, well, you can just try it. Take a moment to think about how any error among these four special bits is going to be tracked down just like any other, with the same game of 4 questions. It doesn't really matter, since at the end of the day what we want is to protect the message bits; the error correction bits are just riding along. But protecting those bits as well is something that naturally falls out of this scheme as a byproduct.

You might also enjoy anticipating how this scales. If we used a block of size 256 bits, for example, then in order to pin down a location, you need only 8 yes-or-no questions to binary search your way down to some specific spot.
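Making that binary-counting connection explicit (so treat this as a spoiler), here is a sketch of the four questions for the 16-bit block. Question i covers exactly the positions whose index has a 1 in binary digit i, so the four answers, read as a binary number, spell out the error's position:

```python
def parity(bits):
    return sum(bits) % 2

def locate_error(block):
    """Binary position of a single flipped bit (0 means no error, or bit 0)."""
    position = 0
    for i in range(4):                                  # the four questions
        group = [b for idx, b in enumerate(block) if (idx >> i) & 1]
        position |= parity(group) << i                  # failed check: digit i is 1
    return position

block = [0] * 16          # a trivially valid block: every parity is even
block[3] ^= 1             # the error at position 3 from the example above
assert locate_error(block) == 3    # checks 1 and 2 fail, 3 and 4 pass: 0b0011
```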
And remember, each question requires giving up only a single bit to set the appropriate parity check. Some of you may already see it, but we'll cover the systematic way to find what these questions are in just a minute or two. Hopefully, this sketch is enough to appreciate the efficiency of what we're developing here. Everything except for those 8 highlighted parity bits can be whatever you want it to be, carrying whatever message or data you want. The 8 bits are "redundant" in the sense that they're completely determined by the rest of the message, but in a much smarter way than simply copying the message as a whole. And still, for so little given up, you would be able to identify and fix any single-bit error. Well... almost.

Okay, so, the one problem here is that if none of the four parity checks detects an error, meaning that the specially selected subsets of eight bits all have even parities, just like the sender intended, then it either means there was no error at all, or the error sits in position 0. You see, with four yes-or-no questions, we have 16 possible outcomes for our parity checks, and at first, that feels perfect for pinpointing 1 of the 16 positions in the block. But you also need to communicate a 17th outcome, the "no error" condition. The solution here is actually pretty simple: just forget about that 0th bit entirely. So when we do our four parity checks and see that they're all even, it unambiguously means that there is no error. What that means is that rather than working with a 16-bit block, we work with a 15-bit block, where 11 of the bits are free to carry a message and 4 of them are there for redundancy. And with that, we now have what people in the business would refer to as a "(15, 11) Hamming code."

That said, it is nice to have a block size that's a clean power of 2, and there's a clever way that we can keep that 0th bit around and get it to do a little extra work for us: if we use it as a parity bit across the whole block, it lets us detect, even though we can't correct, 2-bit errors. Here's how it works. After setting those four special error-correcting bits, we set that 0th bit so that the parity of the full block is even, just like a normal parity check. Now, if there's a single-bit error, then the parity of the full block toggles to odd, but we would catch that anyway thanks to our four error-correcting checks. However, if there are two errors, then the overall parity toggles back to even, but the receiver will still see that there's been at least some error because of what's going on with the four usual parity checks. So if they notice an even parity overall, but something nonzero happening with the other checks, it tells them there were at least two errors. Isn't that clever? Even though we can't correct those two-bit errors, just by putting that one little bothersome 0th bit to work, it lets us detect them. This is pretty standard; it's known as an "extended Hamming code."

Technically speaking, you now have a full description of what a Hamming code does, at least for the example of a 16-bit block. But I think you'll find it more satisfying to check your understanding and solidify everything up to this point by doing one full example from start to finish yourself. I'll step through it with you, though, so you can check yourself.
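Before the worked example, here is one way the receiver's full decision procedure for the extended code could look, as a sketch that assumes at most two flips:

```python
def parity(bits):
    return sum(bits) % 2

def diagnose(block):
    """The four checks give a 4-bit 'syndrome'; bit 0 keeps the whole
    block's parity even, letting us tell one error apart from two."""
    syndrome = 0
    for i in range(4):
        group = [b for idx, b in enumerate(block) if (idx >> i) & 1]
        syndrome |= parity(group) << i
    if syndrome == 0 and parity(block) == 0:
        return "no error detected"
    if parity(block) == 1:           # overall parity toggled an odd number of times
        return f"single error at position {syndrome}"
    return "at least two errors: detected, but not correctable"
```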
To set up a message, whether that's a literal message you're transmitting over space or some data you want to store over time, the first step is to divide it up into 11-bit chunks. Each chunk is going to get packaged into an error-resistant 16-bit block, so let's take this one as an example and actually work it out. Go ahead, actually do it: pause and try putting together this block. ...Okay, you ready?

Remember, position 0, along with all the powers of 2, is reserved for error-correction duty, so you start by placing the message bits in all of the remaining spots, in order. You need the first group to have even parity, which it already does, so you should have set the parity bit in position 1 to 0. The next group starts off with odd parity, so you should have set its parity bit to 1. The group after that also starts with odd parity, so again you should have set its parity bit to 1. And the final group also has odd parity, meaning we set that bit in position 8 to 1. Then, as the final step, the full block now has even parity, meaning you can set bit number 0, the overarching parity bit, to 0. So as this block is sent off, the parity of the four special subsets, and of the block as a whole, will all be even, or 0.

As the second part of the exercise, let's have you play the role of the receiver. Of course, that would mean you don't already know what this message is; maybe some of you memorized it, but let's assume that you haven't. What I'm going to do is change either zero, one, or two of the bits in that block, and then ask you to figure out what it is that I did. So, again, pause and try working it out.

...Okay, so you, as the receiver, now check the first parity group, and you can see that it's even, so any error that exists would have to be in an even column. The next check gives us an odd number, telling us both that there's at least one error and narrowing us down into this specific column. The third check is even, chopping down the possibilities even further. And the last parity check is odd, telling us there's an error somewhere in the bottom half, which by now we can see must be position number 10. What's more, the parity of the whole block is odd, giving us confidence that there was one flip and not two (if it's three or more, all bets are off). After correcting that bit number 10, pulling out the 11 bits that were not used for correction gives us the relevant segment of the original message, which, if you rewind and compare, is indeed exactly what we started the example with.

And now that you know how to do all this by hand, I'd like to show you how you can carry out the core part of all of this logic with a single line of Python code. You see, what I haven't told you yet is just how elegant this algorithm really is: how simple it is to get a machine to point to the position of an error, how to systematically scale it, and how we can frame all of this as one single operation rather than multiple separate checks. To see what I mean, come join me in part 2.
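As a teaser for that single line (a sketch of the idea; part 2 gives the real reveal): XOR together the positions of every bit that holds a 1. For a well-formed block that XOR comes out to 0, and after a single flip it names the flipped position exactly.

```python
from functools import reduce

# A valid extended Hamming block, with all five parity bits already set
block = [0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1]
block[10] ^= 1                     # a single-bit error in transit

# The core check in one line: XOR the indices of every 1 in the block.
error_position = reduce(lambda a, b: a ^ b,
                        [i for i, bit in enumerate(block) if bit])
assert error_position == 10        # points straight at the flipped bit
```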
Info
Channel: 3Blue1Brown
Views: 820,065
Rating: 4.9747996 out of 5
Keywords: Mathematics, three blue one brown, 3 blue 1 brown, 3b1b, 3brown1blue, 3 brown 1 blue, three brown one blue
Id: X8jsijhllIA
Length: 20min 5sec (1205 seconds)
Published: Fri Sep 04 2020