Olympiad level counting (Generating functions)

Video Statistics and Information

Video

Captions Word Cloud

Reddit Comments

Captions

In a moment, I will ask you a puzzle, and it's a pretty hard puzzle, actually, but before I do, I want to lead with a spoiler, which is the fact that the way we're going to solve this involves the use of complex numbers. And once you hear it, you will agree that that seems absurd, given that the puzzle is going to be purely a discrete question. It only asks about whole numbers and their sums. There's not a whiff of the imaginary or even continuity anywhere on the horizon. It's certainly not the only time that complex numbers are unreasonably useful for discrete math, to borrow a phrase. The more famous example that I could bring up would be how the modern way that mathematicians understand prime numbers, you know, questions about how they're distributed, their density at certain regions, things like that, well, it involves studying specially designed functions whose inputs and outputs are complex numbers. Some of you may know that this is what the famous Riemann hypothesis is all about. Basically, there's a specially designed function, and on the face of it, it looks unrelated to the discrete world of primes. It's smooth, it's complex valued. But under the hood, it encodes all of the information that you could ever want about those discrete prime numbers. And most importantly, certain questions about primes are easier to answer by analyzing this function than they would be by directly analyzing the primes themselves. Of course, our puzzle, which I promise I'll share in just a moment, is a lot more innocent than the Riemann hypothesis. It's a toy problem. But at the end of the video, I'll share how the techniques that we use to solve it, the real reason that we're here, are actually pretty similar in spirit to the setup that leads to the Riemann hypothesis. And the prime number theorem and that whole circle of thoughts around it. Our puzzle for today comes from this book here by T2 Andrescu and Zhu Mingfeng. It's basically a collection of problems used in training the USA team for the International Math Olympiad. And if we turn to chapter 2, Advanced Problems, problem number 10 asks this seemingly innocent question. Find the number of subsets of the set 1 up to 2000, the sum of whose elements is divisible by 5. Okay, so that might take a little bit of a moment to parse. For example, something like the set 3, 1, 4, that would be a subset. All of its elements are also elements in the big set. And its sum, 3 plus 1 plus 4 is 8, so that wouldn't be considered. That's not in our count. Whereas something like the set 2, 3, 5, also a subset, has a sum of 10. That is divisible by 5, so it's one that we want to count. The preview animation that I had at the start is essentially a brute force program trying to answer this question. It will iterate through all of the different possible subsets, finding the sum of each one along the way, and it increments a counter each time that it finds a multiple of 5. And you know what, a nice warm-up question here would be to pause and think about how many total subsets are there overall? Forget this multiple of 5 stuff. How long will it take for this program to terminate? Many of you may know, the answer is 2 to the power 2,000. The basic idea there is that when you're constructing a subset, you have 2,000 different binary choices you can make. Do you include an element or do you not? And all of those choices are independent of each other, so the total number of choices you have in constructing a subset is 2 times 2 times 2 times 2 on and on 2,000 times. And thinking about our program, that is a monstrously huge number. So even if we gave this brute forcing approach all the time in the universe, with all the physical resources the universe could conceivably provide, it wouldn't even come close, it wouldn't scratch the surface. Obviously we have to be a lot cleverer than that. And if you were to just guess what the answer should be, make a rough approximation, you'd probably guess, you know, it should be around a fifth of all the total subsets. There's probably a roughly even distribution of all these sums mod 5. And yes, that is true, that's a decent approximation. But the heart of the question, the real challenge here, is to get a precise answer. This can't be the actual answer since it's not an integer, but is the true answer a little bit more or a little bit less? Or maybe it's a lot more or a lot less. What tactics could you possibly use to figure out that error? To be clear, this lesson is definitely much more about the journey than the destination. Will you ever need to filter and count subsets in this way? Almost certainly not, I wouldn't expect so. But toy problem or not, it is a legitimately challenging question, and navigating that challenge develops skills that are relevant to other sorts of challenging questions. For you and me, there are at least two very surprising and very beautiful twists and turns that the solution I'd like to share with you takes. I've already tipped my hand that complex numbers will make a surprise appearance, but before we even get to that, there is another strange turn, which is arguably even weirder and even more unexpected. To set the stage though, let's just get our bearings with the puzzle, and do what all good problem solvers should do, and start with a simpler example, maybe just trying it with the set 1, 2, 3, 4, 5. If you were solving this problem with pencil and paper, you know, you're one of these kids training for the IMO, it's not a bad idea to simply list out all 2 to the 5 subsets. It's only 32, it's not that many. There's different ways that you might want to organize all of these in your mind, but since the thing that we care about is their sum, the natural thing to do would be to go through all of them one by one and compute those sums. Over here, just doing it on YouTube, I've got a computer, so I'll cheat a little and show what all their sums are. I'll also cheat a little bit and rearrange all of these, organizing them suggestively into collections that all have the same sum. For instance, there are 3 distinct subsets that add up to 6, and they'll all sit in this little box, and the 3 subsets adding up to 10 will all live in this little box. And all in all, the ones that we care about, the subsets with a sum divisible by 5, have been put over here on the left, and it looks like there's a total of 8 of them. Oh, and by the way, I should say we are counting the empty set, we consider its sum to be 0, and we consider that to be a multiple of 5. By the end, I hope you'll agree all of those are abundantly natural choices to make. Take a moment to compare this answer to what you might expect heuristically. Out of all 32 total subsets, a fifth of that would have been 6.4, so at least in this small example, the true answer is a little bit bigger than that. That's maybe something you want to talk in the back of your mind. Okay, and this is the part of the video where, I'll be honest with you, I have no idea how to motivate it. Personally, I like it when math feels like something you could have discovered yourself, and if you and I were sitting down together solving this problem, I think there's all sorts of natural steps that you might take. Maybe you try to understand if there's some sort of structure to the subsets, or you play around with how these sums are distributed mod 5 at many different iterations for other small examples, and from that maybe you try to eke out some kind of proof by induction. When I shared an early version of this lesson with some patrons, people brought up some nice linear algebra approaches. All those are well and good, nothing wrong with those. But instead, my goal here is to teach you about something called a generating function. And it's one of those tactics where after the fact you can think, okay, yeah, I get that this works, but how on earth would you have thought of that? Honestly, I don't know. There's a time in your life before you understand generating functions, and a time after, and I can't think of anything that connects them other than a leap of faith. I'm going to ask you to consider the polynomial 1 plus x times 1 plus x squared times 1 plus x cubed times 1 plus x to the fourth times 1 plus x to the fifth. Now, I know you could rightfully ask, where does this come from? What do polynomials have to do with things? What is the variable x even supposed to represent right now? And essentially x is purely a symbol. The only reason that we've written a polynomial here is that the act of algebraically expanding it is going to completely mirror the act of constructing subsets. And, importantly, this grouping that we want, where subsets with the same sum are all bunched together, kind of happens automatically when you do this. And let me show you what I mean. When you expand out this expression, it basically comes down to making five binary choices. Which term from each parenthetical do you choose? If you choose the 1 from each of those parentheticals, that will correspond to the empty set where we don't choose any of the elements. Whereas if I choose the x to the 1 term and then ones from everything else, that will correspond to the singleton set that just contains the number 1. Similarly, if I choose the x squared term but ones from everything else, that corresponds to the set just containing 2. Just choosing the x cubed term corresponds to the set just containing the number 3. But, interestingly, notice what happens if I choose the x to the 1 term and the x squared term and then ones from everything else. This corresponds to the choice of the subset that has 1 and 2 and nothing from everything else. But in the polynomial, the way it expands looks like x cubed. So we have two different x cubed terms, each of which came from a subset whose sum was 3. And honestly, the pattern that I'm going for here is one that's probably easiest if you just take the time to pause and think through for yourself what happens when you expand everything here. Essentially, every possible subset corresponds to one of the terms in this expansion. And then the critical point is that the exponent in the term that you get from that expansion equals the sum of that corresponding subset. Kind of confusing when you say it out loud, but again if you just kind of think it through yourself I think you can see what I mean. For example, when all of the dust settles and we collect all 32 terms here, three of those terms are x to the 10th, and each of those came from a choice of elements whose sum was equal to 10. Now normally when we write a polynomial, we collect together all like terms. Instead of having three copies of x to the 10th, we would just see the coefficient 3 in front of x to the 10th. So each of these coefficients is a way of encoding the number of subsets with a particular sum. So this, like I said at the start, is an example of something called a generating function, where the idea is if you have some question with an answer associated with each positive integer, so in our case how many subsets add up to a particular value. When you construct a polynomial whose coefficients correspond to the answers to that question, you can get a surprising amount of insight from your original question by mathematically manipulating and analyzing the properties of this polynomial. There are tons and tons of examples of generating functions, but just to bring up one other one which is especially fun, you can use the same idea to study Fibonacci numbers. So all the coefficients of this polynomial will be Fibonacci numbers, and in this case it's an infinite polynomial, so I should really be calling it a power series. I won't fully explain the details here, but I will leave them up on the screen for anyone who's curious. Basic idea is that the rule that's used to define Fibonacci numbers, each one being the sum of the previous two, can be expressed as an equation in terms of this function. That equation in turn lets you write that function in an alternate form. And then, and here's most of the details I'm skipping over, if you manipulate that, you know, throw in a little partial fraction decomposition here, a little bit of geometric series power expansion there, you can get yourself an exact closed form expression for each individual Fibonacci number, which is really cool. I mentioned this really just to show the tip of the iceberg of the fact that this idea of a generating function goes way, way beyond our particular example. Now, in our particular problem, if we extend from the simple example with just 12345 to the big example with all the numbers up to 2000, our corresponding generating function involves these 2000 different binomial terms, you know, 1 plus x, 1 plus x squared, on and on, up to 1 plus x to the 2000. And the idea is that if you were to expand this, the coefficients tell us all the information we want. Now, it would be insane to actually expand it, but it is helpful to keep in the back of your mind, in principle, what that would look like. For example, in principle, if you expanded it, you would find that the coefficient in front of the x to the 25th term happens to be 142. And this corresponds to the fact that there are 142 distinct subsets that have a sum of 25. So the art of analyzing a generating function here will be to deduce facts about these coefficients without actually expanding the expression. So moving forward, I'm just going to write this expansion more abstractly, just a sum from n equals 0 up to capital N, where c sub n tells us the coefficients that we don't know. All of that starts off as a black box to us. And moving forward, we're going to start treating this as an actual function, something where we plug in x, we see what the output is, and then we ask, what does that tell us about the coefficients? For example, a very easy input would be to plug in something like x equals 0. In that case, importantly, we know how to evaluate it using the factored form above. If you plug in x equals 0 for everything, all of the terms look like 1, so the answer is 1. And in the expanded form, all of those terms involving an x will get killed, they go to 0, leaving us just with the first term, c sub 0. Now, in this case, that doesn't really tell us anything all that exciting. It essentially translates to saying there is a single empty set, but we're just getting our feet wet. As the next example, take a moment to think about evaluating f at 1. This is something we can do with the expression we know, when you plug in 1 for all of these x's, every term looks like a 2, so in total, we get 2 multiplied by itself 2,000 times. On the other hand, in the expanded expression, if you plug in x equals 1, all of these powers of x go to 1, so we're essentially adding up all of the coefficients, which is pretty cool when you think about it. Just by evaluating the function at a single number, we can deduce what the sum of all of the coefficients are. Now, again, in our particular example, it's not all that exciting, because we already know what the sum of these coefficients are. Remember, each coefficient counts how many subsets have a certain sum, and so when you add them up, we're just counting all of the subsets, which we know to be 2 to the 2,000. However, I can give you a genuinely new fact if I ask you to evaluate this function at negative 1. Take a moment to think about what that means. If you plug in negative 1, again, we start with the thing we know, the factored expression up top, and here, all you need is to look at the first term. When you plug in x, the first parenthetical goes to 0, so the whole expression has to be 0. But what does that tell you when we apply it to the expanded expression, using all of the coefficients? And in the spirit of being as suggestive as possible of the strange turns that this solution takes, I want you to really visualize the various powers of negative 1 in this expression in terms of rotations. The first term, negative 1 to the 0, is just 1, which we'll picture as a vector from 0 to 1. Then negative 1 to the first power is just negative 1 itself, which I want you to be thinking about as a 180-degree rotation away from that last term. Then when we take negative 1 squared, that's positive 1. Again, a 180-degree rotation. And in general, each successive term here looks like another rotation by 180 degrees. Algebraically, what this translates to is that we have an oscillating sum between the even coefficients and the odd coefficients, but keep the visual in the back of your mind. This expression is true for any generating function, but again, for our special generating function, we know that this value, this alternating sum, should equal 0. And a way you can interpret that is that it's telling you there's an equal balance between the even coefficients and the odd coefficients. And remember, maybe in the context of our smaller example, these coefficients are encoding for us facts about subsets. So if there's an equal balance between all those even coefficients and the odd coefficients, it's telling you that half of all the subsets have an even sum, and half of them have an odd sum. That's probably what you would expect, but it's not obvious at first how you would show that, and with the generating function, it just kind of pops right out. And again, to be suggestive of where we're going, let me rewrite this a little bit by taking the last two things we evaluated, add up those two, and then divide by one half. If you think about it, this is a way of filtering out all of the even coefficients and killing all of the odd coefficients. So it becomes an especially clean way to write the fact that the sum of all of the even coefficients, which again in the back of your mind means the total number of subsets with an even sum, will look like half of the total. This is, needless to say, tantalizingly close to the actual question we want to answer. What we would like to do is find some clever thing that we can do to the function f, some well-chosen numbers to evaluate it on, so that we get all the coefficients corresponding to multiples of 5. Again, thinking back to what these coefficients encode for us, that will be answering our final question. That will be counting the total number of subsets whose sum is divisible by 5. The trick to doing this is to generalize what we just did, where the successive powers of the input were rotating back and forth. But this time we don't want them to rotate every other time, we'd like them to somehow rotate with a period of 5. And to do that, we extend into the complex plane. You see, up there we can find a value so that as we take successive powers of it, it will rotate by a fifth of a turn, giving us a process with a frequency of 5. And if you step back, I know that it's kind of absurd that I'm asking you to think about complex numbers. I mean, we started with a counting question, it's discrete math, but hopefully it's not all that wild. And again, the reason that I'm drawing things out to tee up the various strange turns in the solution is that they're actually not all that strange in the broader scheme of math. The trick we're about to apply has a heavy resemblance to many other instances of using complex numbers to better understand discrete questions of integers. So the more it feels like something that you could have discovered yourself, the more it might actually be the case that when you're working on some future problem in this circle of thoughts, you will discover it yourself. The complex number To be specific, the complex number that I care about is one that I'm going to label zeta, and it sits a fifth of a turn around the unit circle. So its angle is 2 pi fifths radians, and its magnitude is one. This means with the standard Euler's formula notation, we would write that number explicitly as e to the power 2 pi i divided by 5. If you're not as comfortable with that notation, you could think of it as something whose real part is the cosine of 72 degrees, 72 being a fifth of a full turn, and the imaginary part is the sine of 72 degrees. But to be honest, you don't actually need to think about the explicit value. Instead, the important thing to focus on is the property that powers of this number have. For example, when you square it, because its magnitude was one, the magnitude of its square is also one, but it rotates a fifth of a turn around the unit circle, so it now sits two fifths of a turn around. Similarly, when you raise it to the third power, you end up three fifths of a turn around, raise it to the fourth power, you end up four fifths of a turn, and raise it to the fifth power, and you've gotten all the way back around to one. It's the same thing as if you had raised it to the zeroth power. We get this cycling every five terms. That's the thing that we care about. These numbers have a special name, they're called the fifth roots of unity, essentially because they solve the equation z to the fifth equals one. They are fifth roots of the number one. If you just presented someone with this equation, they would probably say the answer is clearly z equals one. But the idea is that there are four other answers in the complex plane. Four other numbers where when you raise them to the fifth, you get one, and considering them as a collective is often quite useful. Remember that equation, it'll come back for us a little bit later. So in analogy with what we did earlier, where we added together f of one and f of negative one to get this cancellation among the odd terms, what we're going to do is evaluate f at all five of these numbers, and then add them together, and hopefully we get some cancellation. That might seem kind of complicated, but let's just take a super simple example, like the case where f of x is simply equal to x. In that case, when we add up these five terms, we're just adding up the roots of unity themselves. Zeta to the zero plus zeta to the one, on and on, up to zeta to the fourth. When you add complex numbers, you can think of it like vector addition with the tip to the tail. So zeta to the zero plus zeta will look like this, and then if I add on zeta squared, bringing the tail of that vector to the tip of the last one, we get this. Then similarly, if I bring the tail of zeta cubed over to the tip of that one, and then do likewise for zeta to the fourth, you'll see how the overall sum actually loops back to be zero. Another way to think about this is that all five of these terms are evenly balanced around the number zero. Their center of mass is at the origin. Now it's helpful to think about a slightly less trivial example, if f of x was x squared. So when you square zeta to the zero, it stays zeta to the zero. This is just a fancy way of saying the number one. When you square zeta, you get zeta squared itself. So you might imagine this dot up here moving over to the zeta squared dot when we do it. Zeta squared moves to zeta to the fourth. You might imagine this dot moving over to zeta to the fourth. Zeta cubed moves to zeta to the sixth, which, because we loop around every five times, is the same thing as zeta to the one. So this dot will move up here. And finally, zeta to the fourth squares to give us zeta to the eighth, which reduces to be the same as zeta cubed, which I might draw like this. That might seem a little confusing to think about, especially with all the arrows I have drawn here, but it's worth thinking through at least once in your life, because the idea here is that when we square this, like go to all of these different terms, and I program them to double the angle that they have, the overall effect is to just shuffle those terms. We get the same numbers but written in a different order, so their sum is still going to be zero. Similarly, if you go through this exercise with x cubed, which I encourage you to do, and you follow around where are each one of these dots going to end up, you'll be able to see that when we cube these terms, when we take each one and we multiply the angle that it has by three, again we just shuffle them around. Same terms listed in a different order, unsurprisingly the same thing happens if our function was x to the fourth, but, critically, where things change is if we consider the function x to the fifth. In that case, when you raise zeta to the fifth power, by definition it goes to one. Similarly, zeta squared raised to the fifth power goes to one. All of these go to one, they are the roots of unity, this is after all their whole purpose in life. So in this case, when we apply the function and add them all up, instead of going to zero and getting cancellation, we get a kind of constructive interference. All of them equal one, so their sum is equal to five. So if you step back and think about what all those examples mean, essentially this expression is something that will go to zero for powers of x which are not divisible by five, but it goes to something non-zero for powers of x which are divisible by five. And that's exactly the kind of filter that we're looking for. If you're worried that our actual function is much more complicated than a simple power of x, essentially things play really nicely here because everything is linear. If f is some massive polynomial and we want to evaluate this big sum, you could sort of think of going column by column, where each time you really are just adding up powers of zeta, and in most cases all those powers cancel out with each other and you get zero, but when all of those powers are multiples of five, they constructively interfere and instead you get five times whatever the corresponding coefficient is. Deep in the weeds it's easy to forget why we're here in the first place, but remember each one of those coefficients tells us how many subsets add up to a certain value, and so what we want is to add up all of the coefficients that are multiples of five, and what we have right now is a way to explicitly do that. If we evaluate this function on these five different roots of unity, which I know seems kind of weird, then all we have to do is divide by five and it gives us the sum that we want. That's really cool if you ask me. We have a question that's just about subsets, it's a discrete math problem, and yet the way that we can answer it is to evaluate a crazy polynomial on some judiciously chosen complex numbers. The more math you do, the less crazy that seems, because complex numbers have this bizarre relationship with discrete math, but it really is wonderful, there's no two ways about it. However, some of you might complain, the only way that this is useful is if we can actually evaluate this wild expression on our polynomial. Remember, the form of the polynomial we know, the one we're comfortable with, is the factored form, where you have this one plus x, one plus x squared, on and on, all the way up to one plus x to the two thousand. Everything up to this point is just meaningless symbolic play, pushing around one hard problem into another, unless we can actually roll up our sleeves and do some honest calculation here. This is the final thrust in our argument, so step back, take a deep breath. It's actually not as bad as you might think, but let's start just by thinking about how you might evaluate just one of the roots of unity that we need, maybe zeta itself. So what that looks like is one plus zeta, times one plus zeta squared, times one plus zeta cubed, on and on. Except, importantly, after those first five terms, everything starts repeating, because powers of zeta repeat. The entire expression up to two thousand is basically just going to be a copy of this expression four hundred times. It still might seem hard to evaluate this expression, but it's way easier than multiplying out two thousand different terms. A way you might visualize this is that we're taking each one of those roots of unity, but basically adding one, we're shifting them all to the right. This picture actually lends itself to a really nice geometric intuition for the numerical answer that we might expect. The thing that we want is the product of these five different complex numbers, these five yellow dots. And if you know a thing or two about complex numbers, since these come in conjugate pairs, all we really need is to multiply the lengths of these five yellow lines. For example, that dot furthest to the right corresponds to one plus zeta to the fifth, which in the diagram I'm labeling as zeta to the zero plus one. But it doesn't matter, in either case, they're both just fancy ways of writing the number two. Next to that, we have the values one plus zeta and one plus zeta to the fourth, both of which have the same magnitude, the lengths of these lines are the same. And let's just give that a name, L1. So we need to multiply two different copies of that length, L1 squared. Similarly, the remaining two values, zeta squared plus one and zeta cubed plus one, they also have the same length, and they're a conjugate pair. So let's just call that length L2. So our product needs to include two copies of that L2. If we were just making a loose heuristic guess, you might notice that L1 is a length that's something a little bit longer than one, and L2 is something a little bit shorter than one. So the final answer here probably comes to something around two-ish, we're not positive, but something in that ballpark. To turn this into an exact answer, we could just expand out the full expression. It's honestly not that bad, there's only 32 different terms. Okay, you've hung with me for a long time now, and I know that it's getting to be a lot. But there's one final trick in this whole argument that makes our last step much simpler than you might think it should be. And let's just recap to remind ourselves of where we are. So we started with this question asking us, count the number of subsets of 1 up to 2000 whose sum is divisible by 5. We then constructed this polynomial whose coefficients tell us how many subsets have a particular sum for each value n. So what we want is to add up every fifth coefficient of that polynomial. Then we saw how evaluating this polynomial as a function on all of the fifth roots of unity, then adding them up, ends up giving us exactly this filter that we want. And here we're evaluating just one of those terms, f of zeta, which essentially comes down to a product of five complex numbers. As a super slick way to actually evaluate that product, here's the final trick. Remember, I described these numbers as roots of unity. They solve the equation z to the fifth equals one. Another way to think about that is that they are roots of the polynomial z to the fifth minus one. Now what that means is we can factor the polynomial z to the fifth minus one to look like this, where there's one factor corresponding to each one of the roots. You take z minus each one of the roots. This expression is kind of magical when you think about all of the crazy cancellation that has to happen when you expand it all out, but it is true and it's super useful for us right now, because the expression on the right hand side looks almost identical to the thing we need to evaluate up at the top here. It basically just has minus signs where we wish there were plus signs. The trick is to plug in z equals negative one. If you do that, you essentially have the negative of what we want. If you multiply it by negative one, notice how the left hand side here, which started out as negative one minus one or negative two, that just becomes two. And then the right hand side turns into the thing that we want to evaluate. So just as our geometric intuition earlier might have suggested, not only is the answer around two, the answer quite magically turns out to be precisely two. That is actually super nice and very lovely, because it means this bigger expression that we want to evaluate, where we're adding up f on all of the different roots of unity, we know its value on the first root of unity. It will be two to the power four hundred. Essentially identical reasoning shows that its value on the next three roots of unity is also two to the power four hundred, because remember when you take powers of zeta squared or zeta cubed, you get the same list of numbers that are just shuffled in a different order. The only one that's different is when we evaluate it as zeta to the zero. But zeta to the zero is a fancy way of saying the number one, and we know how to evaluate this at one. That's one of the easy things. We did this earlier. All of these parentheticals turn into two, so it looks like taking two multiplied by itself two thousand times. And so finally with that, we have a highly explicit honest answer to our counting question. To add up all of these coefficients which are divisible by five, which remember is a way of counting how many total subsets have a sum divisible by five, the answer is one fifth of this weird complex expression, which we just computed to be two to the two thousand plus four different copies of two to the four hundred. And here you might want to do just a quick sanity check on does this answer make any sense. For example, if you do it in the smaller case with the set one two three four five, and you walk through all the same reasoning that we just did, it tells you that the answer is one fifth of two to the fifth, the total number of subsets, plus four times two to the one in this case, which is a fifth of 32 plus eight, which is eight. And if you'll remember when we explicitly looked at them all, that was in fact the answer. Look, this is a hard puzzle, and when it's worth putting in the time to solve a hard problem, it's also worth taking some time to reflect on it. What do you get out of this? What's the takeaway? Now you could reflect on the answer itself, how the dominant part is indeed one fifth of all the total subsets like we might have guessed, and how this error term came about from the not quite destructive interference in a massive combination of roots of unity. But again, what makes this question interesting is not the answer, it's the way that we solved it, namely taking a discrete sequence that we want to understand and treating it as the coefficients on a polynomial, then evaluating that polynomial on complex values. Both of those steps are probably highly unexpected at the outset, but both of those steps relate to some very general and powerful techniques that you'll find elsewhere in math. For example, at the top of the lesson, I promised that the technique that we would use would be similar in spirit to the way that primes are studied, and the set of ideas that leads up to the Riemann hypothesis and things like that. Now this is a very beautiful topic, enough so that I think it seems a little criminal to cram some kind of rushed version into the end here. The right thing to do, I think, is to just make that video I promised a while back about the zeta function, take the time, do it right. But if you're curious, and if you'll allow me to throw some things up on the screen without explaining them, here's the two or three sentence version of how the two are parallel. Just like our subsets puzzle, the way that Riemann studied primes involved a discrete sequence we want to understand, something carrying information about prime numbers, and then considering a function whose coefficients are the terms in that sequence. In that case, it's not quite a polynomial, instead it's a related structure known as a Dirichlet series, or Dirichlet series depending on who you ask, but it's the same essential idea. Then the way to suss out information about those coefficients comes from studying how this function behaves with, you guessed it, complex valued inputs. The techniques in his case get a lot more sophisticated, after all Riemann was a pioneer in complex analysis, but the fact remains extending your domain beyond real numbers like this offers you, the mathematician, a lot more power in making deductions about the coefficients. For some viewers this all might leave the lingering question of why exactly complex numbers are so unreasonably useful in this way. It's a hard question to answer exactly, but if you think about our puzzle, everything we just did, as soon as we were in this situation where plugging in different inputs revealed hidden information about the coefficients, it's sort of like the more inputs you can work with the better, so you might as well open yourself up to a richer space of numbers like the complex plane. But there is a more specific intuition that I want you to come away with here. In our puzzle the relevant fact that we wanted, the sum of every fifth coefficient, was a kind of frequency question, and the real reason the complex numbers as opposed to some other structure proved to be useful for us is that we could find a value so that successive products have this cycling behavior. This use of values on the unit circle and roots of unity in particular to suss out frequency information is extremely fruitful. It is almost impossible to overstate how helpful that idea is. Just to give one out of thousands of examples, in the 1990s Peter Shor found a way for quantum computers to factor large numbers way way faster than classical computers can. And if you go in and you look at the details of how what we now call Shor's algorithm works, the idea is essentially this, the use of roots of unity to detect a kind of frequency information. More generally this is the core idea that underlies Fourier transforms and Fourier series and the infinite swell of topics that follow from those. As to the topic of generating functions themselves, we've really only just scratched the surface here, and if you want to learn more I highly recommend this kind of hilariously named book Generating Functionology by Herbert Wilf. And I'll also leave up a few fun puzzles on the screen here for anyone who wants to flex their muscles a bit with the idea.

Info

Channel: 3Blue1Brown

Views: 1,893,072

Rating: undefined out of 5

Keywords: Mathematics, three blue one brown, 3 blue 1 brown, 3b1b, 3brown1blue, 3 brown 1 blue, three brown one blue

Id: bOXCLR3Wric

Channel Id: undefined

Length: 34min 35sec (2075 seconds)

Published: Mon May 23 2022