This is a video I’ve been excited to make
for a while. The story here braids together prime numbers, complex numbers and pi in a
very pleasing trio. Quite often in modern math, especially that
which flirts with the Riemann zeta function, these three seemingly unrelated objects show
up in unison, and I want to give you a peek at one instance where this happens; one of
the few that doesn’t require too heavy a technical background.
That’s not to say it’s easy –in fact this is one of the most intricate videos I’ve
ever done– but the culmination is worth it.
What we’ll end up with is a formula for pi, a certain alternating infinite sum.
This formula is actually written on the mug I’m drinking coffee from as I write this,
and a fun but almost certainly apocryphal story is that the beauty of this formula is
what inspired Leibniz to quit being a lawyer to instead pursue math.
Whenever you see pi show up in math, there’s a circle hiding somewhere, sometimes very
sneakily. So the goal here is not just to discover this sum, but to really understand
the circle hiding behind it. You see, there’s another way to prove the
same result that you and I will spend some meaningful time building to with just a few
lines of calculus. But it’s one of those proofs that leaves you thinking “Okay, I
suppose that’s true”, without a sense for why, for where the hidden circle is.
On path you and I will take, what you’ll see is that fundamental truth behind this
sum and the circle it hides is a certain regularity in prime numbers behave within complex numbers. To start the story, imagine yourself with
nothing more than a pencil, some paper, and a desire to find a formula for computing pi.
There are countless ways to approach this, but as a broad outline for the plotline hear,
you’ll start by asking how many lattice points of the plane sit inside a big circle.
That question can lead asking about how to express numbers as the sum of two squares,
which in turn will lead to factoring integers inside the complex plane.
From there, bringing in a function named chi will give us a formula for pi that at first
seems to involve a crazy complicated pattern dependent on the distribution of primes, but
a slight shift in perspective will simplify it dramatically and expose the ultimate gold
nugget. It’s a lot, but good math takes time, and
we’ll take it step by step. When I say “lattice point”, I mean a point
(a, b) in the plane where a and b are both integers, a point where grid lines cross.
If you draw a circle centered at the origin, say with radius 10, how many lattice points
would you guess are inside that circle? Well, there’s one lattice point for each
unit of area, so the answer should be approximately equal to the area of the circle, pi*R2, which
in this case is pi(102). And for a really big radius, like 1,000,000, you’d expect
this to be more accurate, in the sense that the percent error between the estimate pi*R2
and the actual count of lattice points should get smaller.
If you can find a second way to answer the same question, how many lattice points are
in this circle, it might lead you to another way to express the area of a circle, and hence
another way to express pi. So you play! And you wonder. And maybe, especially
if you just watched a certain calculus video, you might try looking through every possible
ring that a lattice point could sit on. For each of those points, (a, b), its distance
from the origin in the square root of a2+b2, and since a and b are both integers, a2+b2
is also an integer, so you only have to consider rings whose radii are the square roots of
whole numbers. A radius of 0 just gives that single origin
point. A radius of 1 hits 4 lattice points… radius sqrt(2) hits 4 lattice points as well...sqrt(3)
doesn’t hit any lattice points...sqrt(4) hits four again...a radius of sqrt(5) actually
hits 8 lattice points... So what you need is a systematic way to count
how many lattice points are a given distance away from the origin, and to tally them all
up. If you pause and try this for a moment, you’ll
see that the pattern seems pretty chaotic, which is a good sign that some very interesting
math will come into play. In fact, as you’ll see, this pattern is
rooted in the distribution of primes. For example, look at the ring with radius
sqrt(25). It hits (5, 0), since 52 + 02 = 25. It also
hits (4, 3), since 42+32 = 25, and likewise it hits (3, 4).... and (0, 5)...
What’s really happening here is that you’re counting how many pairs of integers (a, b)
have the property that a2 + b2 = 25, and it looks like there’s a total of 12.
As another example, look at the ring with radius sqrt(11). It doesn’t hit any lattice
points, which corresponds with the fact that you cannot find two integers whose squares
add up to 11. Now, many times in math when you see a question
that has to do with the 2d plane, it can be surprisingly fruitful to just ask what it
looks like when you think of that plane as the set of all complex numbers.
So instead of thinking of this lattice point as the pair of integer coordinates (3, 4),
think of it as the single complex number 3 + 4i.
That way, another way to think of the sum of the squares of its coordinates, 32+42,
is to multiply this number by 3 - 4i. This is called its “complex conjugate”,
it’s what you get by reflecting over the real axis, replacing i with -i.
This might seem like a strange step if you don’t have much of a history with complex
numbers, but describing this distance as a product can be unexpectedly useful. It’ll
turn our question into a factoring problem, which is ultimately why patterns among prime
numbers come into play. Algebraically, this relation is straight-forward
enough to verify. You get a 32, then the 3*(-4i) cancels with the 4i*3, then you have -(4i)2,
which because i2 is -1 becomes +42. This is also quite nice to see geometrically;
and if the following feels unfamiliar, I do have another video where I go into more detail
about why complex multiplication looks the way it does.
The number 3+4i has a magnitude of 5, and some angle off the horizontal. Multiplying
by 3-4i rotates by that same angle in the opposite direction, putting it on the positive
real axis, then stretches by that same magnitude, 5, meaning you land on the number 25, the
square of the magnitude. The collection of all these lattice points
a + bi, where a and b are integers, has a special name: the “Gaussian integers”,
named after Martin Sheen. Geometrically, you will still be asking the
same question, how many of these lattice points (Gaussian integers) are a given distance away
from the origin, like sqrt(25). But we’re just phrasing it in a more algebraic way:
How many gaussian integers have the property that multiplying by their complex conjugate
gives 25? This might seem needlessly complex, but it’s
the key to understanding the seemingly random pattern for how many lattice points are a
given distance from the origin. To see why, we need to understand how numbers
factor inside the Gaussian integers. As a quick refresher, among the ordinary integers,
every number can be factored as some unique collection of prime numbers. For example,
2,250 can be factored as 2*32*53, and there is no other collection of primes that also
multiplies to 2,250. Unless you just make some of the primes in
this factorization negative. So really, factorization in the integers is
not perfectly unique, it’s almost unique, with the exception that some of the factors
might be multiplied by -1. Analogy with primes
Factoring works very similarly in Gaussian integers.
Some numbers, like 5, can be factored into smaller Gaussian integers, in this case (2+i)(2-i).
This Gaussian integer (2+i), cannot be factored into anything smaller, so we call it a “Gaussian
prime”. Again, this factorization is almost unique.
But this time, not only can you multiply each of the factors by -1 to get a factorization
that looks different, you can also be extra sneaky by multiplying one factor by i, and
another by -1. This gives a different way to factor 5 into Gaussian primes.
But other than the things you can get by multiplying some of these factors by -1, i or -i, factorization
within the Gaussian integers is unique. If you can figure out how ordinary primes
numbers factor inside the Gaussian integers, it’ll tell you all all other integers factor,
so here we pull in a crucial and surprising fact.
Primes which are 1 above a multiple of 4, like 5, 13, 17, can be always factored into
exactly 2 distinct gaussian primes. This corresponds with the fact that rings
with radii equal to the square root of one of these primes always hit lattice points.
In fact they always hit 8, as you’ll see in a bit.
Primes that are 3 above a multiple of 4, like 3, 7, 11 and so on, cannot be factored further
in the Gaussian integers. Not only are they primes in the integers, they are also Gaussian
primes, unsplittable even when i is in the picture.
This corresponds with the fact that a ring whose radius is the square root of one of
these will never hit any lattice points. This pattern is the regularity within primes
that we’ll ultimately exploit. And in a later video I might explain why on earth this
is true; why a prime number’s remainder when divided by 4 has anything to do with
whether or not it factors inside the Gaussian integers, and hence, whether it can be expressed
as the sum of two squares. But here we’ll take it as given.
The prime number 2 is special, because it does factor, as (1+i)(1-i), but these two
gaussian primes are a 90o rotation away from each other, so you can multiply one by i to
get the other. And that fact will make us want to treat 2 a little differently for where
this is all going. Remember, our goal here is to count how many
lattice points are a given distance away from the origin. Doing this systematically for
all distances sqrt(N) can lead us to a formula for pi.
And counting the number of lattice points with a given magnitude, like sqrt(25), is
the same as asking how many Gaussian integers have the property that when you multiply by
its complex conjugate, you get 25. So here’s the recipe for finding all Gaussian
integer with this property. First, factor 25, which in the ordinary integers
is 52, but since 5 factors further as (2+i)(2-i), 25 breaks down into these four Gaussian primes.
Next, organize these into two columns, with conjugate pairs sitting right next to each
other. Now multiply what’s in each column, and
you’ll come out with two Gaussian integers. Because everything in the right is a conjugate
to everything in the left, what comes out will be a complex conjugate pair, which multiply
to make 25. Picking an arbitrary standard, let’s call
that product from the left column the output of our recipe.
Notice, there are three choices for how to divvy up the primes that can affect that output:
Here, both copies of 2+i are in the left column, and that left column product is 3+4i; you
could also have only one copy of 2+i is in the left column, in which case that product
is 5; or both copies of 2+i are in the right column, which will give an output of 3-4i.
Those three possible outputs are all lattice points on the circle with radius sqrt(25).
So, why does this recipe not yet capture all 12 lattice points?
Well, remember how I mentioned that a factorization into gaussian primes can look different if
you multiply some of them by i, -1 or -i? If you write the factorization of 25 differently,
maybe splitting up one of those 5’s as (-1+2i)(-1-2i), it can affect our result.
But the only effect that would have is to multiply the total output by i, -1 or -i,
so as a final step to our recipe, say you have to make one of 4 choices: Multiply the
product from the left column either by 1, i, -1 or -i.
The three lattice points we originally found, can each be multiplied by i, -1, or -i, and
that accounts for all 12 ways to construct a gaussian integer whose product with its
own conjugate is 25. Extend to 53
The best way to see how this recipe counts lattice points more generally is to just see
more examples. If instead we were looking at 125, which is
53, we would have had 4 choices for how to divvy up primes conjugate pairs into the two
columns, either including 0, 1, 2, or all three copies of 2+i in that left column.
Those four choices, multiplied by final four choices of multiplying the product from the
left column by 1, i, -1 or -i, would suggest there are 16 lattice points a distance of
sqrt(125) away from the origin. And indeed, if you draw that circle and count,
you’ll find that it hits exactly 16 lattice points If you introduce a factor like 3, which doesn’t
break down into the product of two conjugate Gaussian primes, it really mucks up the whole
system. When you’re divvying up primes between the
two columns, there’s no way to split up that 3. No matter where you put it, it leaves
the columns imbalanced, and when you take the product of all the numbers in each column
you wouldn’t end up with a conjugate pair. So for a number like 3*53, which is 375, there’s
actually no lattice point you hit; no Gaussian integer whose product with its own conjugate
gives 375. Extend to 32*53
But if you add a second factor of 3, then you have an option. You can throw one 3 in
the left column, and the other in the right. Since 3 is its own complex conjugate, this
keeps things balanced, in the sense that the products of the left and right columns will
be complex conjugates of each other. But it doesn’t add any new options. There
will still be a total of 4 choices for how to divvy up those factors if 5, multiplied
by the final 4 choices of multiplying by 1, i, -1 or -i.
So just like the sqrt(125) circle, this guy also hits exactly 16 lattice points. Let’s just sum up where we are.
When you’re counting how many lattice points lie on a circle with radius sqrt(N), the first
step is to factor N. For prime factors like 5, 13, and 17, which
factor into a conjugate pair of Gaussian primes, the number of choices you have is one more
than the exponent that shows up with that factor.
For prime factors like 3, 7 and 11, which are already Gaussian primes and can’t be
split, if they show up with an even power, you have one and only one choice for what
to do with them. If it’s an odd exponent, you’re screwed, you have zero choices.
And no matter what, you have those 4 choices at the end. By the way, this process right here is, I
think, the most complicated part of the video. It took me a couple times to think through
that yes, this is a valid way to count lattice points, so don’t be shy if you want to pause
and scribble things down to get a feel for it.
The one last thing to mention is how factors of 2 affect the count. If your number is even, the factor of 2 breaks
down as (1+i)(1-i), so you can divvy up that conjugate pair between the columns. At first
it might look like this doubles your options, depending on how you choose divvy these up
between the columns. However, since multiplying one of these gaussian
primes by i gives you the other one, if you swap them between the columns, the effect
on the output from the left column is to multiply it by i or -i.
So this is redundant with that final step, where we take the product of the left column
and choose to multiply it either by 1, i, -1 or -i.
This means a factor of 2, or any power of 2, doesn’t actually change the count at
all; it doesn’t hurt, it doesn’t help. For example, a circle with radius sqrt(5)
hits 8 points, and one with radius sqrt(10) also hits 8 points, as does sqrt(20)... and
sqrt(40). Factors of 2 just don’t make a difference. What’s about to happen is number theory
at its best. We have this complicated recipe telling us
how many lattice points sit on a circle with radius sqrt(N), and it depends on the prime
factorization of N. To turn it into something simpler, we’re
going to exploit the regularity of prime numbers that those which are 1 above a multiple of
4 split into distinct gaussian prime factors, while those that are 3 above a multiple of
4 cannot be split. Introduce chi
To do this, we’ll bring in a simple function, which I’ll label with the greek letter “chi”.
For inputs 1 above a multiple of 4, the output of chi is 1. For inputs 3 above a multiple
of 4, it outputs -1. And for even numbers, it gives 0.
So if you evaluate chi on the natural numbers, it gives this nice cyclic pattern 1, 0, -1,
0, and repeat. Chi has a special property, it’s what’s
called a “multiplicative” function. If you evaluate it on two number and multiply,
like chi(3)*chi(5), the result is the same as chi evaluated on the product of those two
numbers, in this case chi(15). Likewise, chi(5)*chi(5) = chi(25)...and this works for any two numbers,
go ahead an try it. Show answer to counting question in terms
of chi So, for our central question of counting lattice
points in this way that involves factoring a number, I’m going to write the number
of choices we have using chi in what at first seems to be a more complicated way, but it
has the benefit of treating all prime factors the same way.
For each prime power, like 53, write (chi(1) + chi(5) + chi(52) + chi(53)), adding up the
value of chi on all the power of this prime up to the one that shows up in the factorization.
Since 5 is 1 above a multiple of 4, all of these are just 1, so this sum comes out to
4, which reflects the fact that the factor of 53 gives us 4 options for how to divvy
up its two Gaussian prime factors between the columns.
For something like 34, this looks like (chi(1) + chi(3) + chi(32) + etc.). Since chi(3) is
-1, this sum oscillates, 1 - 1 + 1, etc. If it’s an even power, like 4 in this case,
the sum comes out to 1, which encapsulates the fact that there is only 1 choice for what
to do with those unsplittable 3’s. If it’s an odd power, that sum is 0, indicating that
we have no choices. For powers of two, this looks like (1 + 0
+ 0 + etc.), indicating that with a factor of 2, we always have 1 choice for what to
do with it, it neither helps nor hurts the lattice point cause.
And as always, that 4 in front reflects the final choice of multiplying the output by
1, i, -1 or -i. We’re getting close to the culmination now,
things are starting to look organized, so take a moment, pause and ponder, make sure
this all feels good. Take the number 45 as an example, which factors
as 325. The expression for the number of lattice points is 4*(1 + chi(3) + chi(32))(1 + chi(5)),
which is the same as 4*(1 choice for what to do with the 3’s)*(2 choices for how to
divvy up the Gaussian prime factors of 5). It might seem like expanding out this sum
is really complicated, because it involves looking at all possible combinations of these
prime factors. But because chi is multiplicative, each one
of these combinations corresponds to a divisor of 45, so really this looks like 4*(chi(1)
+ chi(3) + chi(5) + chi(9) + chi(15) + chi(45)). This covers every number which divides evenly
into 45, once, and only once. And it works like that for any number, there’s
nothing special about 45. That’s pretty interesting, and I think wholly
unexpected. This question of counting the number of lattice points a distance sqrt(N)
away from the origin involves adding up the value of this simple function over all divisors
of N. Now, remember why we’re doing all of this.
The total number of lattice points inside a big circle with radius R should be about
pi*R2. But on the other hand, we can count those
same lattice points by looking through all numbers N between 0 and R2 and counting how
many lattice points are a distance sqrt(N) from the origin.
We’ll go ahead and ignore the origin dot, with radius 0, since it doesn’t follow the
pattern of the rest, and one little dot won’t make a difference as we let R grow towards
infinity. From all of this Gaussian integer and factoring
and chi function stuff we’ve been doing, the answer for each N looks like adding up
the value of chi on every divisor of N, and multiplying by 4.
Let’s just put that 4 in the corner for now and remember to bring it back in a moment.
Rearrange sum At first, adding up the values of all these
rows seems super random. Numbers with lots of factors have lots of divisors, primes only
have two divisors, and you might think that you’d need perfect knowledge of the distribution
of primes to get anything useful out of this. But if instead you organize these into columns,
the puzzle starts to fit together. How many numbers between 1 and R2 have 1 as
a divisors; well, all of them, so our sum should include R2*chi(1)
How many have 2 as a divisors. Well, about half of them, so that accounts for (R2/2)*chi(2)
in our total tally up. About a third of the rows have a chi(3), so
that’s (R2/3)*chi(3). Keep going like this, and the total count
of lattice points looks like R2*chi(1) +(R2/2)*chi(2) + (R2/3) * chi(3), and so on. Factoring out
that R2, and bringing back the 4 that needs to be multiplied in, this means the total
number of lattice points in this big circle is approximately 4R2(..this sum..).
Since chi is 0 on all even numbers, and oscillates between 1 and -1 for odd numbers, that sum
looks like 1 - â…“ + â…• - 1/7, and so on. This is what we wanted! An alternative expression
for the number of lattice points in a big circle, which we know to be around pi*R2
The bigger R is, the more accurate both these estimates are, so the error between these
sides can be arbitrarily small. Dividing by R2, this gives us an infinite
sum which should converge to pi. And the reason this sum is so simple, just
oscillating back and forth between odd numbers, stems from the regular pattern for how prime
numbers factor inside the Gaussian integers. If you’re curious, there are two main branches
of number theory: Algebraic number theory, and analytic number theory.
Very loosely speaking, the former deals with new number systems, like these Gaussian integers
you and I looked at, while the latter deals with things like the Riemann zeta function,
or its cousins called “L” functions which involve multiplicative functions like the
central character chi from our story. The path we just walked is a little glimpse
of where those two fields intersect. Both are pretty heavy-duty fields with lots of
active research and unsolved problems. So if this feels like something which will take
time to mentally digest, like there’s more patterns to be uncovered and understood, it’s
because it is, and there are. So, you are all demonstrably the kind of people
who watch in-depth math videos in your free time, and I know that some subportion of you
are software engineers, or soon-to-be software engineers, so before you go there’s a little
recruiting I’d like to do. This video is sponsored by Remix, which is
a planning platform for public transit. They enable cities to find the most efficient and
cost-effective ways to serve the communities and demographics they want.
And, if you think about if for a moment, doing this well can involve some seriously interesting
optimization problems. And indeed, their looking for mathematically oriented programmers who
can tackle problems involving a variety of optimization techniques so that, as one of
their engineers phrased it to me, they can “trick the universe into letting them create
quality schedules.” If you want to work with smart people on interesting
problems, take a look at their site and careers page, which I’ve linked to on the screen
and in the description.
How would this problem be shifted to 3 dimensions, or n dimensions for that matter? More specifically, how would you count the 3D lattice points that a sphere of radius R lies on? What about for an n-dimensional sphere in n-dimensions?
I knew it.
This was one of the most interesting math videos I've seen, and I've seen a ton.
Does anyone know what software could have been used to make this? It looks great.
What happens if we try this with the Eisenstein Integers? Ie, instead of a square grid, use a triangular grid?
I think it gives, 1 - 1/2 + 1/4 -1/5 + 1/7 -1/8... = Pi / (3*Sqrt(3))
I was just wondering. I knew the fact that every prime (except 2 & 3) can be written either as 6n+1 or 6n-1. The video talks about 4n+1 or 4n+3. Thinking about that (and why that is true), I came up with: every number can be written as 4n+0, 4n+1, 4n+2, or 4n+3. The 4n+0 and 4n+2 are always even, so no prime (except 2) can be written in that form, hence every prime (except 2) must be either in the form 4n+1 or 4n+3. Did I miss something there?
That's a good video.
Now I'm wondering if you can get the result by considering grouping by sloped lines instead of circles.
I don't feel comfortable with the fact that adding up points on the gridlines gives the area. I know it has something to do with the approximation becoming an equality as the radius goes to infinity, but I don't feel why this is true