Intro
This is a video for anyone who already knows what eigenvalues and eigenvectors are, and
who might enjoy a quick way to compute them in the case of 2x2 matrices. If you’re unfamiliar
with eigenvalues, take a look at this video which introduces them. You can skip ahead if you just want to see
the trick, but if possible I’d like you to rediscover it for yourself, so for that
let’s lay down a little background. As a quick reminder, if the effect of a linear
transformation on a given vector is to scale it by some constant, we call it an "eigenvector"
of the transformation, and we call the relevant scaling factor the corresponding "eigenvalue,"
often denoted with the letter lambda. When you write this as an equation and you rearrange
a little bit, what you see is that if the number lambda is an eigenvalue of a matrix
A, then the matrix (A minus lambda times the identity) must send some nonzero vector, namely
the corresponding eigenvector, to the zero vector, which in turn means the determinant
of this modified matrix must be 0.
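Written out, the rearrangement just described is:

```latex
A\vec{v} = \lambda\vec{v}
\;\Longrightarrow\; (A - \lambda I)\vec{v} = \vec{0}
\quad \text{for some } \vec{v} \neq \vec{0}
\;\Longrightarrow\; \det(A - \lambda I) = 0.
```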
Okay, that's all a little bit of a mouthful to say, but again, I'm assuming all of this is review for anyone watching. So, the usual way to compute eigenvalues,
how I used to do it, and how I believe most students are taught to carry it out, is to
subtract the unknown value lambda off the diagonals and then solve for when the determinant
equals 0. Doing this always involves a few steps to
expand out and simplify to get a clean quadratic polynomial, what's known as the “characteristic
polynomial” of the matrix. The eigenvalues are the roots of this polynomial. So to find
them you have to apply the quadratic formula, which itself typically requires one or two
more steps of simplification. Honestly, the process isn't terrible.
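For reference, here is that standard computation for the matrix [[3, 1], [4, 1]] used in the examples later on:

```latex
\det\!\begin{pmatrix} 3-\lambda & 1 \\ 4 & 1-\lambda \end{pmatrix}
= (3-\lambda)(1-\lambda) - 4
= \lambda^2 - 4\lambda - 1
\;\Longrightarrow\;
\lambda = \frac{4 \pm \sqrt{16 + 4}}{2} = 2 \pm \sqrt{5}.
```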
But at least for 2x2 matrices, there's a much more direct way to get at this answer. And
if you want to rediscover this trick, there are only three relevant facts you need to
know, each of which is worth knowing in its own right and can help you with other problem-solving. Number 1: The trace of a matrix, which is
the sum of these two diagonal entries, is equal to the sum of the eigenvalues. Or another
way to phrase it, more useful for our purposes, is that the mean of the two eigenvalues is
the same as the mean of these two diagonal entries. Number 2: The determinant of a matrix, our
usual ad-bc formula, is equal to the product of the two eigenvalues. And this should kind
of make sense if you understand that eigenvalues describe how much an operator stretches space
in a particular direction and that the determinant describes how much an operator scales areas
(or volumes) as a whole. Now before getting to the third fact, notice
how you can essentially read these first two values out of the matrix without really writing
much down. Take this matrix here as an example. Straight away you can know that the mean of
the eigenvalues is the same as the mean of 8 and 6, which is 7. Likewise, most linear
algebra students are pretty well-practiced at finding the determinant, which in this
case works out to be 48 - 8. So right away you know that the product of our two eigenvalues is 40.
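As a quick numerical check of facts 1 and 2, here is a minimal numpy sketch. The diagonal 8, 6 and the determinant 40 match the example described here; the off-diagonal entries 4 and 2 are my own assumption, chosen just to be consistent with it.

```python
import numpy as np

# Diagonal 8, 6 and determinant 40 as in the example;
# the off-diagonal entries 4 and 2 are assumed for illustration.
A = np.array([[8, 4],
              [2, 6]])

eigs = np.linalg.eigvals(A)
print(np.trace(A) / 2, eigs.mean())    # 7.0  7.0  -> mean of eigenvalues = mean of diagonal
print(np.linalg.det(A), eigs.prod())   # 40.0 40.0 -> product of eigenvalues = determinant
```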
Now take a moment to see how you can derive
what will be our third relevant fact, which is how to recover two numbers when you know
their mean and you know their product. Here, let's focus on this example. You know
the two values are evenly spaced around 7, so they look like 7 plus or minus something;
let’s call that something "d" for distance. You also know that the product of these two
numbers is 40. Now to find d, notice that this product expands really nicely, it works
out as a difference of squares. So from there, you can directly find d: d^2 is 7^2 - 40,
or 9, which means d itself is 3. In other words, the two values for this very
specific example work out to be 4 and 10. But our goal is a quick trick, and you wouldn’t
want to think this through each time, so let’s wrap up what we just did in a general formula.
For any mean, m and product, p, the distance squared is always going to be m^2 - p. This
gives the third key fact, which is that when two numbers have a mean m and a product p,
you can write those two numbers as m ± sqrt(m^2 - p).
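In general, the same difference-of-squares step from the example reads:

```latex
p = (m + d)(m - d) = m^2 - d^2
\;\Longrightarrow\; d = \sqrt{m^2 - p}
\;\Longrightarrow\; \text{the two numbers are } m \pm \sqrt{m^2 - p}.
```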
This is decently fast to rederive on the fly if you ever forget it, and it's essentially just a rephrasing of the difference of squares
formula. But even still it’s a fact worth memorizing so that you have it at the tip
of your fingers. In fact, my friend Tim from the channel acapellascience
wrote us a quick jingle to make it a little more memorable. m plus or minus squaaaare root of me squared
minus p (ping!) Let me show you how this works, say for the
matrix [[3,1], [4,1]]. You start by bringing to mind the formula, maybe stating it all
in your head. But when you write it down, you fill in the appropriate values of m and
p as you go. So in this example, the mean of the eigenvalues is the same as the mean
of 3 and 1, which is 2. So the thing you start writing is 2 ± sqrt(2^2 - …). Then the
product of the eigenvalues is the determinant, which in this example is 3*1 - 1*4, or -1.
So that’s the final thing you fill in. This means the eigenvalues are 2±sqrt(5). You
might recognize that this is the same matrix I was using at the beginning, but notice how
much more directly we can get at the answer. Here, try another one. This time the mean
of the eigenvalues is the same as the mean of 2 and 8, which is 5. So again, you start
writing out the formula but this time writing 5 in place of m [song]. And then the determinant
is 2*8 - 7*1, or 9. So in this example, the eigenvalues look like 5 ± sqrt(16), which
simplifies even further to 9 and 1.
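If you want to sanity-check the trick numerically, here is a minimal Python sketch compared against numpy. The first matrix is the [[3, 1], [4, 1]] example from above; for the second, only the diagonal 2, 8 and the determinant 9 are stated here, so the placement of the off-diagonal entries 7 and 1 is my assumption.

```python
import numpy as np

def eig2x2(A):
    """Eigenvalues of a 2x2 matrix via m +/- sqrt(m^2 - p)."""
    m = np.trace(A) / 2                 # mean of the eigenvalues
    p = np.linalg.det(A)                # product of the eigenvalues
    d = np.lib.scimath.sqrt(m**2 - p)   # complex-safe square root
    return m + d, m - d

print(eig2x2(np.array([[3, 1], [4, 1]])))  # ~(4.236, -0.236), i.e. 2 ± sqrt(5)
print(eig2x2(np.array([[2, 7], [1, 8]])))  # (9.0, 1.0)
```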
You see what I mean about how you can basically just start writing down the eigenvalues while staring at the matrix? It's typically just
the tiniest bit of simplifying at the end. Honestly, I’ve found myself using this trick
a lot when I’m sketching quick notes related to linear algebra and want to use small matrices
as examples. I’ve been working on a video about matrix exponents, where eigenvalues
pop up a lot, and I realized it’s just very handy if students can read off the eigenvalues
from small examples without losing the main line of thought by getting bogged down in
a different calculation. As another fun example, take a look at this
set of three different matrices, which come up a lot in quantum mechanics, they're known
as the Pauli spin matrices. If you know quantum mechanics, you’ll know that the eigenvalues
of matrices are highly relevant to the physics they describe, and if you don’t know quantum
mechanics, let this just be a little glimpse of how these computations are actually relevant
to real applications. The mean of the diagonal in all three cases
is 0, so the mean of the eigenvalues in all cases is 0, which makes our formula look especially
simple. What about the products of the eigenvalues,
the determinants of these matrices? For the first one, it’s 0 - 1 or -1. The second
also looks like 0 - 1, but it takes a moment more to see because of the complex numbers.
And the final one looks like -1 - 0. So in all cases, the eigenvalues simplify to be
±1. Although in this case, you really don't need the formula to find the two values if you know they're evenly spaced around 0 and their product is -1.
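Here is a quick numerical confirmation, assuming the standard definitions of the three Pauli matrices (the video only shows them on screen):

```python
import numpy as np

# Standard Pauli spin matrices
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

for name, s in [("sigma_x", sx), ("sigma_y", sy), ("sigma_z", sz)]:
    mean = np.trace(s).real / 2      # mean of the eigenvalues: 0
    prod = np.linalg.det(s).real     # product of the eigenvalues: -1
    print(name, mean, prod, np.linalg.eigvals(s))  # eigenvalues come out as ±1
```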
If you're curious, in the context of quantum mechanics, these matrices describe observations you might make about a particle's spin in
the x, y or z directions. The fact that their eigenvalues are ±1 corresponds with the idea
that the values for the spin that you would observe would be either entirely in one direction
or entirely in another, as opposed to something continuously ranging in between. Maybe you’d
wonder how exactly this works, or why you would use 2x2 matrices that have complex numbers
to describe spin in three dimensions. And those would be fair questions, just outside
the scope of what I want to talk about here. You know it’s funny, I wrote this section
because I wanted some case where you have 2x2 matrices that are not just toy examples
or homework problems, ones where they actually come up in practice, and quantum mechanics
is great for that. But the thing is after I made it I realized that the whole example
kind of undercuts the point I’m trying to make. For these specific matrices, when you
use the traditional method, the one with characteristic polynomials, it’s essentially just as fast;
it might actually be faster. I mean, take a look at the first one: The relevant determinant
directly gives you a characteristic polynomial of lambda^2 - 1, and clearly, that has roots
of plus and minus 1. Same answer when you do the second matrix, lambda^2 - 1. And as
for the last matrix, forget about doing any computations, traditional or otherwise, it’s
already a diagonal matrix, so those diagonal entries are the eigenvalues! However, the example is not totally lost to
our cause. Where you will actually feel the speed up is in the more general case where
you take a linear combination of these three matrices and then try to compute the eigenvalues. You might write this as a times the first
one, plus b times the second, plus c times the third. In quantum mechanics, this would
describe spin observations in a general direction of a vector with coordinates [a, b, c]. More
specifically, you should assume this vector is normalized, meaning a^2 + b^2 + c^2 = 1.
When you look at this new matrix, it's immediately clear that the mean of the eigenvalues is
still zero, and you might also enjoy pausing for a brief moment to confirm that the product
of those eigenvalues is still -1, and then from there concluding what the eigenvalues
must be. And this time, the characteristic polynomial approach would be by comparison
a lot more cumbersome, definitely harder to do in your head.
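If you want to see that confirmation written out (using the standard Pauli matrices):

```latex
a\sigma_x + b\sigma_y + c\sigma_z =
\begin{pmatrix} c & a - bi \\ a + bi & -c \end{pmatrix},
\qquad
\det = -c^2 - (a - bi)(a + bi) = -(a^2 + b^2 + c^2) = -1,
```

so the eigenvalues are 0 ± sqrt(0 - (-1)) = ±1.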
To be clear, using the mean-product formula is not fundamentally different from finding roots of the characteristic polynomial; I
mean, it can't be, they're solving the same problem. One way to think about this, actually,
is that the mean-product formula is a nice way to solve quadratics in general (and some
viewers of the channel may recognize this). Think about it: When you're trying to find
the roots of a quadratic given its coefficients, that's another situation where you know the
sum of two values, and you also know their product, but you’re trying to recover the
original two values. Specifically, if the polynomial is normalized
so that this leading coefficient is 1, then the mean of the roots will be -½ times this
linear coefficient (since that coefficient is -1 times the sum of the roots). For the example on the
screen that makes the mean 5. And the product of the roots is even easier, it’s just the
constant term, no adjustments needed. So from there, you would apply the mean-product formula
and that gives you the roots.
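In symbols, for a monic quadratic:

```latex
x^2 + bx + c = 0, \qquad m = -\tfrac{b}{2}, \quad p = c
\;\Longrightarrow\;
x = -\tfrac{b}{2} \pm \sqrt{\tfrac{b^2}{4} - c},
```

which is the familiar (-b ± sqrt(b^2 - 4c))/2 written with the mean and product front and center.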
On the one hand, you could think of this as a lighter-weight version of the traditional quadratic formula. But the real advantage isn't that it's fewer symbols to memorize, it's that each one of them carries more meaning
with it. The whole point of this eigenvalue trick is
that because you can read out the mean and product directly from looking at the matrix,
you don't need to go through the intermediate step of setting up the characteristic polynomial.
You can jump straight to writing down the roots without ever explicitly thinking about
what the polynomial looks like. But to do that we need a version of the quadratic formula
where the terms carry some kind of meaning. I realize that this is a very specific trick,
for a very specific audience, but it’s something I wish I knew in college, so if you happen
to know any students who might benefit from this, consider sharing it with them. The hope is that it’s not just one more
thing to memorize, but that the framing reinforces some other nice facts worth knowing, like
how the trace and determinant relate to eigenvalues. If you want to prove those facts, by the way,
take a moment to expand out the characteristic polynomial for a general matrix, and think
hard about the meaning of each of these coefficients.
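If you want to see it written out, that expansion for a general 2x2 matrix is:

```latex
\det\!\begin{pmatrix} a - \lambda & b \\ c & d - \lambda \end{pmatrix}
= \lambda^2 - (a + d)\lambda + (ad - bc)
= \lambda^2 - \operatorname{tr}(A)\,\lambda + \det(A),
```

so the two roots (the eigenvalues) sum to the trace and multiply to the determinant.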
Many thanks to Tim, for ensuring that this mean-product formula will stay stuck in all of our heads for at least a few months. If
you don’t know about acapellascience, please do check it out. "The Molecular Shape of You",
in particular, is one of the greatest things on the internet.
I mean, it's the same thing as the ad-bc formula for determinants - it's going to get tricky for n>2, which is actually true for most n.
The method in the video can be really helpful for quickly reading off what's qualitatively going on in 2x2 matrix exponential calculations, since the important information for a real 2x2 matrix is the sign of the trace, the sign of the determinant, and if the determinant is positive, the sign of the discriminant T^2-4D. (Equivalently you can look at (T^2-4D)/4, which is what he called m^2-p in the video.)
So in the thumbnail example, the trace is 4, the determinant is -1, so I immediately know that there's a positive and a negative eigenvalue, which is really what I want to know for qualitative analysis.
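A minimal sketch of that sign-based classification for a real 2x2 system x' = Ax; the function and the example matrix (which has the trace 4 and determinant -1 mentioned above) are just for illustration, and degenerate cases (determinant or discriminant equal to 0) are not handled.

```python
import numpy as np

def classify(A):
    """Qualitative behavior of x' = Ax from the signs of trace, det, and discriminant."""
    T, D = np.trace(A), np.linalg.det(A)
    if D < 0:
        return "saddle (one positive and one negative eigenvalue)"
    if T**2 - 4*D > 0:
        return "node, " + ("stable" if T < 0 else "unstable")
    return "spiral, " + ("stable" if T < 0 else "unstable")

print(classify(np.array([[3., 1.], [4., 1.]])))  # trace 4, det -1 -> saddle
```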
My quick trick for computing eigenvalues is googling "eigenvalue calculator" or looking up the matlab function doc if I'm working with a really nasty matrix.
Here is his simpler quadratic formula that he mentions towards the end, which is essentially the same technique but without the matrix context. That was based on Po-Shen Loh's video on the same technique (that video is kind of overdramatic, but it's still a nice technique).
It's essentially the same thing. I don't think it'll be really helpful for 2x2 as both ways are easy and can be done quickly once you get used to them. The real problem is when you want to compute for 3x3 or higher order matrices, in which case both methods are equally inefficient and difficult.
Haters gonna hate, but I like the occasional silly trick. I think he's earned the right to be like "Hey, that's neat! I'm going to make a video about it, maybe someone else will think it's neat too!" His original video on eigenvalues and eigenvectors was outstanding, so I can't complain that he didn't explain what they are in this video. This is like an 'oh, by the way' he stumbled across.
If you don't want to use it, don't use it. I probably won't. But remember that math, at its best, is just enjoyable to the mathematician. I enjoyed it.
Why is this allowed but not a question about a derivation in LA? The latter would probably provide just as much conversation and learning opportunity for people not aware. (If not more)
So I was told of eigenvalues in uni but they were never explained. We apparently had done them in systems, but we hadn't. Can someone explain to a design engineer what they are? I am not a mathematician, so don't use math terms too much or it will go right over my head.
And a quick trick for inverting a matrix M: if the characteristic polynomial is, say, x^3 + ax^2 + bx + c, then by Cayley-Hamilton M^3 + aM^2 + bM + cI = 0. Hence M^2 + aM + bI + cM^(-1) = 0, and you get M^(-1) ... except if c = 0, in which case of course M is not invertible.
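A quick numerical check of that Cayley-Hamilton inversion trick; the particular 3x3 matrix is an arbitrary example of my own.

```python
import numpy as np

# Arbitrary invertible 3x3 example
M = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# Coefficients of the characteristic polynomial x^3 + a x^2 + b x + c
_, a, b, c = np.poly(M)

# Cayley-Hamilton: M^3 + a M^2 + b M + c I = 0, so (when c != 0)
# M^-1 = -(M^2 + a M + b I) / c
M_inv = -(M @ M + a * M + b * np.eye(3)) / c

print(np.allclose(M_inv, np.linalg.inv(M)))  # True
```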