OK, this is the lecture
on linear transformations. Actually, linear
algebra courses used to begin with this
lecture, so you could say I'm beginning
this course again by talking about
linear transformations. In a lot of courses, those
come first before matrices. The idea of a linear
transformation makes sense without a matrix, and
physicists and other -- some people like
it better that way. They don't like coordinates. They don't want those numbers. They want to see what's going
on with the whole space. But, for most of us,
in the end, if we're going to compute anything,
we introduce coordinates, and then every
linear transformation will lead us to a matrix. And then, to all the things
that we've done about null space and row space, and
determinant, and eigenvalues -- all will come from the matrix. But, behind it --
in other words, behind this is the idea of
a linear transformation. Let me give an example of
a linear transformation. So, example. Example one. A projection. I can describe a projection
without telling you any matrix, anything about any matrix. I can describe a
projection, say, this will be a linear
transformation that takes, say, all of R^2, every
vector in the plane, into a vector in the plane. And this is the way people
describe, a mapping. It takes every vector,
and so, by what rule? So, what's the rule? Here's the plane, and this is going to be my line, my line through the origin, and I'm going to project
every vector onto that line. So if I take a vector like b
-- or let me call the vector v for the moment -- the projection -- the linear
transformation is going to produce this vector as T(v). So T -- it's like a function. Exactly like a function. You give me an input,
the transformation produces the output. So transformation, sometimes the
word map, or mapping is used. A map between
inputs and outputs. So this is one particular
map, this is one example, a projection that takes
every vector -- here, let me do another vector v,
or let me do this vector w -- what is T(w)? You see? There are no coordinates here. I've drawn those axes, but I'm sorry I drew them, I'm going to remove them. That's the whole point: we don't need axes, we just need the vectors themselves -- get those axes out of there. I'm not a physicist, so I drew those axes. So the input is w, the
output of the projection is, project on that line, T(w). OK. Now, I could think of a
lot of transformations T. But, in this linear
algebra course, I want it to be a
linear transformation. So here are the rules for a linear transformation. There are exactly two operations that we can do on vectors -- adding, and multiplying by scalars -- and the transformation has to do something special with respect to those operations: T(v+w) = T(v) + T(w), and T(cv) = c T(v). So, for example, the projection is a linear transformation, because -- if I wanted to check that second one -- if I took v to be twice as long, the projection would be twice as long. If I changed from v to minus v, the projection would change to minus the projection. So c equal to two, c equal to minus one, any c is OK. And you see that actually those combine; I can combine those into one statement: what the transformation does to any linear combination cv + dw, it must produce the same combination of T(v) and T(w), so T(cv + dw) = c T(v) + d T(w).
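Here is a minimal numerical sketch of that check, assuming Python with numpy -- the line direction and the test vectors are just made-up data:

```python
import numpy as np

a = np.array([2.0, 1.0])       # made-up direction for the line
a = a / np.linalg.norm(a)      # unit vector along the line

def T(v):
    # project v onto the line through the origin spanned by a
    return (a @ v) * a

v = np.array([1.0, 3.0])
w = np.array([-2.0, 0.5])
c, d = 2.0, -1.0

print(T(c*v + d*w))            # T of the combination...
print(c*T(v) + d*T(w))         # ...equals the combination of the T's
```

Both lines print the same vector, which is exactly the one combined statement.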
Let's think about some examples -- I mean, it's not hard to decide whether a transformation is linear or not. Let me give you an example so you can tell me the answer. Suppose my transformation is
-- here's example two: shift the whole plane. So here are all my vectors, my plane, and every vector v in the plane, I shift it over by some fixed vector v0. Shift the whole plane by v0. So every vector in the plane -- this was v -- T(v) will be v+v0. There's T(v). Here's v0. There's the typical v, and there's T(v). You see what this transformation does? It takes this vector and adds a fixed vector to it. Well, that seems like a pretty reasonable, simple transformation, but is it linear? The answer is no, it's not linear. Which law is broken? Maybe both laws are broken. Let's see. If I double the length of v, does the transformation produce something double -- do I double T(v)? No. If I double the length of v, in this transformation I'm just adding on the same v0 -- not two v0s, but only one v0 for every vector -- so I don't get two times the transform. Do you see what I'm saying? If I double this, then the transformation starts there and only goes one v0 out, and doesn't double T(v).
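A quick numerical version of that argument -- the shift vector and the test vector are made up:

```python
import numpy as np

v0 = np.array([3.0, 1.0])      # the fixed shift vector (made-up)

def T(v):
    return v + v0              # shift the whole plane by v0

v = np.array([1.0, 2.0])
print(T(2*v))                  # [5. 5.] -- only one v0 gets added
print(2*T(v))                  # [8. 6.] -- doubling T(v) would need two v0s
print(T(np.zeros(2)))          # [3. 1.] -- and T of zero is not zero
```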
In fact, for a linear transformation -- what is T of zero? That's just like a special
case, but really worth noticing. The zero vector in a
linear transformation must get transformed to zero. It can't move, because,
take any vector v here -- well, you can see why T of zero is zero. Take v to be the zero vector, take c to be three. Then we'd have T of the zero vector equaling three times T of the zero vector, so T of zero has to be zero. OK. So, this example is
really a non-example. Shifting the whole plane is
not a linear transformation. Or if I cooked up some formula
that involved squaring -- also a non-example. How about the transformation that takes any vector and produces its length? So there's a transformation that takes any vector, say, any vector in R^3 -- I'll just get a chance to use this notation again. Suppose I think of the transformation that takes any vector in R^3 and produces its length, a number. So that, I could say, is a member of R^1, for example, if I wanted. Or just real numbers. That's certainly not linear. It's true that the zero vector goes to zero. And if I double a vector, it does double the length, that's true. But suppose I multiply a vector by minus two. What happens to its length? It just doubles. It doesn't get multiplied by minus two. So when c is minus two, I'm not satisfying my requirement. So T of minus v is not minus the length of v -- it's just the length.
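The same kind of check, with made-up numbers, for the length transformation:

```python
import numpy as np

def T(v):
    return np.linalg.norm(v)   # takes a vector in R^3 to a number in R^1

v = np.array([1.0, 2.0, 2.0])  # a made-up vector of length 3
print(T(-2*v))                 #  6.0
print(-2*T(v))                 # -6.0 -- not equal, so not linear
```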
OK, so that's another non-example. Projection was an example; let me give you another example. I can stay here and have a --
this will be an example that is a linear transformation,
a rotation. Rotation by --
what shall we say? By 45 degrees. OK? So again, this will be a mapping from the whole plane of vectors into the whole plane of vectors, and it just -- here is the input vector v, and the output vector from this 45 degree rotation is just: rotate that thing by 45 degrees, and there is T(v). So every vector gets rotated. You see that I can describe this without any coordinates. And see that it's linear: if I doubled v, the rotation would just be twice as far out. If I had v+w, and if I rotated each of them and added, the answer's the same as if I add and then rotate. That's what a linear transformation is.
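Of course, the moment I write code I've already brought in coordinates. Here is a sketch of the 45 degree rotation using the standard rotation matrix, with made-up test vectors:

```python
import numpy as np

theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

def T(v):
    return R @ v               # rotate v by 45 degrees

v = np.array([1.0, 0.0])
w = np.array([0.0, 2.0])
print(T(v) + T(w))             # rotate each, then add...
print(T(v + w))                # ...same as add, then rotate
```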
OK, so those are two examples, projection and rotation, and I could invent more that are linear transformations where I haven't told you a matrix yet. Actually, the book has a
picture of the action of linear transformations -- actually, the cover
of the book has it. So, in this section seven
point one, we can think of a -- actually, here let's take
this linear transformation, rotation, suppose I have, as
the cover of the book has, a house in R^2. So instead of this, let me
take a small house in R^2. So that's a whole lot of points. The idea is, with this
linear transformation, that I can see what it
does to everything at once. I don't have to just
take one vector at a time and see what T of
V is, I can take all the vectors on the
outline of the house, and see where they all go. In fact, that will show me
where the whole house goes. So what will happen with
this particular linear transformation? The whole house will rotate, so
the result, if I can draw it, will be, the house
will be sitting there. OK. And, but suppose I give
some other examples. Oh, let me give some examples
that involve a matrix. Example three -- and
this is important -- coming from a matrix that we always call A. So the transformation will be: multiply by A. There is a linear transformation. And a whole family of them, because every matrix produces a transformation by this simple rule -- just multiply every vector by that matrix -- and it's linear, right? Linear: I have to check that A times v plus w equals Av plus Aw, which is fine, and I have to check that A times cv equals c times Av. Check. Those are fine.
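A one-line check of each rule, with a made-up random matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))   # any matrix gives a linear transformation
v = rng.standard_normal(2)
w = rng.standard_normal(2)
c = 3.0

print(np.allclose(A @ (v + w), A @ v + A @ w))   # True
print(np.allclose(A @ (c * v), c * (A @ v)))     # True
```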
So there is a linear transformation. And if I take my favorite matrix A, and I apply it to all
vectors in the plane, it will produce a
bunch of outputs. See, the idea is
now worth thinking of, like, the big picture. The whole plane is transformed
by matrix multiplication. Every vector in the plane
gets multiplied by A. Let's take an example,
and see what happens to the vectors of the house. So this is still a
transformation from plane to plane, and let me take
a particular matrix A -- well, if I cooked up
a rotation matrix, this would be the right picture. If I cooked up a
projection matrix, the projection would
be the picture. Let me just take
some other matrix. Let me take the matrix
one zero zero minus one. What happens to the house, to all vectors? In particular, we can sort of visualize it if we look at the house -- the house is not rotated this time; what do I get? What happens to all the vectors if I multiply by this matrix? Well, of course, it's an easy matrix, it's diagonal. The x component stays the same, the y component reverses sign. So the roof of that house -- the point, the tip of the roof -- has an x component which stays the same, but its y component reverses, and it's down here. And, of course, what we get is, the house is, like, upside down. Now, where does the door go? I guess the door goes upside down there, right? So here's the input house, and this is the output. OK.
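Here is a sketch of that picture in numpy -- the outline points are a made-up stand-in for the book's house:

```python
import numpy as np

A = np.array([[1.0,  0.0],
              [0.0, -1.0]])

# columns are corner points of a little house (made-up outline)
house = np.array([[-1.0, 1.0, 1.0, 0.0, -1.0],
                  [ 0.0, 0.0, 1.0, 2.0,  1.0]])

flipped = A @ house            # transform every corner at once
print(flipped[:, 3])           # the roof tip (0, 2) lands at (0, -2)
```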
This idea of a linear transformation is, like, the abstract description of matrix multiplication. And what's our goal here? Our goal is to understand
linear transformations, and the way to
understand them is to find the matrix
that lies behind them. That's really the idea. Find the matrix that
lies behind them. Um, and to do that, we have
to bring in coordinates. We have to choose a basis. So let me point out
what's the story -- if we have a linear
transformation -- so start with -- start. Suppose we have a
linear transformation. Let -- from now on, let T stand
for linear transformations. I won't be interested
in the nonlinear ones. Only linear transformations
I'm interested in. OK. I start with a linear
transformation T. Let's suppose its inputs
are vectors in R^3. OK? And suppose its outputs are
vectors in R^2, for example. OK. What's an example of
such a transformation, just before I go any further? Any matrix of the right
size will do this. So what would be the
right shape of a matrix? So, for example -- I'm wanting to give
you an example, just because, here, I'm
thinking of transformations that take three-dimensional
space to two-dimensional space. And I want them to be linear,
and the easy way to invent them is a matrix multiplication. So example, T of
v should be any A v. Those transformations
are linear, that's what 18.06 is about. And A should be what size, what
shape of matrix should that be? I want v to have three components, because that's what the inputs have -- so here's the input in R^3,
and here's the output in R^2. So what shape of matrix? So this should be, I guess,
a two by three matrix? Right? A two by three matrix. A two by three matrix will multiply a vector in R^3 -- you see, I'm moving to coordinates so quickly, I'm not a true physicist here. A two by three matrix will multiply a vector in R^3 and produce an output in R^2, and it will be a linear transformation. OK.
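For instance -- with made-up entries:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # two by three, entries made up

v = np.array([1.0, 0.0, -1.0])    # input in R^3
print(A @ v)                      # output in R^2: [-2. -2.]
```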
So there's a whole lot of examples -- every two by three matrix gives me an example -- and basically, I want to show you that there are no other examples. Every linear transformation
is associated with a matrix. Now, let me come back to the
idea of linear transformation. Suppose I've got this linear
transformation in my mind, and I want to tell
you what it is. Suppose I tell you what
the transformation does to one vector. OK. You know one thing, then. All right. So what I'm speaking about now is: how much information is needed to know the transformation -- to know T of v for all v, all inputs? How much information
do I have to give you so that you know what
the transformation does to every vector? OK, I could tell you what
the transformation -- so I could take a vector
v1, one particular vector, tell you what the
transformation does to it -- fine. But now you only know what
the transformation does to one vector. So you say, OK,
that's not enough, tell me what it does
to another vector. So I say, OK, give me a vector,
you give me a vector v2, and we see, what does the
transformation do to v2? Now, do you only know what the transformation does to two vectors? Have I got to tell you about every vector in the whole input space? Or, knowing what it does to v1 and v2, how much do you now know about the transformation? You know what the
transformation does to a larger bunch of
vectors than just these two, because you know what it does
to every linear combination. You know what it does, now,
to the whole plane of vectors, with bases v1 and v2. I'm assuming v1 and
v2 were independent. If they were dependent,
if v2 was six times v1, then I didn't give you any
new information in T of v2, you already knew it would
be six times T of v1. So you can see where I'm headed. If I know what the transformation does to every vector in a basis, then I know everything. So the information needed to know T of v for all inputs is T of v1, T of v2, up to T of vn, for a basis v1 up to vn -- can I call it an input basis? It's a basis for
the space of inputs. The things that T is acting on. You see this point, that if
I have a basis for the input space, and I tell you what
the transformation does to every one of those
basis vectors, that is all I'm allowed to tell you,
and it's enough to know T of v for all v's. Because why? Because every v is some combination of these basis vectors, c1v1+...+cnvn -- that's what a basis is, right? It spans the space. And if I know what T does to v1, and what T does to v2, up to what T does to vn, then I know what T does to v. By this linearity, it has to be c1 T(v1) + ... + cn T(vn). There's no choice.
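A small sketch of that reconstruction, with a made-up matrix standing in for T:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 1.0]])        # made-up stand-in for a linear T

def T(v):
    return A @ v

v1 = np.array([1.0, 0.0])         # a basis for the input space
v2 = np.array([1.0, 1.0])
c1, c2 = 3.0, -2.0
v = c1*v1 + c2*v2                 # any input is some combination

print(T(v))                       # direct application
print(c1*T(v1) + c2*T(v2))        # rebuilt from T on the basis -- the same
```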
So, the point of this comment is that if I know what T does to a basis -- to each vector in a basis -- then I know the linear transformation. The property of linearity tells me all the other outputs. OK. So in that light we now see what we really need in
a linear transformation, and we're ready to go to a matrix. OK. What's the step
now that takes us from a linear
transformation that's free of coordinates to a
matrix that's been created with respect to coordinates? The matrix is going to come
from the coordinate system. Coordinates mean a basis is decided -- this is where coordinates come from. You decide on a basis, and then every vector has coordinates in that basis. There is one and only one way to express v as a combination of the basis vectors, v = c1 v1 + ... + cn vn, and the numbers c1, ..., cn you need in that combination are the coordinates. Let me write that down. So what are coordinates? Coordinates come from a basis. The coordinates of v are these numbers that tell you how much of each basis vector is in v.
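Finding the coordinates is just solving a linear system: if the columns of B are the basis vectors, solve Bc = v. A sketch with a made-up basis:

```python
import numpy as np

b1 = np.array([1.0, 0.0, 0.0])      # a made-up basis for R^3
b2 = np.array([1.0, 1.0, 0.0])
b3 = np.array([1.0, 1.0, 1.0])
B = np.column_stack([b1, b2, b3])   # basis vectors as columns

v = np.array([3.0, 2.0, 4.0])
c = np.linalg.solve(B, v)           # the one and only coordinate vector
print(c)                            # [ 1. -2.  4.]
print(c[0]*b1 + c[1]*b2 + c[2]*b3)  # rebuilds v
```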
If I change the basis, I change the coordinates, right? Now, we have always been assuming that we're working with the standard basis, right? The basis we don't even
think about this stuff, because if I give you the
vector v equals three two four, you have been
assuming completely -- and probably rightly -- that I
had in mind the standard basis, that this vector was three times
the first coordinate vector, and two times the second,
and four times the third. But you're not entitled -- I might have had some
other basis in mind. This is like the standard basis. And then the coordinates
are sitting right there in the vector. But I could have chosen
a different basis, like I might have had
eigenvectors of a matrix, and I might have said,
OK, that's a great basis, I'll use the eigenvectors
of this matrix as my basis vectors. Which are not necessarily these
three, but some other basis. So that was an example,
this is the real thing: the coordinates are these numbers, I'll circle them again -- the amounts of each basis vector. OK. So, if I want to
create a matrix that describes a linear
transformation, now I'm ready to do that. OK, OK. So now what I plan to do
is construct the matrix A that represents, or tells me
about, a linear transformation, linear transformation T. OK. So I really start with
the transformation -- whether it's a
projection or a rotation, or some strange movement
of this house in the plane, or some transformation from
n-dimensional space to -- or m-dimensional space
to n-dimensional space. n to m, I guess. Usually, we'll have T, we'll
somehow transform n-dimensional space to m-dimensional space,
and the whole point is that if I have a basis for
n-dimensional space -- I guess I need
two bases, really. I need an input basis
to describe the inputs, and I need an output basis
to give me coordinates -- to give me some
numbers for the output. So I've got to choose two bases. Choose a basis v1 up
to vn for the inputs, for the inputs in -- they came from R^n. So the transformation is taking
every n-dimensional vector into some m-dimensional vector. And I have to choose a basis,
and I'll call them w1 up to wm, for the outputs. Those are guys in R^m. Once I've chosen the bases, that settles the matrix -- I'm now working with coordinates. Every vector in R^n, every input
vector has some coordinates. So here's what I do,
here's what I do. Can I say it in words? I take a vector v. I express it in its
basis, in the basis, so I get its coordinates. Then I'm going to multiply those
coordinates by the right matrix A, and that will give me the
coordinates of the output in the output basis. I'd better write that
down, that was a mouthful. What I want -- I want a matrix A that does what
the linear transformation does. And it does it respecting these bases. So I want the matrix to be --
well, let's suppose -- look, let me take an example. Let me take the
projection example. Suppose I take -- because we've got that projection in mind -- I can fit it in here. Here's the projection example. So in the projection example, I'm
thinking of n and m as two. The transformation
takes the plane, takes every vector in the plane,
and, let me draw the plane, just so we remember
it's a plane -- and there's the thing
that I'm projecting onto, that's the line I'm
projecting onto -- so the transformation takes
every vector in the plane and projects it onto that line. So this is projection, so
I'm going to do projection. OK. But, I'm going to choose
a basis that I like better than the standard basis. My basis -- in fact, I'll
choose the same basis for inputs and for outputs, and
the basis will be -- my first basis vector
will be right on the line. There's my first basis vector. Say, a unit vector, on the line. And my second basis vector will
be a unit vector perpendicular to that line. And I'm going to choose
that as the output basis, also. And I'm going to ask
you, what's the matrix? What's the matrix? How do I describe this
transformation of projection with respect to this basis? OK? So what's the rule? I take any vector v,
it's some combination of the first basis vector and the second basis vector: v = c1 v1 + c2 v2. Now, what is T of v? Suppose the input is -- well,
suppose the input is v1. What's the output? v1, right? The projection leaves
this one alone. So we know what the projection
does to this first basis vector, this guy, it leaves it. What does the projection do
to the second basis vector? It kills it, sends it to zero. So what does the projection
do to a combination? It kills this part, and
this part, it leaves alone. Now, all I want to do
is find the matrix. I now want to find
the matrix that takes an input, c1
c2, the coordinates, and gives me the output, c1 0. You see that in this basis,
the coordinates of the input were c1, c2, and the coordinates
of the output are c1, 0. And of course, it's not hard to find a matrix that will do that. The matrix that will do that is the matrix one, zero, zero, zero. Because if I multiply the input by that matrix A -- this is A times the input coordinates -- I'm hoping to get the output coordinates. And what do I get from that multiplication? I get the right answer, c1 and zero.
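In code, the whole transformation, in this basis, is that little matrix acting on coordinates -- the coordinates here are made up:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 0.0]])      # the projection in the eigenvector basis

c = np.array([5.0, 7.0])        # made-up input coordinates (c1, c2)
print(A @ c)                    # output coordinates: [5. 0.]
```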
So what's the point? The first point is, there's a matrix that does the job. If there's a linear transformation out there, coordinate-free, no coordinates, and then I choose a basis for the inputs and a basis for the outputs, then there's a matrix that does the job. And what's the job? It multiplies the input
coordinates and produces the output coordinates. Now, in this example
-- let me repeat, I chose the input basis to be the same as the output basis. The input basis and output basis were both along the line, and perpendicular to the line. They're actually the eigenvectors of the projection. And, as a result, the matrix came out diagonal. In fact, it came out to be Lambda. This is, like, the good basis. So the eigenvector basis is the good basis; it leads to the diagonal matrix of eigenvalues, Lambda. And just as in this example,
the eigenvectors and eigenvalues of this linear transformation
were along the line, and perpendicular. The eigenvalues
were one and zero, and that's the
matrix that we got. OK. So that's, like, the great choice -- the choice a physicist would make when he or she finally had to bring coordinates in, unwillingly: the good coordinates to choose are the eigenvectors. Because -- what if I did this
projection in the standard basis -- which I could do, right? I could do the whole thing
in the standard basis -- I'd better try, if I can do that. So I'll have to tell you now which line we're projecting on. Say, the 45 degree line. So say we're projecting onto the 45 degree line, and we use not the eigenvector
basis, but the standard basis. The standard basis, v1, is
one, zero, and v2 is zero, one. And again, I'll use the
same basis for the outputs. Then I have to do this -- I can find a matrix,
it will be the matrix that we would
always think of, it would be the projection matrix. It will be, actually, it's the
matrix that we learned about in chapter four, it's
what I call the matrix -- do you remember, P was A A-transpose over A-transpose A, where A has just the one column along the line? And I think, in this example, it will come out one-half, one-half, one-half, one-half. I believe that's the matrix that comes from our formula. And that's the matrix that will do the job. If I give you this input, one, zero, what's the output? The output is one-half, one-half. And that should be the right projection. And if I give you the input zero, one, the output is, again, one-half, one-half -- again the projection.
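We can confirm that with the formula itself:

```python
import numpy as np

a = np.array([1.0, 1.0])          # column along the 45 degree line
P = np.outer(a, a) / (a @ a)      # the chapter-four formula
print(P)                          # [[0.5 0.5], [0.5 0.5]]
print(P @ np.array([1.0, 0.0]))   # [0.5 0.5]
print(np.allclose(P @ P, P))      # True: P squared equals P
```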
So that's the matrix -- but not diagonal, of course, because we didn't choose a great basis, we just chose the handiest basis. Well, so the course
has practically been about the handiest
basis, and just dealing with the matrix that we got. And it's not that bad a
matrix, it's symmetric, and it has this P
squared equal P property, all those things are good. But in the best basis, it's easy
to see that P squared equals P, and it's symmetric,
and it's diagonal. So that's the idea
then, is, do you see now how I'm associating a
matrix to the transformation? I'd better write the rule down,
I'd better write the rule down. The rule to find the matrix A. All right, first column. So, a rule to find A,
we're given the bases. Of course, we don't -- because
there's no way we could construct the matrix until
we're told what the bases are. So we're given the input basis,
and the output basis, v1 to vn, w1 to wm. Those are given. Now, in the first column of
A, how do I find that column? The first column of the matrix. So that should tell me what
happens to the first basis vector. So the rule is, apply the
linear transformation to v1. To the first basis vector. And then, I'll write it --
so that's the output, right? The input is v1,
what's the output? The output is in
the output space, it's some combination
of these guys, and it's that combination that
goes into the first column. Let me say it in words again. How to find this matrix: take the first basis vector, and apply the transformation. Then it's in the output space, T of v1, so it's some combination of these outputs, this output basis. The coefficients in that combination will be the first column: T(v1) = a11 w1 + a21 w2 + ... + am1 wm. Those numbers a11, a21, ..., am1 are the numbers in the
first column of the matrix. Let me make the point by
doing the second column. Second column of A. What's the idea, now? I take the second basis vector,
I apply the transformation to it, that's in --
now I get an output, so it's some combination
in the output basis -- and that combination is the
bunch of numbers that should go in the second column
of the matrix. OK. And so forth. So I get a matrix,
and the matrix I get does the right job. The matrix is constructed that way, following the rules of matrix multiplication, and the result is that if you give me the input coordinates and I multiply by the matrix, then A times the input coordinates correctly reproduces the output coordinates. Why is this right? Let me just check
the first column. Suppose the input coordinates
are one and all zeros. What does that mean? What's the input? If the input coordinates are one and the rest zeros, then the input is v1, right? That's the vector that has
coordinates one and all zeros. OK? When I multiply A by
the one and all zeros, I'll get the first column of
A, I'll get these numbers. And, sure enough, those are the
output coordinates for T of v1. So we made it right on the first column, we made it right on the second column, we made it right on all the basis vectors, and then it has to be right on every vector. So there is the picture of the matrix for a linear transformation. OK.
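That rule translates directly into code. Here is a sketch -- matrix_of is a hypothetical helper, not anything standard -- which applies T to each input basis vector and solves for its coordinates in the output basis:

```python
import numpy as np

def matrix_of(T, input_basis, W):
    # column j holds the coordinates of T(v_j) in the output basis:
    # solve W a_j = T(v_j), where the columns of W are the output basis
    cols = [np.linalg.solve(W, T(vj)) for vj in input_basis]
    return np.column_stack(cols)

# try it on projection onto the 45 degree line, standard bases both sides
a = np.array([1.0, 1.0])
T = lambda v: (a @ v) / (a @ a) * a
I2 = np.eye(2)
A = matrix_of(T, [I2[:, 0], I2[:, 1]], I2)
print(A)                          # [[0.5 0.5], [0.5 0.5]] again
```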
Finally, let me give you another -- a different linear transformation: the linear transformation that takes the derivative. That's a linear transformation. Suppose the input space is all combinations c1 plus c2 x plus c3 x squared. So the basis is these simple functions: one, x, and x squared. Then what's the output? The output is the derivative,
so the output is c2+2c3 x. And let's take as output
basis, the vectors one and x. So we're going from a
three-dimensional space of inputs to a
two-dimensional space of outputs by the derivative. And I don't know
if you ever thought that the derivative is linear. But if it weren't linear,
taking derivatives would take forever, right? We are able to compute
derivatives of functions exactly because we know it's
a linear transformation, so that if we learn the
derivatives of a few functions, like sine x and cos
x and e to the x, and another little
short list, then we can take all their
combinations and we can do all the derivatives. OK, now what's the matrix? What's the matrix? So I want the matrix to
multiply these input vectors -- input coordinates, and give
these output coordinates. So I just think, OK, what's
the matrix that does it? I can follow my rule
of construction, or I can see what the matrix is. It should be a two by
three matrix, right? And the matrix -- so I'm just figuring
out, what do I want? No, I'll -- let
me write it here. What do I want from my matrix? What should that matrix do? Well, I want to get c2
in the first output, so zero, one, zero will do it. I want to get two c3, so
zero, zero, two will do it. That's the matrix for
this linear transformation with those bases and
those coordinates. You see, it just clicks, and the
whole point is that the inverse matrix gives the inverse to
the linear transformation, that the product of two
matrices gives the right matrix for the product of
two transformations -- matrix multiplication
really came from linear transformations. I'd better pick up on that
theme Monday after Thanksgiving. And I hope you have
a great holiday. I hope Indian
summer keeps going. OK, see you on Monday.