The following content is
provided under a Creative Commons license. Your support will help
MIT OpenCourseWare continue to offer high quality
educational resources for free. To make a donation or
view additional materials from hundreds of MIT courses,
visit MIT OpenCourseWare at ocw.mit.edu. HERBERT GROSS: Hi. Today we're going to begin
our study of higher order differential equations. In particular,
we're going to talk about a special equation
called the linear differential equation, which
came up, by the way, you'll notice, in first order
equations as a special type. But we're going to
generalize that. And the only liberty I'm going
to take during the lecture is to restrict our study to the
case of differential equations of order two so that we won't
have unwieldy expressions all over the board. The idea is that what
happens in the case of n equals 2 happens
for all orders n, except for a little
modification in the algebra. But we'll talk about that
more in the exercises. At any rate, for the
time being we simply call today's lecture Linear
Differential Equations. And now I must
define for you what I mean by a linear
differential equation. And I'll motivate this
more as we go along. For the time being, a
linear differential equation is simply one which has
this very special form. What you have is a y double
prime term appearing, then some function of x alone
multiplying y prime, plus some function of x alone
multiplying y, and on the right some function of x alone. And an analogous result
would hold for higher order. In other words, you have
the various derivatives of y appearing, y by itself. The coefficients are always
functions of x alone. And neither y nor any of its
derivatives is raised to any power other than the first. In particular, notice here--
it's a very small point-- but notice here that I
assume that the leading coefficient here is 1. For example, somebody
might have said, couldn't you have
had some function of x times y double prime? The answer is yes, I could have. But I'm assuming that
if I had a function, say, r of x multiplying
y double prime, I could have
multiplied-- divided both sides of this equation
through by that coefficient and wound up with an equation
of this particular type. In other words, without loss
of generality-- and this is very important, though. Many of my theorems are
going to be messed up a little bit if the coefficient
of y double prime isn't 1. In other words, the algebra will
become a little bit tougher. I'll speak about that later in
the lecture if I remember to. But the idea is,
for the time being this is all we mean by a
linear differential equation. And perhaps the best way
to emphasize what we really mean by linear
differential equation is to show what we mean by
a nonlinear differential equation-- a differential
equation which is not linear. For example, y double
prime plus y prime squared plus y equals sine x
would not be a linear equation, because the y prime term is
being raised to a power other than the first power. See, it's squared. y double prime plus y times
y prime equals e to the x is not a linear equation,
because the multiplier of y prime is y as opposed to
just a function of x, which is what the definition calls for. y double prime plus x y prime
plus y squared equals x cubed is not linear. And the reason that
it's not linear is because y appears to a
power higher than the first. Maybe I should circle
the things that prevent these things from
being linear so that you can see what's happening here. y double prime plus e to the x y
prime plus x cubed y equals tan y is not linear, because even
though the left-hand side is fine, notice that the function
on the right hand side is a function of y-- depends on y rather
than just x alone. And to finish this off, an
example of a linear equation would be what? Well, almost this one-- y double prime plus e
to the x y prime plus x cubed y equals tangent x. You see a y double prime term,
y prime to the first power being multiplied by a
function of x alone, y to the first power being
multiplied by a function of x alone, and the right-hand
side being a function of x explicitly by itself. And notice, by the
way, don't get caught in a psychological hangup here. When I say linear, it's
modifying the equation, not what the
coefficients look like. In other words, certainly
we don't think of e to the x as a linear function. We don't think of x
cubed as being linear. We don't think of tangent
x as being linear. The coefficients do not have
to be linear functions of x. They have to be
functions of x alone. It's the equation
that's called linear. And maybe the best
way to explain that is in the following way. Let's look at y double prime
plus p of x times y prime plus q of x times y. My claim is I can think of
this as a function machine where the input is y, and the
machine is told, look it-- whatever y comes in,
differentiate it twice, add on p of x times the
first derivative plus q of x times the function,
and that will be the output. In other words, I could
think of a machine where the input of
the machine is y-- and let me call the machine the
L machine to emphasize the word Linear-- see, what L does is this. The domain of L are functions
which are twice differentiable. After all, for
this to make sense I have to be able to
differentiate y at least twice. So the input of the L machine is
a twice differentiable function of x, and the output
is some function of x. You see y prime, y double
prime are functions of x, p and q are functions of
x, y is a function of x. All these things combine, then,
to give you a function of x. Let me show how this
machine works for example. If the L machine is
y double prime plus e to the x y prime
plus x cubed y-- see, L of y equals this-- if I feed in sine
x to the L machine, what does the L machine do? It takes sine x, it
differentiates it twice, adds on e to the x times
the first derivative, and x cubed times the function
itself, which is sine x. Going through this operation,
which is trivial, L of sine x would be the quantity x cubed minus 1 times sine
x, plus e to the x cosine x. By the way, to
help you understand what we mean by a solution,
notice that what we're saying is that if we were to refer back
to this particular equation, y equals sine x would not be
a solution of this equation, because when I feed sine
x into the L machine, I do not get tangent x, which
is what a solution would mean. In other words, in terms of our
new notation, this says L of y equals tan x. So anything which
is a solution means if I shove in y to the
machine, the output would have to be tan x. To look at this from a
different perspective, if the right-hand side of
my equation had been this-- in other words, if tan
x had been replaced by the quantity x cubed minus 1 times sine x
plus e to the x cosine x, then y equals sine x
would have been a solution of this particular equation. But enough about that. And we'll emphasize that
more in the exercises. The key point as to why
the word "linear" is used is that linear modifies our L
machine, not the coefficients of the differential equation. See, what did
"linear" mean when we were dealing with ordinary
linear change of variables back in block four of the
course, when we talked about u equals ax
plus by and talked about linear approximations? "Linear" meant,
if the mapping was linear, L of a constant
times the input would be the constant
times L of the input. In other words, you could
factor the constant out. Notice, by the way, if I
feed cu into my L machine, what will the output be? See, be very careful here. Don't be blinded by the y. y stands for the
placeholder, the input. In other words, L of anything
is the second derivative of that anything plus p times
the first derivative plus q times that input, the
anything that I put in here. So notice that L of cu
would be the quantity cu double prime plus p of x times
cu prime plus q of x times cu. In other words, L of cu is
equal to this expression here. Notice that differentiating a
constant times a function of x means that we skip
over the constants and just differentiate
the function of x. In other words,
differentiating this twice would just be the
constant times u double prime. I can take the
constants out here, I can take the
constants out here. In other words, c factors out. That's where linearity comes in. This is linear in c. I factor the c out,
what's left is what? u double prime plus
p u prime plus qu. That's what we're
calling L of u. So L of c times u
is c times L of u, which is a linear property. And by the way, again
let me emphasize why the definition of linearity
was given the way it was. For example-- I just
said here that it's not true of nonlinear--
but for example, suppose I had an
equation that had y prime multiplied
by a function of y rather than a function of x. In other words, let's look at
L of y equals y times y prime. If I now replace y
by c times u, this becomes what? y is
replaced by c times u, y prime is replaced by
the derivative of cu. Well, what is the derivative
of cu if c is a constant? It's c times u prime. In other words,
the right-hand side here is just c squared
u times u prime. u times u prime is L of u. In other words, in this example
L of cu would not be c L of u. It would be c
squared L of u, which is not the linear property
that we're talking about. Finally, the other
property of linearity is that if I replace my
input by the sum of two differentiable functions, u1
and u2, then L of u1 plus u2 turns out to be L
of u1 plus L of u2. And the proof, again, is just
by looking at the definition. If I feed u1 plus u2 into
my L machine, I have what? u1 plus u2 double
prime plus p of x times u1 plus u2 prime plus q
of x times u1 plus u2. The derivative of the sum is
the sum of the derivatives, so I can split this up into
u1 double prime plus u2 double prime. Similarly, this is p
u1 prime plus p u2 prime. This is q u1 plus q u2. I group the
u1 terms together, the u2 terms together. This by definition is L of u1. This, by definition
of L, is L of u2. In other words, L of u1 plus
u2 is L of u1 plus L of u2. By the way,
conditions one and two can be stated equivalently
by the one statement L of c1u1 plus c2u2 is c1
L of u1 plus c2 L of u2. The proof will be
left as an exercise. It's a fairly
trivial observation. And I think, as
you may remember, that we did something at least
similar to this in block four when we talked about
linear mappings when we were mapping the xy
plane into the uv plane, OK? But at any rate, let's
see what all this means. Why are these
properties so important? And so let's call our subtopic
Properties of Linear Equations. And the first thing that
I'd like to point out is that if the right-hand side
of our linear differential equation is 0, the
interesting fact is that any linear combination
of solutions of that equation is again a solution. In other words, if I find some
function u1 of x that satisfies the equation L of y equals 0--
in other words L of u1 is 0-- and I also know
that L of u2 is 0, then the amazing thing is that
L of c1u1 plus c2u2 is also 0. In other words, any linear
combination of u1 and u2, where u1 and u2 are
solutions of L of y equals 0, will also be a solution
of L of y equals 0. And the proof,
again, is trivial. Namely, by the
properties of linearity, L of c1u1 plus c2u2 is what? It's c1 L of u1 plus c2 L of u2. L of u1 is 0. L of u2 is 0-- that
was given, you see. Consequently, this is 0. See, a constant times 0 plus
a constant times 0 is 0. And that's exactly what we
wanted to show over here. A second important fact is
that if I have a solution of L of y equals 0--
say, L of u is 0-- and I also have a solution v
of the equation L of y equals f of x-- in other words, if L of
u is 0 and L of v is f of x-- the amazing thing is, that
if I add these two solutions together, the
resulting function will be a solution of the equation
L of y equals f of x. In other words, if L of u
is 0 and L of v is f of x, then L of u plus v is
also equal to f of x. What this means
is that, whenever I add onto a solution of
this equation, any solution of this equation
I again get back a solution of this equation. The proof, again, by definition
of linearity is trivial. Namely, by linearity L of u
plus v is L of u plus L of v. But we're given
that L of u is 0. We're given that
L of v is f of x. And consequently, 0
plus f of x is f of x. So L of u plus v is
f of x, as asserted. By the way, let me
point out here-- do not overlook the
power of linearity. There is a very big danger
that what you might say here is, wasn't this result true
just by adding equals to equals? In other words, couldn't I
just add these two results? And the answer is yes, you can. But when you add
equals to equals here, notice that what
you get is what? L of u plus L of v is
equal to 0 plus f of x. In other words, what you can
prove by equals added to equals is that L of u plus
L of v is f of x. It was linearity that allowed us
to say that L of u plus L of v was the same as L of
u plus v. And to show you that in terms
of a simple analogy, let's suppose I
take the function f to be defined by f of x is
x minus 2 times x minus 3. Well, trivially, when x is 2
or x is 3, f of x is 0, right? In other words, f of
2 is 0, f of 3 is 0. Now, by equals
added to equals, you can certainly say that f
of 2 plus f of 3 is zero. But you can't say that f of
the quantity 2 plus 3 is 0. In fact, 2 plus 3 is 5. If I put 5 in here, I get what? f of 5 is 5 minus
2 times 5 minus 3. That's 3 times 2, or 6. In other words, f of 2 plus
f of 3 is 0, but f of 2 plus 3 is not 0, it's 6. You see, equals added to
equals gives you this result. But it's linearity
that you need to be able to get from here to here. This was not a linear
function, you see. Well, let me give you an example
of how to use all this stuff. Let's find all solutions of
the differential equation y double prime minus 4 y
prime plus 3y equals 0. Special case where the
right-hand side is 0. My coefficients of y
and y prime, notice, are still functions of x. They happen to be constant,
but the case where p of x and q
of x are constants is certainly not outlawed. In other words, one special
case of a function of x is the function of x, which
is identically a constant. So this qualifies as
a linear equation. Let me show you, then,
how I tried to find all solutions of this equation. As we've mentioned before
both in part one of our course and as a motivation for
defining e to the ix, a trial solution
of this equation involves using e to the rx. Because we differentiate e to
the rx, you get e to the rx back again. If I differentiate e to the
rx, I get r e to the rx. Differentiate again, I
get r squared e to the rx. I plug that into here,
factor out the e to the rx, and I wind up with e to
the rx times r squared minus 4r plus 3 must equal 0. Since e to the rx
is not 0, it must be that r squared
minus 4r plus 3 is 0. And that says that
either r is 1 or r is 3. Remembering what r is, it means
that my trial solutions should be correct when
r is 1 or r is 3. Leaving the details
for you to verify, it does turn out that
L of e to the x is 0-- see, r is 1-- L of e to the 3x-- r is 3-- is 0. In other words, if I were
to replace y by e to the x or by e to the 3x, this
equation is satisfied. And in terms of our
definition of L, this is the abbreviation
for writing this. Well, by our first property, the
fact that this satisfies L of y equals 0 and this satisfies
L of y equals 0, we had what? Any linear combination of
these two must satisfy L of y equals 0. In other words, I now
know in one fell swoop that every function of the
form c1 e to the x plus c2 e to the 3x must also be a
solution of this equation. OK? By the way, that's
another motivation-- and why we'll be doing
this in block eight-- for going further into the
study of vector spaces. Notice, in a sense
what you're saying is, you have found a whole
family of solutions which are linear
combinations of e to the x and e to the
3x that somehow or other e to the 3x and e to
the x behave like i and j did in the plane. Namely, every solution of this
type can be written as what? A constant times e to the x plus
a constant times e to the 3x. This is like a
two-dimensional vector space. But I'm not going to pursue
that any further right now. I just wanted to give you a
preview of coming attractions. But at any rate, to summarize
what we've done so far is that we have now found
that one family of solutions of that equation is y
equals c1 e to the x plus c2 e to the 3x. I emphasize "one family,"
because so far, these are the solutions I found by
assuming that the solution had to have the form
y is e to the rx. I don't know as
yet whether there are other types of solutions. I'll worry about that
in a little while. At any rate, the next most
natural question to ask here is this-- look it. I have two arbitrary constants. And because I have two
arbitrary constants, it seems to me that, just
like in the first order case, not only should I be able
to find a solution that passes through a
particular point, but I have another degree of
freedom to play around with. Maybe I can require, not
only that the curve pass through a given point, but that
it have a particular slope when it passes through that point. In other words,
with one constant I could make it pass
through a point. With two, maybe I could make
it pass through a point having a given slope. So the question now is,
can I determine c1 and c2, such that for a
given point x0, y0 I can find a curve, a
member of this family, which passes through
the point x0, y0, and has slope equal to z0. Why I use z0 instead of
m here will become clear, I hope, in a few moments. But the point is, to
see if I can satisfy this system of
equations, let's see what happens if I replace
x and y by x0 and y0. One of my equations that must be
satisfied by c1 and c2 is this. The derivative of
this, which is a slope, is c1 e to the x plus
3 c2 e to the 3x. So if the slope is going to
be z0, when x is equal to x0 this equation must
also be obeyed. Consequently, I must now solve-- see if I can solve these
two equations for c1 and c2. By the way, this is very
important to notice. Notice that this is two
equations and two unknowns, and that my unknowns
are c1 and c2-- that once I specify
x0, y0, and z0, everything else is a known
number in this problem. To see whether this
has a unique solution, the determinant of coefficients
must be unequal to 0. But look at what that
determinant of coefficients is. Its first row is e to the x0, e to the 3x0; its second row is e to the x0, 3 e to the 3x0. That determinant is what? 3e to the 4x0
minus e to the 4x0. That's twice e to the 4x0. And since the exponential
can never be negative-- can never be 0, this
determinant cannot be 0. In other words, there is a
unique member of the family y equals c1 e to the x
plus c2 e to the 3x that passes through the point
x0, y0 with a given slope z0. That's what we prove over here. The only question
that comes up is, is that we have now shown what? That at every point
in space there is one solution from
this family that passes through the given
point with any given slope that you wanted to have. The question is
that before we can call this the
general solution, we have to be sure
that there are no other solutions
to the equation y double prime minus 4 y
prime plus 3y equals 0. In other words, the
question is are there other types of solutions-- solutions that
aren't of the form e to the x, or e to the 3x, or
linear combinations thereof? And the crucial theorem is this. Suppose we can write our second
order equation in the form y double prime is some function
of x, y, and y prime. And notice how analogous
this is to the key theorem of our last lecture. Suppose when we treat F
as a function of the three independent variables
x, y, and z, it turns out that
F, F sub y, F sub z, are all continuous
in some region R. Then the amazing result
is that for each triplet-- x0, y0, z0, in R-- see, R is
three-dimensional here, because F is defined
on three space-- there is a unique solution curve
which passes through the point x0, y0 with slope equal to z0. Again, notice here
the coding system. We are not talking
about a solution passing through x0, y0, z0. The solution curve
is in the plane. See, dy dx is a slope
of a curve on the plane. We have one
independent variable. What we're saying is what? That there is a unique solution
that passes through the point x0, y0 with slope equal to z0. Under those conditions, the
solution would be unique. Well, let's accept the
truth of this theorem and apply it to a linear
differential equation. Given the most general
linear differential equation, to put it into the form
of the crucial theorem I transpose everything
but y double prime onto the right-hand
side of the equation. I get y double prime
is f of x minus p of x y prime minus q of xy. Therefore, my
capital F of x, y, z is obtained simply by
replacing y prime by z in here. See, I simply replace y prime
by z, like I did over here. I wind up with what? Capital F of x, y, z is little
f of x minus pz minus qy. Well look it. This is certainly a continuous
function as soon as f, p, and q are continuous. The partial of capital
F with respect to y-- remember, I'm treating x, y,
and z as independent variables. If I differentiate
with respect to y, notice that this drops out. The partial of capital F with
respect to y is just minus q of x. The partial of capital F with
respect to z is just minus p of x. Consequently, if
little f, p, and q are all continuous functions
of x, automatically capital F, capital F sub y, capital F sub
z will be continuous functions. Consequently, according
to that crucial theorem, any solution that I find
will be a unique solution. And even more to the point, even
if I can't find the solution, the theorem tells me that
there is a unique solution. Well, I think that the p's
and the q's sort of get you a little bit messed up. Let's do a specific
illustration of this again. Let's find now all
solutions of the equation y double prime minus 4 y prime
plus 3y equals e to the 2x. The same equation as before,
only the right-hand side is e to the 2x rather than 0. The key result is
going to be this. I already know how to
solve this equation when the right-hand side is 0. Remember, one of my properties
of linearity was that if I could find any solution
of this equation, then by adding onto it any
solution of the equation where the right-hand side is 0-- by the way, when the
right hand side is 0, the equation is
called homogeneous. And I don't think that's
too important now, other than to know the language
when it's mentioned, but this is called homogeneous
if the right-hand side is 0. What our key theorem
says is, look it-- we have solved this problem. We have found the
general solution when the right-hand side is 0. Consequently, if I could
find just one solution of this equation, by hook
or by crook, just steal one, sneak one in. If I can find one
solution of that equation, if I add that onto
the general solution of the homogeneous
equation, that will be the general
solution of this equation. What does that mean? Let's see how we'll tackle this. What I'm going to do is,
I look at this equation. And here's how I get sneaky. I say, look it. When I differentiate, and
I'm all through, what I want to wind up with is e to the 2x. Well, the only function
whose derivative gives you a factor of e
to the 2x, more or less, is some constant times
e to the 2x itself. So I say to myself, let me try
for a solution in the form y equals some constant
times e to the 2x. See, my trial solution
will have this form. Well, yT prime would be this,
yT double prime would be this. If I now substitute back
into the original equation, yT double prime minus
4 yT prime plus 3yT must be identically
equal to e to the 2x. That leads to the fact
that minus A e to the 2x must be the same as e to the 2x. And that tells me that
if there is a solution, A had better be minus 1. You see, if x is 0 here,
this is minus A equals 1. A quick check will show
that minus e to the 2x is a solution of the equation. So a particular solution of this
equation is minus e to the 2x. I've found one solution. Now, here's the power
of all this theory. With this one
solution, I now go back to the homogeneous
equation, which had as its general solution c1
e to the x plus c2 e to the 3x. I add these two together, and
that is the general solution. That is the general
solution of the equation that I started with. By the way, notice
just as a check-- if this were the
general solution, I should be able to find
unique values for c1 and c2 that allow me to
make the curve-- a solution curve pass through
x0, y0 with slope equal to z0. Notice that that would result
in this system of equations. And I hope that you notice
that since x0, y0, and z0 are constants, to determine
whether c1 and c2 are uniquely determined or
not hinges on the fact that the coefficient matrix,
the determinant of coefficients, is still the same
determinant of coefficients that I had in the
homogeneous case. By the way, let me just
stress one more point that I forgot to
mention over here. When I wrote down
this equation, notice that for the fundamental
theorem to be true, all that we required was that p,
q, and little f be continuous. Notice that p in this
problem is minus 4, q is 3, and f is e to the 2x. Certainly, the function f of-- the function which is
identically minus 4, the function which
is identically 3, and the function
which is e to the 2x are all continuous functions. So what that told me
was that once I find one family of solutions,
I've found them all, so in the linear case,
there is a general solution. There are no singular solutions
in a problem of this type. There can't be any
mongrel solutions, because this particular
equation meets the requirements of the crucial theorem. At any rate, the exercises
will take care of this in giving you drill. Let me simply
summarize what we've done today in terms of
linear differential equation. So summary is this. Let's suppose that
we're given a general-- second order is what
I've been dealing with, but it's true for
higher orders, too-- linear equation of the
form y double prime plus p y prime plus qy equals f
of x, where p and q are arbitrary functions of x. What I'm saying is to
find a general solution of this equation-- and by the crucial theorem,
this general solution will exist if f, p,
and q are continuous-- first of all what I do is I
find the general solution y sub h of the homogeneous
equation L of y equals 0. In other words, I simply
replace the right-hand side of this equation by 0 and
find the general solution of this equation. In terms of the theory,
what we're saying also is-- and this I'll say
this as an aside-- if I can find two solutions
of this equation which are not scalar multiples
of one another-- in other words, if I can
find two functions u1 and u2 such that L of u1
and L of u2 are 0-- but that u2 is not
a constant times u1, then that homogeneous
solution-- in other words, the solution of this equation-- will simply be c1u1 plus c2u2. See, that was all linear
combinations of these two. And the proof of
that is quite simple. Namely, what we want is what? Given the point x0,
y0, we want to be sure that we can determine c1 and
c2 such that a curve passes through x0, y0 with slope z0. That means I must
be able to do what? I must be able to solve
this pair of equations. Namely, if I replace
x by x0 and y by y0, and if I differentiate
the equation and then replace
x by x0 and dy dx by z0, because that's the
derivative, the slope I want at this point, I
again wind up with what? Two equations with two unknowns. Remember, once I
specify x0, y0, and z0-- these are given constants, c1
and c2 are my only variables-- to see whether there is
a unique solution here, it means that this
determinant of coefficients must be unequal to 0. The determinant of coefficients
is just u1 u2 prime minus u1 prime u2. That must be unequal to 0. This suggests the
quotient rule again. This expression
to be unequal to 0 is the same as this divided by
u1 squared to be unequal to 0. This expression here is nothing
more than a derivative of u2 divided by u1 with respect to x. And for that to be unequal
to 0 simply says what? That u2/u1 is not a constant. And that says that u2 is
not a constant times u1. In other words, this
is just an aside to show you how you find
the general solution of the homogeneous equation. But the point is what? First of all, you find
the general solution of the homogeneous equation,
meaning you take L of y equals f of x,
replace f of x by 0, and find the general
solution of L of y equals 0. After you've done that,
you find any old solution, y sub p, of the given
equation L of y equals f of x. Just one solution,
by hook or by crook. Then finally, when you
have the general solution of the reduced-- the
homogeneous equation and the particular
solution of this equation, the general solution
of this equation will just be the
sum of these two. All right? And that's what
we're going to be talking about in the
next few lessons, see. What we're going to drill
on is what this stuff means. And I think you can now guess
what the next lectures are going to be about. After all, since to find
the general solution of the linear
differential equation I need the general solution
of the homogeneous equation and a particular solution of
the original equation, the two separate topics I
now have to tackle are, one, how do I find the
general solution of the reduced equation; and two, how do I
find a particular solution of the original equation? That, by the way, comes under
the heading of cookbook again. That's drill. But this is the
underlying theory, the underlying philosophy. We will talk more about how
to handle the techniques in our subsequent lectures. At any rate, then, until
next time, goodbye. Funding for the
publication of this video was provided by the Gabriella
and Paul Rosenbaum Foundation. Help OCW continue to provide
free and open access to MIT courses by making a donation
at ocw.mit.edu/donate.
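
As a numerical recap of the worked example in this lecture, here is a minimal Python sketch; it is not part of the original lecture. It assumes the general solution y = c1 e^x + c2 e^(3x) - e^(2x) that the lecture arrives at for y'' - 4y' + 3y = e^(2x), checks that this family satisfies the equation, and solves the two equations in the two unknowns c1 and c2 by Cramer's rule. The function names (y_general, dy_general, d2y_general, residual, solve_constants) are illustrative choices, not anything defined in the lecture.

```python
import math

# Hypothetical helper names; the lecture carries out these steps by hand.

def y_general(x, c1, c2):
    """y = c1 e^x + c2 e^(3x) - e^(2x): homogeneous family plus particular solution."""
    return c1 * math.exp(x) + c2 * math.exp(3 * x) - math.exp(2 * x)

def dy_general(x, c1, c2):
    """First derivative of y_general with respect to x."""
    return c1 * math.exp(x) + 3 * c2 * math.exp(3 * x) - 2 * math.exp(2 * x)

def d2y_general(x, c1, c2):
    """Second derivative of y_general with respect to x."""
    return c1 * math.exp(x) + 9 * c2 * math.exp(3 * x) - 4 * math.exp(2 * x)

def residual(x, c1, c2):
    """L(y) - e^(2x) with L(y) = y'' - 4y' + 3y; should be zero up to rounding."""
    return (d2y_general(x, c1, c2)
            - 4 * dy_general(x, c1, c2)
            + 3 * y_general(x, c1, c2)
            - math.exp(2 * x))

def solve_constants(x0, y0, z0):
    """Find c1, c2 so the curve passes through (x0, y0) with slope z0."""
    # Move the particular solution's contribution to the right-hand side.
    rhs1 = y0 + math.exp(2 * x0)
    rhs2 = z0 + 2 * math.exp(2 * x0)
    # Coefficient matrix: the same one as in the homogeneous case.
    a, b = math.exp(x0), math.exp(3 * x0)
    c, d = math.exp(x0), 3 * math.exp(3 * x0)
    det = a * d - b * c  # equals 2 e^(4 x0), which is never zero
    c1 = (rhs1 * d - b * rhs2) / det
    c2 = (a * rhs2 - c * rhs1) / det
    return c1, c2

if __name__ == "__main__":
    c1, c2 = solve_constants(0.0, 1.0, 0.0)
    print(c1, c2)
    print(residual(0.7, c1, c2))
```

For instance, with x0 = 0, y0 = 1, z0 = 0 the system reduces to c1 + c2 = 2 and c1 + 3 c2 = 2, so c1 = 2 and c2 = 0, and the solution curve is y = 2 e^x - e^(2x).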