Welcome to another Mathologer video. People are
always going on about how hard and complicated and how terribly high-level calculus
is. That’s actually only partially true. When you look at it in just the right
way, the core of calculus is actually very simple and straightforward, really
not much harder than basic algebra. In fact, when I was only 13 or 14 years old I
got introduced to calculus via the book “Calculus made easy” by Silvanus P. Thompson, a book which
is all about showing how easy calculus can be. It’s an amazing book and really worked
for me. And I am not the only one. Published in 1910, this book went
viral pretty much straightaway, is still in print after more than a century,
and has sold over a million copies. “Terrifying names”: not something you’d expect
to read on the cover of a calculus book. It’s an unusual book in many ways and
the way calculus is explained in this book is also quite different from the
way it is explained in regular textbooks both then and now. I’ll put a link to
an online version in the description.
Well, I’ve been explaining calculus
myself for the better part of my life and what I’d like to do today is to give you my
own version of calculus made easy in this video. Of course, there are lots of calculus
videos out there. But as usual the aim of a Mathologer video on
something as done to death as calculus is to present a fresh and optimal take
that both novices and experts can enjoy. Okay, so here is what I’ve got planned for you.
In the first part of the video I’ll show you that your car is actually a calculus
machine and I’ll show you how you can perform those two terrifyingly named methods of
calculus, differential and integral calculus, by simply driving around and repurposing
the speedometer and odometer of your car. In the second part of the video we’ll find
out that the core of differential calculus is such a no-brainer that you can teach
it to a monkey. Pretty miraculous, really. On the other hand, the core of integral
calculus is far from being monkey see monkey do. Still, there is lots of easy
and powerful stuff there, too, which we get by putting differential calculus in
reverse. We’ll do that in part three of the video. And I’ll finish off with something I’ve been
meaning to put together for a long long time, a five minute animation in
which I derive all the most important formulas of calculus,
along the same lines as it is done in that amazing book that I just mentioned.
Five minutes for everything important! This miracle is made possible by an
ingenious way to capture calculus in symbols, courtesy of the great Gottfried Wilhelm
Leibniz, one of the inventors of calculus. That’s all that dy/dx stuff,
infinitesimals, but on steroids. This video is by special request
from my kids Lara and Karl and is therefore also dedicated to them. Are
you ready for a wild ride? Okay, then let’s go. Hmm, wild ride. Well, let’s imagine that we are cruising along in the Mathologer
mobile on the autobahn in Germany. Let’s have a look at the speedometer. Speed going up, speed going down and
speed going through the roof. But, remember, no speed limit on
the German Autobahn and so, no, 260 km/h is not a problem :) Okay, at some
point let’s start plotting speed vs time. There, speed going up, down and through the roof.
Now, how can we translate this speed-time diagram into a distance-time diagram which records how
far we have travelled since we started recording? A very natural thing to ask, right? Basically, what we are asking is this: how can we translate
the speedometer reading into the odometer reading? Obviously, this translation is hardwired into our
car, but how does this translation actually work? If you've never seen this before, I
am not going to blame you if you say, “No idea”. Well, to get some feel for what's
going on, let's look at the simplest case, where we are cruising along at a constant speed v. Now constant speed corresponds to a horizontal
line at height v in our speed/time diagram. There, constant speed. In the case of this
simple speed-time diagram, we actually know what the corresponding
distance-time diagram is. It’s given by the kindergarten formula
distance equals speed times time. And so the distance-time diagram we
are interested in looks like this, with v being the slope of the blue line. Easy peasy. And so the distance travelled at
time t is just the length of this yellow segment. On the other hand, and this is a crucial insight, our distance is also equal to the
“area” of this orange rectangle. Wait, what? Well, the rectangle is v high and t wide and so its area is v·t
which is equal to the distance. So, what’s the answer to our original question?
How do you extract the distance travelled above from the speed/time diagram below? Well, as we’ve seen, in the case
of constant speed the answer is: The distance travelled is
simply the area under the curve. Nice. And, actually, it turns out that this is
true in general. Take any speed/time diagram, then the distance travelled is
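just the area under the curve. What about the other way around: how do you extract the speed below from the distance-time diagram above? Well, what’s the answer in the simple
straight line case that we just considered? As we already said, here the speed at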
time t is simply the slope of the line. Which of course is the same for all t. What about in general? What’s the speed at time t? Well, unlike a line, a general curve
like this does not have just one slope. Its slope is different at different times.
In fact, at time t the slope of the curve is equal to the slope of the tangent line, the
line that touches the curve at this point. And, of course, the slope of the
tangent is different at different points. Anyway, to go from top to bottom we
simply calculate the slope. Also neat. To summarise, the diagram at the
top records the odometer reading of my car with the odometer set
to zero when we start recording. On the other hand, the diagram at the bottom
records the speedometer reading over time. Also the Mathologer mobile is a vintage sports
car with a mechanical odometer which winds backwards when the car moves in reverse.
And, if I just move backwards beyond 0, the place where I started recording, I’ll
record distance as negative distance. There … Also, when the Mathologer mobile moves in reverse, speed is recorded as a negative
number. There, something like this. With these recording conventions
locked in, I can at least in theory create any shape function at the bottom or at the
top by suitably moving my car. Neat, right? And by doing so I can perform some really impressive
mathematical feats. To give you an example of one such feat, I could move my car in such a way
that the function at the bottom is t squared. There, the red curve, that’s t squared, and the
blue curve records the corresponding distance. But then I can calculate this area here under t
squared, by simply stopping the car at time t. Then the odometer reading
at this point is the area. How amazing is that?
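Here is a rough way to imitate that odometer on a computer. This is only a sketch under my own assumptions: the function names `odometer_reading` and `speed_t_squared`, the number of time slices, and the choice to stop at t = 1 are all made up for illustration. The idea is the same as with the orange rectangle: over a very short time slice the speed is nearly constant, so the distance added is speed times the length of the slice.

```python
def odometer_reading(speed, t_stop, steps=100_000):
    """Approximate the distance travelled from time 0 to t_stop.

    The drive is cut into many short time slices. In each slice the
    speed is sampled at the midpoint, and speed * dt (a thin rectangle
    under the speed curve) is added to the running total.
    """
    dt = t_stop / steps
    distance = 0.0
    for i in range(steps):
        t_mid = (i + 0.5) * dt
        distance += speed(t_mid) * dt
    return distance


def speed_t_squared(t):
    # The speed-time curve from the example: speed equals t squared.
    return t ** 2


# Stop the (imaginary) car at t = 1 and read the odometer.
reading = odometer_reading(speed_t_squared, 1.0)
print(f"odometer reading at t = 1: {reading:.6f}")
```

The printed value should land very close to one third, the area under t squared between 0 and 1. Any other speed function can be passed in the same way.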
Calculating areas with your car. The first precise calculation of this
pretty complicated area under t squared is essentially due to Archimedes and at that
point in history amounted to a big discovery. But with our set-up we can do much more than just
calculate the area under one curve. In theory this applies to any curve whatsoever. Very powerful
:) So the things we are playing around with here appear to be useful beyond flipping
between speed and distance travelled. Here is another example of a nice
application of our game. Let’s trace another curve, but this time one at the top. Now when I move my car so that the distance
travelled is what’s shown in this diagram the corresponding speed
diagram below looks like this. The top curve may actually have come from a
process for which it is important to identify where exactly the peak and the valley, the
maximum and the minimum in the diagram occur. Our setup can simplify this task.
Notice that the peak in the top diagram corresponds to a zero of the function below. Same with the valley. Why is that? Well, because the slopes of
the horizontal tangents at these special points are equal to zero. In other words,
my car won’t be moving at those times. Translating functions at the top into those
at the bottom and using this translation to do things like finding the peaks and
valleys of the top curves by simply finding the zeros of the bottom curves,
that’s called differential calculus, the first terrifying thing that Silvanus P.
Thompson mentioned on the cover of his book. And our kindergarten formula up there nicely captures how exactly we translate
between the top and the bottom, between the distance and the speed. Right?
Distance = speed times time. One thing TIMES another, that’s area: our distance is the area
under our speed curve! On the other hand, speed = distance divided by time. One thing DIVIDED by another, that’s slope: our
speed is the slope of our distance graph. Again, distance is the area under the speed graph and speed is the slope of the distance graph. Top to bottom: slope. Bottom to top: area. Top to
bottom: slope. Bottom to top: area. Burn this into your memory. This amazing relationship goes by the
grand name of The Fundamental Theorem of Calculus. Fundamental as in “The most
important thing in calculus”, “the soul of calculus”. And it really
is. If you’ve followed everything so far you are now entitled to say “I
know calculus”. Well, sort of :) Anyway, none of what I said so far
was terribly terrifying, was it? But of course there is a bit more to calculus. In
particular, for all of this to be really useful, we need to be able to actually
perform those translations, preferably without having to worry about speed
limits, traffic lights and idiot drivers. Right? If the function on top is sine what’s the one
at the bottom? How can we figure that out? Well, it turns out that differential
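One brute-force way to do the top-to-bottom translation without a car is to estimate slopes numerically from very short secant lines. The following is only a sketch: the distance curve, the helper names `slope` and `find_zero`, and the step size are all made up for illustration. It also replays the peak-and-valley trick from before: the peak and the valley of the distance curve show up as zeros of the estimated speed.

```python
def slope(f, t, h=1e-5):
    """Estimate the slope of f at t from a short secant line centred on t."""
    return (f(t + h) - f(t - h)) / (2 * h)


def distance(t):
    # Hypothetical distance-time curve, chosen only for illustration.
    return t ** 3 - 6 * t ** 2 + 9 * t


def speed(t):
    # Top to bottom: the speed is the slope of the distance curve.
    return slope(distance, t)


def find_zero(g, lo, hi, iterations=60):
    """Bisection: assumes g(lo) and g(hi) have opposite signs."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if (g(lo) > 0) == (g(mid) > 0):
            lo = mid   # the sign change is in the right half
        else:
            hi = mid   # the sign change is in the left half
    return (lo + hi) / 2


# The speed is positive at t = 0, negative at t = 2, positive at t = 4,
# so there is one zero in each of the two intervals.
peak_time = find_zero(speed, 0.0, 2.0)
valley_time = find_zero(speed, 2.0, 4.0)
print(f"peak of the distance curve near t = {peak_time:.4f}")
print(f"valley of the distance curve near t = {valley_time:.4f}")
```

For this particular curve the peak sits at t = 1 and the valley at t = 3. Numerical slopes like these are only approximations, which is one reason the exact rules are worth having.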
calculus, to go from top to bottom, is really super easy and pretty
for a vast assortment of functions. This includes all our favourite functions
like the powers, the exponential functions, the trigonometric functions, and
so on. Let’s have a close look. Okay, let’s get away from cars and label
the axes at the top and bottom x and y. So all the functions we’ll be talking about are in
the variable x, as is usually the case in school. Second, starting with a function f at the top, the corresponding function at the bottom
is called the derivative of f, or f prime. As well, the process of translating the
function on top into the one at the bottom is called differentiating. Remember, differential calculus? Now let’s start by making the
function f one of our favourite functions. There, f might be a constant function or x to
some power or one of the trigonometric functions, and so on. What are the derivatives
of these functions? Well, here you go. There, the derivative of a constant function,
that is, its slope, is 0, obviously. The derivative of some power of x is
basically x to one less than that power, times the original power. Neat. For example, for n=5, we get this. There, the derivative of x to the power of 5
is 5 times x to the power of 4. 5, 4, one less. The derivative of sine turns out to be cosine. And the derivative of cosine
is minus sine. Very pretty. Here is something quite surprising.
The complicated natural logarithm function has the simple 1/x as its derivative. The exponential function is its own derivative.
Also super nifty. As I said, we’ll actually derive all these derivatives as part of the animation
at the end of the video. An important observation here is that essentially ALL functions in our list
have derivatives that are also part of our list, with a constant factor thrown
in the mix perhaps. Right, we mostly see the same stuff at
the top and at the bottom. Great. Next, starting with the functions in our list we can build lots and lots and lots of
more complicated functions by adding, subtracting, multiplying and dividing, AND
substituting one function into another. Of course in calculus there are many other
important more complicated ways to make up new functions from old ones. For example,
by constructing the inverses of functions. But let’s not worry about any of those other
ways for the moment and stick with the basics. To make new functions from old
ones, for starters we’ll only add, subtract, multiply, divide and substitute. Cool. Here is an example. Those basic atom functions up there, plus the
complicated functions that can be constructed like this from them, are called the elementary
functions. Yeah, I know, that thing over there doesn’t look very elementary but it’s really
elementary in the sense that it’s built from our atom functions using only those five
elementary ways of combining functions. Now, one of the main reasons why calculus
is so incredibly beautiful and useful is that the derivative of every elementary
function is also an elementary function PLUS, and that’s really the killer, it’s not hard at
all to find these derivatives, not harder than basic algebra. You can teach a monkey to find the
derivative of an elementary function. Why is that? Well, let’s say you multiply two functions. I’ll show you in the animation at the end
that the derivative of this product is this. This looks a bit noisy, so
let’s strip out the x’s. Much nicer, right? Anyway, the two functions
f and g and their derivatives f prime and g prime are combined using two of our
basic operations, plus and times. And, so, if these four
functions are all elementary, then the “summy producty” combination of these
four elementary functions on the right side is also elementary. Right? Again, if f and
g and their derivatives are elementary, then the derivative of the product
f times g is also elementary. The same is true for the derivatives of
the sum, the difference and quotient of two functions and for the derivative of the
substitution of one function into the other. Here are the corresponding formulas or rules. There plus, minus, times, divided and one function
subbed into another. Okay, now let me show you how all this translates into any elementary function
having an elementary derivative and how to find this derivative. For this let’s first make up
another elementary function from these four atoms. First we multiply the last two functions together, then we add the first two, and finally we divide the function on
the left by the function on the right. So, we used three operations to make this
elementary function: first a multiplication, then an addition, then a division. Now we want to find the
derivative of this function. For this we’ll use the rules that correspond to these three operations in reverse
order. So first the quotient rule, then the sum rule and finally the product rule. And as we are doing this, we’ll also feed in the
derivatives of the four atoms we started with whenever we come across them. And
once the last atom has been processed, we’ll be finished. Really completely automatic.
Let me show you an example. Pour yourself a cup of chocolate and enjoy the algebra
autopilot and the funky music :) Can you see how this works in general? The
output is definitely complicated, but, and that’s important, you really just have to follow
your nose to get to it. Completely automatic. So, we start with an elementary
function and by applying our rules, which only involve elementary operations, we generated a sequence of elementary expressions
culminating in the elementary derivative. As I said you can teach a monkey
to do differential calculus. Remember what all this is good for?
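To make the “completely automatic” claim concrete, here is a toy differentiator that does nothing but look up rules. It is a minimal sketch under my own assumptions: the tuple encoding of expressions, the names `diff`, `evaluate`, `const` and `X`, and the sample function are all invented for illustration, and no attempt is made to simplify the output.

```python
import math

# Expressions are nested tuples, e.g. ("mul", ("sin", X), ("exp", X)).
X = ("x",)


def const(c):
    return ("const", float(c))


def diff(e):
    """Differentiate expression e with respect to x by rule lookup alone."""
    op = e[0]
    if op == "const":
        return const(0)
    if op == "x":
        return const(1)
    if op in ("add", "sub"):          # sum and difference rules
        return (op, diff(e[1]), diff(e[2]))
    if op == "mul":                   # product rule: f'g + fg'
        f, g = e[1], e[2]
        return ("add", ("mul", diff(f), g), ("mul", f, diff(g)))
    if op == "div":                   # quotient rule: (f'g - fg') / g^2
        f, g = e[1], e[2]
        top = ("sub", ("mul", diff(f), g), ("mul", f, diff(g)))
        return ("div", top, ("mul", g, g))
    if op not in ("pow", "sin", "cos", "exp", "ln"):
        raise ValueError(f"unknown operation: {op}")
    # Atom with u substituted in: chain rule, outer'(u) times u'.
    u = e[1]
    if op == "pow":
        outer = ("mul", const(e[2]), ("pow", u, e[2] - 1))
    elif op == "sin":
        outer = ("cos", u)
    elif op == "cos":
        outer = ("sub", const(0), ("sin", u))
    elif op == "exp":
        outer = ("exp", u)
    else:                             # ln
        outer = ("div", const(1), u)
    return ("mul", outer, diff(u))


def evaluate(e, x):
    """Compute the numerical value of expression e at the point x."""
    op = e[0]
    if op == "const":
        return e[1]
    if op == "x":
        return x
    if op == "pow":
        return evaluate(e[1], x) ** e[2]
    if op in ("sin", "cos", "exp", "ln"):
        fn = {"sin": math.sin, "cos": math.cos, "exp": math.exp, "ln": math.log}[op]
        return fn(evaluate(e[1], x))
    a, b = evaluate(e[1], x), evaluate(e[2], x)
    if op == "add":
        return a + b
    if op == "sub":
        return a - b
    if op == "mul":
        return a * b
    if op == "div":
        return a / b
    raise ValueError(f"unknown operation: {op}")


# Hypothetical example in the same shape as above: (x^2 + sin x) / (e^x * cos x).
expr = ("div", ("add", ("pow", X, 2), ("sin", X)), ("mul", ("exp", X), ("cos", X)))
x0, h = 0.7, 1e-6
by_rules = evaluate(diff(expr), x0)
by_secant = (evaluate(expr, x0 + h) - evaluate(expr, x0 - h)) / (2 * h)
print(f"slope by rules: {by_rules:.6f}, slope by secant line: {by_secant:.6f}")
```

The rules are applied from the outermost operation inwards (quotient, then sum and product, then the atoms), which matches the “reverse order” recipe. The two printed slopes should agree closely, even though the program never “understands” the function.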
Differentiating distance as a function of time gives speed as a function of time.
Differentiate again you get acceleration. If you are faced with a ferocious function in
the wild, you can reduce the task of finding the maxima and minima of the function to finding
the zeros of the derivative, and much, much more. Very useful and very powerful stuff. Okay, so once I’ve actually proved to
you that the derivatives of our atoms and those rules for finding derivatives are what
I claimed they are, then differential calculus, to go from the top to the bottom,
looks pretty much under control. How about integral calculus,
going from bottom to top? All that cute area finding business? Well, using
our fundamental theorem of calculus you partly get this for free. What? How? Well, let’s say
the function at the bottom is really x squared. What’s the function at the top? Well, easy.
We just have to find the anti-derivative of x squared, that is, the function whose
derivative is x squared. For that let’s have a look at the list of derivatives of our
basic building blocks. Maybe we get lucky. We’ve been reading this list from top to
bottom. But, of course we can also read it from bottom to top, right? So is there an
x squared among the entries at the bottom? Well, yes, right there. If we choose n=3 we get an x squared in the red. 3 x squared, almost what we want.
Well, to get x squared just divide by 3 both at the top and at the bottom
(yes, don’t worry, we can do that :) Okay, so the anti-derivative
of x squared is 1/3 x cubed. And so, if, for example, we are then
interested in the area under x squared between 0 and 1 , that area is
just 1/3 x^3 evaluated at 1, so 1/3 times 1 cubed, that’s just 1/3. In other words
this area is just 1/3 the area of this 1×1 square. Super pretty and also pretty surprising
the first time you see this. Why would such a complicated area have
such a simple value? Anyway, important insight:
reading from the bottom up our table of derivatives immediately gives us the
anti-derivatives of many key functions for free. That’s really nice, isn’t it. And super useful. But did you notice one or two little bumps
in the road? No? These are really minor bumps but at the same time very important
ones to smooth out. So let’s do that now. What’s bump number 1? Well let’s
look at the first entry of our list. What’s bumpy about that? Can you see
it? Well, the derivative of any constant function is 0. But this means that EVERY
constant function is an anti-derivative of the 0 constant function. 0 does not just have
one anti-derivative, it has infinitely many! In fact, just like the 0 constant function
has these infinitely many anti-derivatives so does every function. That’s actually
pretty obvious when you think about it. There, the blue function is an anti-derivative
of the red function. Again, what this means is that for all x values the slope at the
top is equal to the value at the bottom. But if the blue function has this property,
then so does every vertical translate of the blue function. Obvious, right? All of these
guys are anti-derivatives of the red function. They all share the same slopes at the same points. Again, the blue function is an anti-derivative of the red function. But so
is every vertical translate. Let’s also quickly check the algebra. For example, this entry says that the
derivative of sine is cosine. And translating up or down means adding
some constant to sine. So, can you see that what we see in front of us stays
true if we add a constant to sine? Obvious, right? Just unleash the sum rule which says that the derivative of a sum
is the sum of the derivatives. Tada, same derivative, nice :) Okay, that
was bump no 1, the fact that functions have infinitely many anti-derivatives and
that all these anti-derivatives are really all the same up to addition of constants,
just shifting up and down. Not a big deal. What about bump no. 2? Well, for the
anti-derivative area calculation up there to work, the top function has to be equal to 0 at x=0. Right? It’s got to be zero there. Why? Well,
if we move the right boundary of our area from 1 to 0, the area shrinks to 0 and so
the function on top should be 0 at 0. Okay. But now we have these other antiderivatives. How can we use one of these
antiderivatives to find the area? Not hard. We know that the area is
the length of the yellow segment. But that length is easy to calculate by evaluating
our new antiderivative at the left and right boundaries of our area. Right? The yellow length
is simply our anti-derivative evaluated at 1 minus the anti-derivative evaluated at 0. In fact, it’s easy to see that this even
works if the left boundary is not 0. There the area between 3/4 and 1 is simply the anti-derivative evaluated at 1 minus
the anti-derivative evaluated at 3/4. And that works for all anti-derivatives
and so also for the one we started with. And so the area here is this difference. Which happens to be … 37 over 192. And if your life
depends on figuring out this area, you’ll be super happy at this point :) Great! Anyway, reading our table from the bottom
up gives us the anti-derivatives of some important functions for free. Now, in theory,
we could get much much more by extending our table into a monster table that features all
infinitely many elementary functions at the top. And since all elementary functions
have elementary derivatives, the corresponding entries at the bottom
would also be elementary functions. And, remember, we are talking about infinitely
many entries at the top and so, who knows, maybe we actually also get ALL
elementary functions at the bottom. That would be great. Because if you then
challenge me to find the anti-derivative of some fiendishly difficult elementary function f, f for fiendish :),
then I could look up f at the bottom of my list and find its
anti-derivative right above. Easy, right? Well, there are a few “tiny”
problems with this approach. First, it turns out that there are actually
elementary functions that do not show up at the bottom of our list of derivatives. Like
that super famous function over there e^(-x^2), the function at the heart
of the normal distribution. Why not? f is just -x^2 substituted into the
exponential function. Basically one of our atoms substituted into another atom. Simple. No problem
differentiating this elementary function using our fifth rule, the so-called chain rule. Right,
a case for our monkey. However, there is no “elementary” counterpart of the chain rule when
it comes to finding antiderivatives. Bummer :) There are elementary counterparts
to the sum and difference rule, but not for the chain rule, the
product rule and the quotient rule. And it turns out that the absence of these
rules, translates into some elementary functions not showing up at the bottom of our list,
like our fiendish friend over there. In fact, given a randomly generated elementary
function it’s almost certain that it won’t appear at the bottom, that it won’t have an elementary
antiderivative. And there is another problem. Because of the absence of those three important
rules it is also not straightforward to determine which elementary functions do have
elementary antiderivatives and which don’t. For example, to prove that this super famous and
super important elementary function over there does not have an elementary anti-derivative
is crazy hard. And even that is not the end of our problems. Even if somebody tells you
that a certain elementary function has an elementary antiderivative it is usually not
straightforward to find this antiderivative. In any case, using that table up there in reverse
is incredibly useful and powerful in itself and while those problems we just stumbled
across are real problems there are also lots of tricks to overcome and work around
them. But those are topics for other videos. Okay, all in all, the elementary core of calculus
is this list of derivatives up there plus the rules for finding derivatives over
there. As a challenge for those of you who know a bit more, see what adjustments have
to be made to what I said so far if we allow taking inverses of functions as a sixth
operation to make new functions from old ones. Anyway, there is one more aspect to calculus
that really makes it unusually user friendly and that aspect is notation, the nifty way
in which we express calculus in symbols. This miracle notation was introduced by Gottfried
Wilhelm Leibniz and was further streamlined over the years. This notation allows us to consider
the core of calculus as a simple extension of school algebra. And, in many ways, the incredible
success of that 100-year-old calculus book lies in the way it uses Leibniz notation to derive
the rules of calculus and to perform calculus. And so to finish off let me quickly introduce
the most basic elements of Leibniz notation and demonstrate its power by replicating and adding
some nice twists to what’s done in the book: Derive everything here from scratch in 5 minutes. Here we go. Calculating the
derivative of a function at some point means calculating the slope
of this touching line. At first glance, it is not clear how we can
calculate this slope by looking at our function. BUT what’s easy to calculate is the slope of
a line that cuts the graph in a second point. And now, as you move the second point
towards the first one like this… …the slope of the line we are looking at, approaches the slope of the
touching line better and better. In this way we sneak up on the value of
the slope that we are really interested in. As usual, we calculate the slope as rise over
run. What’s the run? Well, some x increment. And the rise? Some f increment. As we push the x increment to 0,
the f increment also goes to 0. The limiting slope is what we are
after and we write it as df/dx. Now, in the first instance that df/dx is just
an abbreviation for the limiting process I just described and by themselves the df at the top and
the dx at the bottom don’t appear to have lives of their own. However, the limiting process turns
out to be such that to some extent we can actually calculate with these d-increments very much
as with other numbers and algebraic variables. And, by doing so, we can easily
get all our derivative rules. It’s natural to start with the simple
sum rule but I think it’s more fun and more impressive to take care of the more
complicated product rule straightaway. Okay, what’s that derivative? Well, in
terms of these weird d-increments it’s this. And what is the increment d(fg) on top? Well, starting with the product fg, as we
increment x by dx, f will increment by df. And g will increment by dg. And so the orange increment is just the
difference between these two products. Okay, now it’s just a matter of algebra autopilot.
Expand the product, and so on. Watch and wonder. Now remember in the limiting
process dg actually stands for the g increment going to 0 and so we can
finish our calculation like this. Tada, I present to you the product
rule. Very nice isn’t it :) And now, as promised, I’ll animate derivations of all the other rules of differential calculus and
the derivatives of our atom functions for you. Following this I end with some quick snapshots of
a couple of other instances of Leibniz notation working miracles that many of you will be
familiar with, but that I won’t get around to covering today. These snapshots also feature
that second main ingredient of Leibniz notation the integral sign, that elongated S, Leibniz’s
way to denote the anti-derivative. Enjoy. The derivative of one function
subbed into another. What’s that? As the variable x changes by dx, the function g changes by dg. And dg is equal to this. Right? Now, as g changes by dg, the
function f changes by df. And df? Well, on the one hand df is this. Obvious again, right, dg cancels. On the other
hand, df is really the total change we are after. Autopilot. Very pretty, isn’t it? Let’s chuck the primes in to put this formula in the
shape I showed you earlier.
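For reference, here is how the chain rule steps just narrated might look when written out. This is my reconstruction from the narration, using only the symbols mentioned there (f, g, x and their d-increments). The exact on-screen layout is an assumption.

```latex
\begin{align*}
dg &= \frac{dg}{dx}\,dx
   && \text{(change in $g$ as $x$ changes by $dx$; $dx$ cancels)} \\
df &= \frac{df}{dg}\,dg
   && \text{(change in $f$ as $g$ changes by $dg$; $dg$ cancels)} \\
df &= \frac{df}{dg}\cdot\frac{dg}{dx}\,dx
   && \text{(substitute the first line into the second)} \\
\frac{df}{dx} &= \frac{df}{dg}\cdot\frac{dg}{dx}
   && \text{(divide by $dx$)} \\
\bigl(f(g(x))\bigr)' &= f'\bigl(g(x)\bigr)\,g'(x)
   && \text{(the same formula written with primes)}
\end{align*}
```

The last line is the rule for one function subbed into another, in the prime notation used for the other rules.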