Earlier in this modern physics series, we
touched upon some concepts in quantum mechanics in an introductory and strictly qualitative
manner. For most people, this would be more than sufficient, because quantum mechanics
is very difficult to understand, as it hinges entirely on the ability to apply fairly advanced
principles of mathematics. But for those of us who wish to attain a more sophisticated
understanding of this area of physics, we have no choice but to dive into the math,
because quantum mechanics is math. Anyone who says otherwise either doesn’t understand
it, or is trying to sell you something, often both. This rigorously mathematical approach
will now be possible, after having covered linear algebra at great length in my mathematics
series, as well as some concepts regarding differential equations. If you’re up to
speed with these areas of math, and you want to upgrade your understanding of quantum mechanics,
the next handful of tutorials are going to be right up your alley. Together we will derive
the equations of quantum mechanics and try to understand what they tell us about quantum
systems, and by extension, reality. If you are not up to speed on these areas
of math, you have three options. The first option is to visit my mathematics playlist,
scroll down and read the titles until you identify the limit of your current mathematical
comprehension, and then proceed to view the playlist from that point forward in order,
making sure to get through the topics we mentioned. The second option, if that sounds far too
daunting, is to forgo the mathematical prerequisites and watch these tutorials anyway. Rest assured,
they will look and sound like complete gibberish, but you may still get something out of them,
and I can’t tell you how to live your life. Finally, the third option is to simply ignore
these tutorials. Not everyone has to understand quantum mechanics, and in truth, only a minuscule
percentage of people really do. It is far more important for non-physicists to understand
Newtonian mechanics, electricity and magnetism, or other subfields of physics that are more
readily applicable to our everyday experience. There are many tutorials in my classical physics
series that cover these topics, and I highly recommend them if you are interested in learning
aspects of physics that are a bit more tangible and comprehensible. However, if you find yourself
overwhelmed with curiosity regarding the quantum world, limiting yourself to only a qualitative
discussion does a deep disservice to the field, so these next few tutorials will be here for
anyone who wants to take it upon themselves to do the heavy lifting, and really dig into
the math. The end result will be a healthy deconstruction of a field of science that
has been overloaded with mysticism by popular media, which never bodes well for public science
literacy. So if you’ve decided you’re on board the quantum train, let’s get started.
To begin, let’s quickly reiterate a key distinction that must be made when considering
classical systems versus quantum systems. When we model the behavior of classical systems
of particles, we usually ask questions like: Where is some particle located? Where will
it be some specific time from now? How fast is it going? These are questions we asked
and answered in the classical physics series, when examining Newtonian mechanics as it applies
to macroscopic objects and events, such as when throwing a ball. In such cases, we would
define the position of the object with the variable x. Having learned the basics of
differential calculus, we can regard velocity as the derivative of position with respect
to time, or v equals dx over dt, and acceleration as the derivative of velocity with respect
to time, or a equals dv over dt. We can discuss the momentum of that object using p = mv,
where momentum equals mass times velocity. From here, Newton’s famous equation, F = ma,
describes the acceleration exhibited by an object as a function of the force applied
upon it, which will be inversely proportional to its mass. This equation could be used to
predict the precise position and velocity of a classical particle at any time, provided
that we know the initial conditions. This relationship between force and acceleration
is so powerful that we still use it in lots of complicated calculations today. And if
we want to know any of these dynamical variables for a classical particle or system, we can
take our cameras, or whatever apparatus we are using to take measurements, and rigidly
determine both the position, x, and momentum, p, simultaneously, such that we can write
their values down and use them to do calculations. And there it is. That’s classical mechanics
in a nutshell. However, as we touched upon in previous tutorials,
quantum particles, or the teeny tiny particles that are smaller than an atom, operate under
a different set of rules. We don’t have “quantum cameras” or any other such measuring
device that can examine a quantum particle such as an electron and give us a number for
both x and p simultaneously. Instead, all measurements must satisfy the Heisenberg uncertainty
principle, which in the case of the complementary variables position and momentum, is written
as delta x times delta p is greater than or equal to h bar over two, where h bar is the
reduced Planck constant, equal to h over two pi, making this term as a whole equal to h
over four pi. What this means is that we can’t know both the position and momentum of such
a particle simultaneously with arbitrary precision. There must be some uncertainty associated
with one or both parameters, and the more certain one is, the more uncertain the other
becomes. So why is this the case, why does such a limit exist? Why is this the way we
must approach the description of quantum systems? The way we approached this question when we
were coming at it from a conceptual standpoint was to talk about wave-particle duality. A
quantum particle like an electron is not just a particle. It is also a wave. If it were
to possess both a discrete position and momentum simultaneously, it could be regarded exclusively
as a particle. But it is not just a particle, so it does not possess precise values for
these parameters at once. It simply is not in its nature. But we have gone over this
before. Now to enhance our understanding, what is the reasoning from a purely mathematical
standpoint? Well to start, in quantum mechanics, position
and momentum are not just numbers, they are linear operators. Operators are mathematical
objects that act upon functions, and result in the production of other functions as an
output, similar to the way that functions act upon values to produce other values. Linear operators
must satisfy the following two properties. First, operator A acting on the product of
the constant a and the function f of x equals the constant a times the operator A acting
on the function f of x. Second, operator A acting on the sum of functions f of x and
g of x equals operator A acting on f of x plus operator A acting on g of x. We learned
about these relationships in the linear algebra portion of the mathematics series, where we
saw that linear operators, written as matrices, can act on vectors to produce other vectors. This is exactly
what we are dealing with here. But why do we use operators when we deal with
quantum particles and why don’t we use them when we deal with classical particles? The
reason is that classical particles are macroscopic objects, meaning that they are much larger
than an atom. When you do the math for a classical object in motion, its properties, such as
position and momentum, have well-defined values that you can measure. We can say that some
particle is at this position, and it has this momentum. As we mentioned, quantum particles
don’t behave like this. They are in several places at the same time, at all times, and
we need to represent this mathematically somehow. That’s why we have the wavefunction, which
is a mathematical description of an isolated quantum system, given by the Greek letter
psi. This gives us an idea of the distributed presence of a quantum particle. It describes
the state of the particle as a superposition of all possible states. In other words, it
is the mathematical description of the physical reality of the particle being in several places
at the same time, before measurement, and its absolute value squared represents a probability
density function. To be as formal as possible, the probability density function
P of x equals the product psi conjugate times psi, which equals the modulus squared of psi
of x, where the term modulus refers to the square root of a complex number Z times its
complex conjugate, Z star. In case we are rusty with our complex numbers, recall that
they always have a real part and an imaginary part, the latter of which includes i, which
represents the square root of negative one. And if we are dealing with a complex wavefunction,
of the kind that normally describes oscillating systems, it is difficult to interpret psi of x as a
real or physical quantity. That is why we utilize the complex conjugate, which is the
same expression but with the sign of the imaginary part reversed. When we multiply a complex
number by its complex conjugate, we always get a real result, because we end up with
an i squared term which simplifies to negative one. The same concept applies here, if we
take psi conjugate times psi, we necessarily get a real number, which can tell us something
about physical reality. So essentially, the squared magnitude of the
wavefunction is the probability density function that helps us find where we have a chance
of measuring the particle, which is the kind of math that was done to determine the shapes
of the atomic orbitals that electrons inhabit within an atom, which we learned about in
the general chemistry series. Getting back to the question at hand, because
we have to represent quantum objects with the wavefunction, we need operators that can
act on the wavefunction to retrieve information encoded within, and can provide answers in
the form of measurements. Later we will expand considerably on what a wavefunction is. For
now let’s talk about how the actions of operators on wavefunctions can give us information
about quantum objects. Mathematically, if we have a function psi
of x, when an operator A acts upon it, we get another function, phi of x. Again, this
is precisely the way that when a function f acts on a quantity x we get another quantity
y, except that operators act on functions. Here are some examples of operators to help
make this concept perfectly clear. For this first one, operator one on psi equals the
partial derivative of psi with respect to time. Next, operator two on psi equals x times
the partial derivative of psi with respect to x. And lastly, operator three on psi equals
alpha times psi, where alpha is a constant. Now let’s try applying these operators to
a specific function, such as the following wavefunction, psi of x and t equals e raised to the power
of i times the quantity kx minus omega t. If you are familiar with the physics of oscillations,
we usually refer to this type of wave as a plane wave, because the wavefront, which refers
to the spatial distribution of, for instance, light, is flat. In this equation, k is the
wave-vector, equal to two pi over lambda, where lambda is wavelength. The wave-vector describes the
spatial frequency of the wave. In other words, it tells us how many peaks and valleys the
wave has over a certain length. Then, x is the position. And omega is the angular frequency,
equal to two pi nu, where nu is the frequency of the wave. Now let’s apply the three operators
we listed previously. The first operator takes the partial derivative of psi with respect
to time. If this seems daunting, you may need to refresh your memory regarding partial derivatives,
as well as the chain rule for differentiation, and in particular how it applies to functions
resembling e to the x. But if this is familiar to you, then recall that d over dt means that
t is the variable. The other parameters will be treated as constants, and realizing that
i distributes across this parenthetical, it is negative i times omega that will be pulled
down in front of the term, and since the term equivalent to psi is unchanged, we can report
this as negative i omega psi. The second operator is x times the partial derivative of psi with
respect to x. This time it is i k that will be pulled down in front of the term when differentiating,
leaving us with i k x psi. And finally, the third operator just multiplies psi by the
constant alpha, so we end up with alpha psi. So now we know how to apply operators to functions.
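If you would like to check these three results numerically, here is a minimal Python sketch. The values chosen for k, omega, and alpha, the sample point, and all of the function names are assumptions made purely for illustration. The derivatives are approximated with a central finite difference, so the agreement is only up to a small numerical error.

```python
import cmath

# Illustrative values only; none of these numbers come from the tutorial.
K = 2.0        # wave-vector k
OMEGA = 3.0    # angular frequency omega
ALPHA = 0.5    # the constant alpha used by operator three
STEP = 1e-6    # finite-difference step size


def psi(x, t):
    """Plane wave: e^(i(kx - omega t))."""
    return cmath.exp(1j * (K * x - OMEGA * t))


def op_one(f, x, t):
    """Operator one: partial derivative with respect to t (central difference)."""
    return (f(x, t + STEP) - f(x, t - STEP)) / (2 * STEP)


def op_two(f, x, t):
    """Operator two: x times the partial derivative with respect to x."""
    return x * (f(x + STEP, t) - f(x - STEP, t)) / (2 * STEP)


def op_three(f, x, t):
    """Operator three: multiply by the constant alpha."""
    return ALPHA * f(x, t)


x0, t0 = 0.7, 0.2  # arbitrary sample point
value = psi(x0, t0)

# Differences between each numerical result and the expected closed form.
err_one = abs(op_one(psi, x0, t0) - (-1j * OMEGA * value))
err_two = abs(op_two(psi, x0, t0) - (1j * K * x0 * value))
err_three = abs(op_three(psi, x0, t0) - ALPHA * value)

print(err_one, err_two, err_three)  # all three should be close to zero
```

Each printed number is the gap between what the operator produced and the answer we worked out by hand, so small values mean the two agree.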
Now, as we said, in quantum mechanics, position and momentum are operators. The position operator
in one dimension can be written thusly, as the position operator x times psi of x equals
x times psi of x, where x is the eigenvalue of the operator x acting on psi of x. We talked
about eigenvalues and eigenvectors at great length in one of the linear algebra tutorials,
and it is quite important to understand what these are, so visit that tutorial if you need
a refresher on this terminology. Also be sure to notice the difference in notation between
the operator x and the variable x, as these mean different things.
The momentum operator is a bit more complicated. Here, the momentum operator p acting on psi
of x equals negative i times h bar times the partial derivative of psi of x with respect
to x. We use a partial derivative here for two reasons. First, the wavefunction, as we
will see later, also depends on time, which means we should actually write it as psi of
x and t. The second reason is that it is defined in three dimensions, where the complete form
could be written as psi of x, y, z, and t. Therefore, if we want to study the dynamics
of a quantum particle in three dimensions, we’ll need these three operators. These
are the momentum operator in x, in y, and in z, each containing a partial derivative
with respect to that particular variable. These equations display the orthogonal components
of a three-dimensional system. However, for the time being, we will focus only on one-dimensional
problems. Now, we easily identified the eigenvalue of
the position operator, but it’s not so obvious in these expressions for the momentum operator.
To find an intuitive answer, we can apply what is referred to as the de Broglie principle,
and write our quantum object in its waveform, psi equals e raised to the power of i times the quantity kx minus omega t, where again k
is the wave-vector, omega is angular frequency, and t is time. If we now apply the momentum operator,
we see that negative i h bar goes out front, and then we take the partial derivative of
psi with respect to x. Since we have ikx in the exponent, ik must come down in front of
the complex exponential, and the rest remains the same. The two versions of i with opposing
sign cancel one another out because negative i times i equals one, and we can express this
section simply as psi again, which means we get h bar times k times psi. Therefore, we
see that the eigenvalue of the momentum operator for the wavefunction we chose is h bar times k. This is actually another way of expressing
the de Broglie relation, and it will be worth our while to derive this result starting from
here as well. The relation states that lambda equals h over p, where lambda is wavelength,
h is the Planck constant, and p is momentum. We can rearrange to get p equals h over lambda.
Then let’s take the definition of the wave-vector, k equals two pi over lambda, and rearrange
to solve for lambda, which is two pi over k. Plugging this in for lambda in the other
expression, we get p equals h over the quantity two pi over k, and simplifying gives us hk
over two pi. We know that h over two pi can be expressed as h bar, so we are left with
p equals h bar times k, just as we found through the previous method.
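Written out in symbols, the two routes to this result look as follows. The notation is our own choice for typesetting what was just described: a hat marks the momentum operator, and the plane wave is written as e to the i times the quantity kx minus omega t.

```latex
\begin{align*}
\hat{p}\,\psi &= -i\hbar\,\frac{\partial}{\partial x}\,e^{i(kx-\omega t)}
              = -i\hbar\,(ik)\,e^{i(kx-\omega t)}
              = \hbar k\,\psi \\[4pt]
p &= \frac{h}{\lambda}, \qquad \lambda = \frac{2\pi}{k}
  \quad\Longrightarrow\quad
  p = \frac{h}{2\pi/k} = \frac{hk}{2\pi} = \hbar k
\end{align*}
```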
Now let’s pause for a moment. We will come back to these two important operators a bit
later. First, it will be a good idea to define some important operator properties, such that
we can apply them to operators found in quantum mechanics. Some basic rules to know when dealing
with operators are as follows. One. Operators can be defined as the sum of
other operators. For example, S equals A plus B. What this means is that the action of S
on psi is equal to the action of A on psi plus the action of B on psi.
Two. If an operator contains constants or complex numbers, these remain constants. So
with C, which equals beta times B, where beta is a complex number, beta remains a constant
when the operator acts on a function. So applying C to psi, we get beta times the action of
B on psi. Three. Operators that represent what we refer
to as observables are Hermitian, or self-adjoint. This is the fancy name for operators whose
eigenvalues are real numbers. Position and momentum are Hermitian operators, because
any measurement of these parameters must be a real number, and it makes sense intuitively
that we would refer to position and momentum as observables. Mathematically, we can easily
identify a Hermitian operator if it is equal to its own conjugate transpose, a relationship
which is represented symbolically as A equals A dagger. We talked about Hermitian matrices
in a linear algebra tutorial, so head over there if this seems a bit fuzzy. Otherwise,
just recall that if an operator is in matrix form, the conjugate transpose involves taking
the complex conjugate of every entry in the matrix, represented by a star, and then transposing it, represented with
a T, and these two operations combined are represented by the dagger symbol.
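To make the dagger operation concrete, here is a small Python sketch that takes the complex conjugate of every entry, transposes, and compares the result with the original matrix. The two example matrices and the helper names are assumptions chosen for illustration, and the eigenvalue helper only handles the two-by-two case, using the characteristic polynomial.

```python
import cmath


def dagger(m):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]


def is_hermitian(m, tol=1e-12):
    """True when the matrix equals its own conjugate transpose."""
    d = dagger(m)
    n = len(m)
    return all(abs(m[i][j] - d[i][j]) <= tol for i in range(n) for j in range(n))


def eigenvalues_2x2(m):
    """Roots of lambda^2 - trace * lambda + determinant = 0 (2x2 matrices only)."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return ((trace + root) / 2, (trace - root) / 2)


# Hypothetical example matrices, not taken from the tutorial.
A = [[2 + 0j, 1 - 1j],
     [1 + 1j, 3 + 0j]]   # off-diagonal entries are conjugates of each other
B = [[2 + 0j, 1 - 1j],
     [1 - 1j, 3 + 0j]]   # symmetric, but not equal to its own dagger

print(is_hermitian(A), eigenvalues_2x2(A))
print(is_hermitian(B), eigenvalues_2x2(B))
```

The first matrix passes the test and its eigenvalues have no imaginary part, while the second fails the test and its eigenvalues come out complex, which is why it could not represent an observable.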
And finally, four. Operators generally do not commute. This means that AB is not necessarily
equal to BA. Therefore, the order of operators must be followed strictly. If you encounter
an operator M which equals AB, if this operator is to act on psi, one should operate such
that M acting on psi will be equal to A acting on the result of B acting on psi, so B must act first. Now let’s quickly apply these rules to understand
a couple more key points. First, let’s examine the notion of the power of an operator. Say
we square the position operator x, and allow that to act upon psi. If we apply the rule
we just learned, we get this parenthetical term, so first we get x times psi, and then
the x operator acts again to give us another x, and we are left with x squared psi. This
will be important later when we get a closer look at the Schrodinger equation.
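The rules for sums, constants, ordering, and powers can be sketched in Python by treating an operator as something that takes a function and returns a new function. Everything here is an assumption made for illustration: we pick multiplication by x for A, a finite-difference derivative for B, an arbitrary complex constant for beta, and a simple test function that depends on x only.

```python
import cmath

K = 2.0       # illustrative wave-vector
STEP = 1e-5   # finite-difference step size
BETA = 2 - 1j # illustrative complex constant


def psi(x):
    """Test function e^(ikx); time is left out to keep the sketch short."""
    return cmath.exp(1j * K * x)


def A(f):
    """Operator A: multiply the function by x."""
    return lambda x: x * f(x)


def B(f):
    """Operator B: derivative with respect to x (central difference)."""
    return lambda x: (f(x + STEP) - f(x - STEP)) / (2 * STEP)


def add(op1, op2):
    """Rule one: (op1 + op2) acting on f is op1 f plus op2 f."""
    return lambda f: (lambda x: op1(f)(x) + op2(f)(x))


def scale(beta, op):
    """Rule two: the constant beta stays a constant out front."""
    return lambda f: (lambda x: beta * op(f)(x))


def product(op1, op2):
    """Rule four: in the product op1 op2, op2 acts first."""
    return lambda f: op1(op2(f))


S = add(A, B)
C = scale(BETA, B)
M = product(A, B)            # A acting on the result of B acting on psi
M_reversed = product(B, A)   # B acting on the result of A acting on psi
A_squared = product(A, A)    # the power of an operator: A applied twice

x0 = 0.7
print(M(psi)(x0), M_reversed(psi)(x0))   # the two orderings differ
print(A_squared(psi)(x0), x0**2 * psi(x0))
```

The first printed pair shows that swapping the order of A and B changes the answer, and the second pair shows that applying the multiply-by-x operator twice gives x squared times psi.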
And finally, we must mention a very important operator in quantum mechanics, the commutator.
We write the commutator of two operators A and B by putting them inside square brackets
separated by a comma, and the commutator of A and B is defined as the product AB
minus the product BA. Recall that since operators generally do not commute, AB is
not the same as BA, and therefore the commutator typically will not equal zero. It will only
equal zero in the special case that the operators A and B happen to commute.
This operator can act on some function just like any other operator, which means
the action of the commutator on psi equals A acting on B acting on psi minus B acting
on A acting on psi. To make sure we understand, let’s produce
the commutator for the two operators we have discussed so far, position and momentum, using
position in place of A and momentum in place of B. Then let’s have this commutator act
on the only wavefunction we’ve used so far. As we would expect, the action of the commutator
of position and momentum acting on psi equals x acting on p acting on psi minus p acting
on x acting on psi. Now to make things a little easier on ourselves,
let’s first just recall what happens when p and x individually act on the wavefunction.
When x acts on psi, we get x times psi. When p acts on psi, we get negative i times h bar
times the partial derivative of psi with respect to x. Now we can apply these definitions to
the expression we have now, one term at a time. In the first term, x acting on p acting
on psi, the momentum operator must act first. Since that involves taking the partial derivative
with respect to x, and i and k act as constants on x, we bring those down in front. Then combining
that with the negative i and the h bar, negative i times i gives us one, leaving us with
h bar times k times psi. Now x must act upon this, which just involves multiplying by x,
so the whole first term of the commutator will be x h bar k psi.
Now the second term of the commutator, p acting on x acting on psi, is a little trickier,
so let’s compute that separately. As you can see, first x must act on psi. That gives
us x times psi. Now p must act on the quantity x times psi, so we have negative i times h
bar, times the partial derivative of x times psi with respect to x. This time we must use
the product rule for differentiation, which we are familiar with from our study of calculus,
given that each term in this product contains x. This will be the derivative of the first
term times the second term, plus the first term times the derivative of the second term.
The derivative of the first term, x, is simply one, so that leaves us with just the second
term, psi. Then we have the first term, x, times the derivative of the second term with
respect to x, which brings i and k down here to give i k psi, so that’s i k x psi all
together. Now let’s distribute negative i h bar across this sum to get negative i
h bar psi plus h bar k x psi. Now remember, this whole expression is the second term in
the commutator, which must be subtracted from the first, so let’s put it in parentheses
so that we don’t make any careless errors with sign. And we see that the two x h bar k psi terms
cancel, leaving us with i h bar psi. So, what we are left with is the realization
that the commutator of position and momentum equals i times h bar. This is the fundamental
commutation relation between position and momentum. This relation, which is at the core
of quantum mechanics, is actually the source of the Heisenberg uncertainty principle. We
can see that position and momentum are conjugate operators, and they do not commute. Whenever
we encounter non-commuting operators, we will find a limit on the precision with which we
can simultaneously measure the physical quantities they represent, the implications of which
we discussed earlier in this modern physics series.
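The commutation relation can also be checked numerically. The sketch below is only an illustration under stated assumptions: h bar is set to one for convenience, the values of k and omega are arbitrary, the derivative inside the momentum operator is a central finite difference, and the function names are our own. A Gaussian is included as a second test function, which is not something we used above, simply to suggest that the result does not depend on the plane wave we happened to choose.

```python
import cmath

HBAR = 1.0    # assumption: units in which h bar equals one
K = 2.0       # illustrative wave-vector
OMEGA = 3.0   # illustrative angular frequency
STEP = 1e-5   # finite-difference step size


def psi(x, t=0.0):
    """Plane wave e^(i(kx - omega t)), evaluated at t = 0 by default."""
    return cmath.exp(1j * (K * x - OMEGA * t))


def gaussian(x):
    """Extra test function, not from the tutorial."""
    return cmath.exp(-x * x)


def x_op(f):
    """Position operator: multiply by x."""
    return lambda x: x * f(x)


def p_op(f):
    """Momentum operator: -i * hbar * d/dx (central difference)."""
    return lambda x: -1j * HBAR * (f(x + STEP) - f(x - STEP)) / (2 * STEP)


def commutator(op_a, op_b):
    """[A, B] acting on f: A(B f) minus B(A f)."""
    return lambda f: (lambda x: op_a(op_b(f))(x) - op_b(op_a(f))(x))


xp = commutator(x_op, p_op)
x0 = 0.7  # arbitrary sample point

lhs_wave = xp(psi)(x0)
rhs_wave = 1j * HBAR * psi(x0)
lhs_gauss = xp(gaussian)(x0)
rhs_gauss = 1j * HBAR * gaussian(x0)

print(abs(lhs_wave - rhs_wave), abs(lhs_gauss - rhs_gauss))  # both near zero
```

In both cases the commutator acting on the function returns i times h bar times that same function, up to the small error introduced by the finite difference.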
So with that, we understand the concept of an operator, some basic rules for applying
operators, and how to apply the position and momentum operators to the wavefunction. Now
it’s time to learn more about the wavefunction, so let’s move forward and do just that.