Earlier in this modern physics series, we
touched upon some concepts in quantum mechanics
in an introductory and strictly qualitative
manner. For most people, this would be more
than sufficient, because quantum mechanics
is very difficult to understand, as it hinges
entirely on the ability to apply fairly advanced
principles of mathematics. But for those of
us who wish to attain a more sophisticated
understanding of this area of physics, we
have no choice but to dive into the math,
because quantum mechanics is math. Anyone
who says otherwise either doesn’t understand
it, or is trying to sell you something, often
both. This rigorously mathematical approach
is now possible, since we have covered
linear algebra at great length in my mathematics
series, as well as some concepts regarding
differential equations. If you’re up to
speed with these areas of math, and you want
to upgrade your understanding of quantum mechanics,
the next handful of tutorials are going to
be right up your alley. Together we will derive
the equations of quantum mechanics and try
to understand what they tell us about quantum
systems, and by extension, reality.
If you are not up to speed on these areas
of math, you have three options. The first
option is to visit my mathematics playlist,
scroll down and read the titles until you
identify the limit of your current mathematical
comprehension, and then proceed to view the
playlist from that point forward in order,
making sure to get through the topics we mentioned.
The second option, if that sounds far too
daunting, is to forgo the mathematical prerequisites
and watch these tutorials anyway. Rest assured,
they will look and sound like complete gibberish,
but you may still get something out of it,
and I can’t tell you how to live your life.
Finally, the third option is to simply ignore
these tutorials. Not everyone has to understand
quantum mechanics, and in truth, only a minuscule
percentage of people really do. It is far
more important for non-physicists to understand
Newtonian mechanics, electricity and magnetism,
or other subfields of physics that are more
readily applicable to our everyday experience.
There are many tutorials in my classical physics
series that cover these topics, and I highly
recommend them if you are interested in learning
aspects of physics that are a bit more tangible
and comprehensible. However, if you find yourself
overwhelmed with curiosity regarding the quantum
world, limiting ourselves to only a qualitative
discussion does a deep disservice to the field,
so these next few tutorials will be here for
anyone who wants to take it upon themselves
to do the heavy lifting, and really dig into
the math. The end result will be a healthy
deconstruction of a field of science that
has been overloaded with mysticism by popular
media, which never bodes well for public science
literacy. So if you’ve decided you’re
on board the quantum train, let’s get started.
To begin, let’s quickly reiterate a key
distinction that must be made when considering
classical systems versus quantum systems.
When we model the behavior of classical systems
of particles, we usually ask questions like:
Where is some particle located? Where will
it be some specific time from now? How fast
is it going? These are questions we asked
and answered in the classical physics series,
when examining Newtonian mechanics as it applies
to macroscopic objects and events, such as
when throwing a ball. In such cases, we would
define the position of the object with the
variable, x. After learning the basics of
differential calculus, velocity can be regarded
as the derivative of position with respect
to time, or v equals dx over dt, and acceleration
as the derivative of velocity with respect
to time, or a equals dv over dt. We can discuss
the momentum of that object using p = mv,
where momentum equals mass times velocity.
From here, Newton’s famous equation, F = ma,
describes the acceleration exhibited by an
object as a function of the force applied
upon it, which will be inversely proportional
to its mass. This equation could be used to
predict the precise position and velocity
of a classical particle at any time, provided
that we know the initial conditions. This
relationship between force and acceleration
is so powerful that we still use it in many
complicated calculations today. And if
we want to know any of these dynamical variables
for a classical particle or system, we can
take our cameras, or whatever apparatus we
are using to take measurements, and rigidly
determine both the position, x, and momentum,
p, simultaneously, such that we can write
their values down and use them to do calculations.
And there it is. That’s classical mechanics
in a nutshell.
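As a rough illustration of that last point, here
is a minimal Python sketch of how F = ma fixes the
position and velocity at any later time once the
initial conditions are known. It assumes a constant
force, and the function name and the numbers for
the thrown ball are made up for this example rather
than taken from the series.

```python
def classical_state(x0, v0, force, mass, t):
    """Position and velocity at time t under a constant force (F = ma)."""
    a = force / mass                      # acceleration from Newton's second law
    v = v0 + a * t                        # integrate a = dv/dt
    x = x0 + v0 * t + 0.5 * a * t ** 2    # integrate v = dx/dt
    return x, v

# Hypothetical example: a 0.5 kg ball thrown straight up at 10 m/s.
mass = 0.5
x, v = classical_state(x0=0.0, v0=10.0, force=-9.8 * mass, mass=mass, t=1.0)
p = mass * v   # momentum, p = mv

print(f"x = {x:.2f} m, v = {v:.2f} m/s, p = {p:.2f} kg m/s")
```

Both x and p come out as definite numbers at the
same instant, which is exactly the situation that
will not carry over to quantum particles.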
However, as we touched upon in previous tutorials,
quantum particles, or the teeny tiny particles
that are smaller than an atom, operate under
a different set of rules. We don’t have
“quantum cameras” or any other such measuring
device that can examine a quantum particle
such as an electron and give us a number for
both x and p simultaneously. Instead, all
measurements must satisfy the Heisenberg uncertainty
principle, which in the case of the complementary
variables position and momentum, is written
as delta x times delta p is greater than or
equal to h bar over two, where h bar is the
reduced Planck constant, equal to h over two
pi, making this term as a whole equal to h
over four pi. What this means is that we can’t
know both the position and momentum of such
a particle simultaneously with supreme precision.
There must be some uncertainty associated
with one or both parameters, and the more
certain one is, the more uncertain the other
becomes. So why is this the case, why does
such a limit exist? Why is this the way we
must approach the description of quantum systems?
The way we approached this question when we
were coming at it from a conceptual standpoint
was to talk about wave-particle duality. A
quantum particle like an electron is not just
a particle. It is also a wave. If it were
to possess both a discrete position and momentum
simultaneously, it could be regarded exclusively
as a particle. But it is not just a particle,
so it does not possess precise values for
these parameters at once. It simply is not
in its nature. But we have gone over this
before. Now to enhance our understanding,
what is the reasoning from a purely mathematical
standpoint?
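Before turning to that, it may help to put a number
on the inequality quoted above. The following Python
sketch computes the smallest momentum uncertainty
allowed for a given position uncertainty. The
one-angstrom value, roughly the size of an atom, is
an assumed example, and the function name is invented
here for illustration.

```python
import math

H = 6.62607015e-34          # Planck constant in J*s (exact SI value)
HBAR = H / (2 * math.pi)    # reduced Planck constant, h over two pi

def min_momentum_uncertainty(delta_x):
    """Smallest delta p allowed by delta_x * delta_p >= hbar / 2."""
    return HBAR / (2 * delta_x)

# Assumed example: a particle localized to about 1 angstrom (1e-10 m).
delta_x = 1e-10
delta_p = min_momentum_uncertainty(delta_x)

print(f"delta_p >= {delta_p:.3e} kg m/s")
```

Halving delta_x in this sketch doubles the minimum
delta_p, which is the trade-off described above.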
Well to start, in quantum mechanics, position
and momentum are not just numbers, they are
linear operators. Operators are mathematical
objects that act upon functions, and result
in the production of other functions as an
output, similar to the way that functions
act upon values to produce other values. Linear
operators must satisfy the following two properties.
First, operator A acting on the product of
the constant a and the function f of x equals
the constant a times the operator A acting
on the function f of x. Second, operator A
acting on the sum of functions f of x and
g of x equals operator A acting on f of x
plus operator A acting on g of x. We learned
about these relationships in the linear algebra
portion of the mathematics series, where we
saw that linear operators, written as matrices,
can act on vectors to produce other vectors.
The same idea carries over here, with functions
playing the role of vectors.
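Written symbolically, the two linearity properties
stated above read as follows. The hat over A is a
notational choice made here to mark an operator; the
spoken text does not specify one.

```latex
\begin{align}
\hat{A}\,[a\,f(x)] &= a\,\hat{A}f(x) \\
\hat{A}\,[f(x) + g(x)] &= \hat{A}f(x) + \hat{A}g(x)
\end{align}
```

The derivative with respect to x is a familiar
operator that satisfies both conditions.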
But why do we use operators when we deal with
quantum particles and why don’t we use them
when we deal with classical particles? The
reason is that classical particles are macroscopic
objects, meaning that they are much larger
than an atom. When you do the math for a classical
object in motion, its properties, such as
position and momentum, have well-defined values
that you can measure. We can say that some
particle is at this position, and it has this
momentum. As we mentioned, quantum particles
don’t behave like this. They are in several
places at the same time, at all times, and
we need to represent this mathematically somehow.
That’s why we have the wavefunction, which
is a mathematical description of an isolated
quantum system, given by the Greek letter
psi. This gives us an idea of the distributed
presence of a quantum particle. It describes
the state of the particle as a superposition
of all possible states. In other words, it
is the mathematical description of the physical
reality of the particle being in several places
at the same time, before measurement, and
its absolute value squared represents a probability
density function. To be as formal as
possible, the probability density function
P of x equals the product psi conjugate times
psi, which equals the modulus squared of psi
of x, where the term modulus refers to the
square root of a complex number Z times its
complex conjugate, Z star. In case we are
rusty with our complex numbers, recall that
they always have a real part and an imaginary
part, the latter of which includes i, which
represents the square root of negative one.
And if we are dealing with a complex wavefunction,
which normally describes oscillating systems,
it is difficult to interpret psi of x as a
real or physical quantity. That is why we
utilize the complex conjugate, which is the
same expression but with the sign of the imaginary
part reversed. When we multiply a complex
number a plus b i by its complex conjugate,
a minus b i, the imaginary cross terms cancel,
and the i squared term simplifies to negative
one, so we always get the real result a squared
plus b squared. The same concept applies here: if we
take psi conjugate times psi, we necessarily
get a real number, which can tell us something
about physical reality.
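Python has complex numbers built in, so this claim
is easy to try out. The sketch below is only an
illustration: the number 3 + 4i and the values of
k, omega, x and t are arbitrary choices, not values
from the tutorial.

```python
import cmath

z = 3 + 4j                       # arbitrary complex number for illustration
product = z * z.conjugate()      # (3 + 4i)(3 - 4i) = 9 + 16
modulus_squared = abs(z) ** 2    # modulus is sqrt(z * z_conjugate)

# Same idea for a complex exponential wavefunction at one arbitrary point.
k, omega, x, t = 2.0, 3.0, 0.7, 0.1   # illustrative values only
psi = cmath.exp(1j * (k * x - omega * t))
density = (psi.conjugate() * psi).real

print(product, modulus_squared, density)
```

The imaginary part of the product is zero in both
cases, so what is left can be read as a physical
quantity.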
So essentially, the squared magnitude of the
wavefunction is the probability density function
that helps us find where we have a chance
of measuring the particle, which is the kind
of math that was done to determine the shapes
of the atomic orbitals that electrons inhabit
within an atom, which we learned about in
the general chemistry series.
Getting back to the question at hand, because
we have to represent quantum objects with
the wavefunction, we need operators that can
act on the wavefunction to retrieve information
encoded within, and can provide answers in
the form of measurements. Later we will expand
considerably on what a wavefunction is. For
now let’s talk about how the actions of
operators on wavefunctions can give us information
about quantum objects.
Mathematically, if we have a function psi
of x, when an operator A acts upon it, we
get another function, phi of x. Again, this
is analogous to the way that a function
f acting on a quantity x gives another quantity
y, except that operators act on functions.
Here are some examples of operators to help
make this concept perfectly clear. For this
first one, operator one on psi equals the
partial derivative of psi with respect to
time. Next, operator two on psi equals x times
the partial derivative of psi with respect
to x. And lastly, operator three on psi equals
alpha times psi, where alpha is a constant.
Now let’s try applying these operators to
a specific function, such as the following
wavefunction, psi of x and t equals the complex
exponential of kx minus omega t. If you are
familiar with the physics of oscillations,
we usually refer to this type of wave as a
plane wave, because its wavefronts, the surfaces
on which the phase of the wave is constant,
are flat. In this equation, k is the
wave-vector, two pi over lambda, where lambda
is wavelength. The wave-vector describes the
spatial frequency of the wave. In other words,
it tells us how many peaks and valleys the
wave has over a certain length. Then, x is
the position. And omega is the angular frequency,
equal to two pi nu, where nu is the frequency
of the wave. Now let’s apply the three operators
we listed previously. The first operator takes
the partial derivative of psi with respect
to time. If this seems daunting, you may need
to refresh your memory regarding partial derivatives,
as well as the chain rule for differentiation,
and in particular how it applies to functions
resembling e to the x. But if this is familiar
to you, then recall that d over dt means that
t is the variable. The other parameters will
be treated as constants, and realizing that
i distributes across this parenthetical, it
is negative i times omega that will be pulled
down in front of the term, and since the term
equivalent to psi is unchanged, we can report
this as negative i omega psi. The second operator
is x times the partial derivative of psi with
respect to x. This time it is i k that will
be pulled down in front of the term when differentiating,
leaving us with i k x psi. And finally, the
third operator just multiplies psi by the
constant alpha, so we end up with alpha psi.
So now we know how to apply operators to functions.
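For reference, the three results just described can
be collected in symbols. The labels O-hat one, two
and three are names introduced here for the three
example operators; the spoken text only calls them
operator one, two and three.

```latex
\begin{align}
\psi(x,t) &= e^{i(kx - \omega t)} \\
\hat{O}_1\psi &= \frac{\partial \psi}{\partial t} = -i\omega\,\psi \\
\hat{O}_2\psi &= x\,\frac{\partial \psi}{\partial x} = ikx\,\psi \\
\hat{O}_3\psi &= \alpha\,\psi
\end{align}
```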
Now, as we said, in quantum mechanics, position
and momentum are operators. The position operator
in one dimension can be written thusly, as
the position operator x times psi of x equals
x times psi of x, where x is the eigenvalue
of the operator x acting on psi of x. We talked
about eigenvalues and eigenvectors at great
length in one of the linear algebra tutorials,
and it is quite important to understand what
these are, so visit that tutorial if you need
a refresher on this terminology. Also be sure
to notice the difference in notation between
the operator x and the variable x, as these
mean different things.
The momentum operator is a bit more complicated.
Here, the momentum operator p acting on psi
of x equals negative i times h bar times the
partial derivative of psi of x with respect
to x. We use a partial derivative here for
two reasons. First, the wavefunction, as we
will see later, also depends on time, which
means we should actually write it as psi of
x and t. The second reason is that it is defined
in three dimensions, where the complete form
could be written as psi of x, y, z, and t.
Therefore, if we want to study the dynamics
of a quantum particle in three dimensions,
we’ll need these three operators. These
are the momentum operator in x, in y, and
in z, each containing a partial derivative
with respect to that particular variable.
These equations display the orthogonal components
of a three-dimensional system. However, for
the time being, we will focus only on one-dimensional
problems.
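In symbols, the three momentum operators mentioned
above would look like this. This is a sketch of the
standard component form, with hats again used as an
assumed notation for operators.

```latex
\begin{align}
\hat{p}_x\,\psi &= -i\hbar\,\frac{\partial}{\partial x}\,\psi(x,y,z,t) \\
\hat{p}_y\,\psi &= -i\hbar\,\frac{\partial}{\partial y}\,\psi(x,y,z,t) \\
\hat{p}_z\,\psi &= -i\hbar\,\frac{\partial}{\partial z}\,\psi(x,y,z,t)
\end{align}
```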
Now, we easily identified the eigenvalue of
the position operator, but it’s not so obvious
in these expressions for the momentum operator.
To find an intuitive answer, we can apply
what is referred to as the de Broglie principle,
and write our quantum object in its waveform,
psi equals the complex exponential of kx minus
omega t, where again k is the wave-vector,
omega is the angular frequency, and
t is time. If we now apply the momentum operator,
we see that negative i h bar goes out front,
and then we take the partial derivative of
psi with respect to x. Since we have ikx in
the exponent, ik must come down in front of
the complex exponential, and the rest remains
the same. The two versions of i with opposing
sign cancel one another out because negative
i times i equals one, and we can express this
section simply as psi again, which means we
get h bar times k times psi. Therefore, we
see that the eigenvalue of the momentum operator
for the wavefunction we chose is h bar times k.
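This eigenvalue can also be checked numerically. The
Python sketch below replaces the partial derivative
with a central finite difference, which is only an
approximation, and sets h bar to one for readability.
The values of k, omega and the sample point are
arbitrary, and the function names are invented for
this example.

```python
import cmath

HBAR = 1.0    # natural units, an assumption made for readability
K = 2.5       # illustrative wave-vector
OMEGA = 1.2   # illustrative angular frequency

def psi(x, t=0.0):
    """Plane wave exp(i(kx - omega t))."""
    return cmath.exp(1j * (K * x - OMEGA * t))

def momentum_op(f, x, step=1e-5):
    """Apply p = -i * hbar * d/dx using a central finite difference."""
    derivative = (f(x + step) - f(x - step)) / (2 * step)
    return -1j * HBAR * derivative

x0 = 0.3
ratio = momentum_op(psi, x0) / psi(x0)   # p psi divided by psi

print(ratio)   # expected to be close to HBAR * K
```

Up to the small error of the finite difference, the
ratio does not depend on where x0 is placed, which is
what it means for h bar times k to be an eigenvalue.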
This is actually another way of expressing
the de Broglie relation, and it will be worth
our while to derive this result starting from
here as well. The relation states that lambda
equals h over p, where lambda is wavelength,
h is the Planck constant, and p is momentum.
We can rearrange to get p equals h over lambda.
Then let’s take the definition of the wave-vector,
k equals two pi over lambda, and rearrange
to solve for lambda, which is two pi over
k. Plugging this in for lambda in the other
expression, we get p equals h over the quantity
two pi over k, and simplifying gives us hk
over two pi. We know that h over two pi can
be expressed as h bar, so we are left with
p equals h bar times k, just as we found through
the previous method.
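The steps of that short derivation, written out:

```latex
\begin{align}
\lambda &= \frac{h}{p} \quad\Rightarrow\quad p = \frac{h}{\lambda} \\
k &= \frac{2\pi}{\lambda} \quad\Rightarrow\quad \lambda = \frac{2\pi}{k} \\
p &= \frac{h}{2\pi / k} = \frac{h k}{2\pi} = \hbar k
\end{align}
```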
Now let’s pause for a moment. We will come
back to these two important operators a bit
later. First, it will be a good idea to define
some important operator properties, such that
we can apply them to operators found in quantum
mechanics. Some basic rules to know when dealing
with operators are as follows.
One. Operators can be defined as the sum of
other operators. For example, S equals A plus
B. What this means is that the action of S
on psi is equal to the action of A on psi
plus the action of B on psi.
Two. If an operator contains constants or
complex numbers, these remain constants. So
with C, which equals beta times B, where beta
is a complex number, beta remains a constant
when the operator acts on a function. So applying
C to psi, we get beta times the action of
B on psi.
Three. Operators that represent what we refer
to as observables are Hermitian, or self-adjoint.
These are operators whose eigenvalues
are guaranteed to be real numbers. Position and
momentum are Hermitian operators, because
any measurement of these parameters must be
a real number, and it makes sense intuitively
that we would refer to position and momentum
as observables. Mathematically, we can easily
identify a Hermitian operator if it is equal
to its own conjugate transpose, a relationship
which is represented symbolically as A equals
A dagger. We talked about Hermitian matrices
in a linear algebra tutorial, so head over
there if this seems a bit fuzzy. Otherwise,
just recall that if an operator is in matrix
form, the conjugate transpose involves taking
the complex conjugate of every entry, represented
by a star, and then transposing the matrix,
represented with a T, and these two operations
combined are represented by the dagger symbol.
And finally, four. Operators generally do
not commute. This means that AB is not necessarily
equal to BA. Therefore, the order of operators
must be followed strictly. If you encounter
an operator M which equals AB, if this operator
is to act on psi, one should operate such
that M acting on psi will be equal to A acting
on the result of B acting on psi, so B must act first.
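Rule three can be tried out on small matrices. The
Python sketch below builds the dagger operation from
its two steps, conjugating every entry and then
transposing, and tests whether a matrix equals its
own dagger. The two example matrices and the function
names are made up for illustration.

```python
def dagger(matrix):
    """Conjugate transpose: conjugate each entry, then swap rows and columns."""
    rows, cols = len(matrix), len(matrix[0])
    return [[complex(matrix[r][c]).conjugate() for r in range(rows)]
            for c in range(cols)]

def is_hermitian(matrix):
    """True when the matrix equals its own conjugate transpose (A = A dagger)."""
    return dagger(matrix) == matrix

# Hypothetical examples: real diagonal, off-diagonal entries that are
# complex conjugates of each other.
A = [[2, 1 - 1j],
     [1 + 1j, 3]]

# Off-diagonal entries are equal rather than conjugate, so this one fails.
B = [[0, 1j],
     [1j, 0]]

print(is_hermitian(A), is_hermitian(B))
```

Note that a Hermitian matrix must have real numbers
on its diagonal, since each diagonal entry has to
equal its own complex conjugate.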
Now let’s quickly apply these rules to understand
a couple more key points. First, let’s examine
the notion of the power of an operator. Say
we square the position operator x, and allow
that to act upon psi. If we apply the rule
we just learned, x squared acting on psi equals
x acting on the quantity x acting on psi,
so first we get x times psi, and then
the x operator acts again to give us another
x, and we are left with x squared psi. This
will be important later when we get a closer
look at the Schrodinger equation.
And finally, we must mention a very important
operator in quantum mechanics, the commutator.
We write the commutator of two operators A
and B by putting them inside square brackets
separated by a comma, and the commutator of
A and B is defined as AB minus BA. Recall that since
operators generally do not commute, AB is
not the same as BA, and therefore the commutator
typically will not equal zero. It will only
equal zero in the special case that the operators
A and B happen to commute.
This operator can act on some function
just like any other operator, which means
the action of the commutator on psi equals
A acting on B acting on psi minus B acting
on A acting on psi.
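When operators are written as matrices, the
commutator is just a difference of two matrix
products. The Python sketch below uses the three
Pauli matrices as test cases. These are not discussed
in this tutorial and are brought in here only as a
standard example of matrices that do not commute; the
function names are likewise invented.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """[A, B] = AB - BA."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

# Pauli matrices, used purely as an illustration.
sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
sigma_z = [[1, 0], [0, -1]]

print(commutator(sigma_x, sigma_y))   # nonzero: these two do not commute
print(commutator(sigma_z, sigma_z))   # zero: any operator commutes with itself
```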
To make sure we understand, let’s produce
the commutator for the two operators we have
discussed so far, position and momentum, using
position in place of A and momentum in place
of B. Then let’s have this commutator act
on the only wavefunction we’ve used so far.
As we would expect, the action of the commutator
of position and momentum acting on psi equals
x acting on p acting on psi minus p acting
on x acting on psi.
Now to make things a little easier on ourselves,
let’s first just recall what happens when
p and x individually act on the wavefunction.
When x acts on psi, we get x times psi. When
p acts on psi, we get negative i times h bar
times the partial derivative of psi with respect
to x. Now we can apply these definitions to
the expression we have now, one term at a
time. In the first term, x acting on p acting
on psi, the momentum operator must act first.
Since that involves taking the partial derivative
with respect to x, and i and k are constants
with respect to x, we bring those down in front.
Then combining that with the negative i and
the h bar, negative i times i gives us one,
leaving us with
h bar times k times psi. Now x must act upon
this, which just involves multiplying by x,
so the whole first term of the commutator
will be x h bar k psi.
Now the second term of the commutator, p acting
on x acting on psi, is a little trickier,
so let’s compute that separately. As you
can see, first x must act on psi. That gives
us x times psi. Now p must act on the quantity
x times psi, so we have negative i times h
bar, times the partial derivative of x times
psi with respect to x. This time we must use
the product rule for differentiation, which
we are familiar with from our study of calculus,
given that each term in this product contains
x. This will be the derivative of the first
term times the second term, plus the first
term times the derivative of the second term.
The derivative of the first term, x, is simply
one, so that leaves us with just the second
term, psi. Then we have the first term, x,
times the derivative of the second term with
respect to x, which brings i and k down here
to give i k psi, so that’s i k x psi all
together. Now let’s distribute negative
i h bar across this sum to get negative i
h bar psi plus h bar k x psi. Now remember,
this whole expression is the second term in
the commutator, which must be subtracted from
the first, so let’s put it in parentheses
so that we don’t make any careless errors
with sign. And we see that the two h bar k x
psi terms cancel, leaving us with i h bar psi.
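Collected in one place, and using hats as an assumed
notation for the operators, the calculation for the
plane wave psi equals e to the i times kx minus omega
t runs as follows:

```latex
\begin{align}
\hat{x}\hat{p}\,\psi &= x\,(-i\hbar)(ik)\,\psi = \hbar k x\,\psi \\
\hat{p}\hat{x}\,\psi &= -i\hbar\,\frac{\partial}{\partial x}\,(x\,\psi)
  = -i\hbar\,(\psi + ikx\,\psi)
  = -i\hbar\,\psi + \hbar k x\,\psi \\
[\hat{x},\hat{p}]\,\psi &= \hbar k x\,\psi - \left(-i\hbar\,\psi + \hbar k x\,\psi\right)
  = i\hbar\,\psi
\end{align}
```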
So, what we are left with is the realization
that the commutator of position and momentum
equals i times h bar. This is the fundamental
commutation relation between position and
momentum. This relation, which is at the core
of quantum mechanics, is actually the source
of the Heisenberg uncertainty principle. We
can see that position and momentum are conjugate
operators, and they do not commute. Whenever
we encounter non-commuting operators, we will
find a limit on the precision with which we
can simultaneously measure the physical quantities
they represent, the implications of which
we discussed earlier in this modern physics
series.
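The derivation above used a plane wave, but the
product rule step works the same way for any
differentiable function. As a rough numerical check,
the Python sketch below applies the commutator to a
Gaussian instead. The Gaussian, the choice of h bar
equal to one, the sample point, and the function
names are all assumptions made for this example, and
the derivative is approximated by a central finite
difference.

```python
import math

HBAR = 1.0   # natural units, an assumption made for readability

def gaussian(x):
    """Test function chosen for illustration (not from the tutorial)."""
    return math.exp(-x * x / 2)

def p_op(f, step=1e-5):
    """Return the function p f, with p = -i * hbar * d/dx (finite difference)."""
    return lambda x: -1j * HBAR * (f(x + step) - f(x - step)) / (2 * step)

def x_op(f):
    """Return the function x f, i.e. multiplication by x."""
    return lambda x: x * f(x)

def commutator_xp(f):
    """Return the function [x, p] f = x(p f) - p(x f)."""
    xp = x_op(p_op(f))   # p acts first, then x
    px = p_op(x_op(f))   # x acts first, then p
    return lambda x: xp(x) - px(x)

x0 = 0.8
result = commutator_xp(gaussian)(x0)
expected = 1j * HBAR * gaussian(x0)

print(result, expected)
```

The two printed values agree to within the accuracy
of the finite difference, consistent with the
commutator being i times h bar regardless of which
function it acts on.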
So with that, we understand the concept of
an operator, some basic rules for applying
operators, and how to apply the position and
momentum operators to the wavefunction. Now
it’s time to learn more about the wavefunction,
so let’s move forward and do just that.
