We’ve seen in previous videos several ways
of describing light, and their applications.
The simplest model of light is the ray model,
which can be used to analyze image formation
by optical instruments such as microscopes
or telescopes.
The wave model explains why we can see fringes
in a double-slit experiment, and it explains
why the resolution of an imaging system is
fundamentally limited by the wavelength of
the light.
Describing light as a form of electromagnetic
radiation explains why light can be polarized.
We saw that for monochromatic light, the x-
and y-components can oscillate independently,
with different amplitudes and phases.
The complex amplitudes of the two field components
can be described by a Jones vector, where
the time-dependence is omitted.
Different Jones vectors correspond to different
states of polarization.
[1 0] corresponds to horizontally polarized
light, [0 1] corresponds to vertically polarized
light, [1 1] corresponds to 45 degrees diagonally
polarized light, and [1 i] corresponds to
left circularly polarized light.
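These Jones vectors can be written down concretely, for instance with NumPy (a sketch; the variable names and the sign convention for circular polarization are our choice):

```python
import numpy as np

# Jones vectors for the polarization states listed above (normalized
# to unit intensity; the circular-polarization sign convention varies).
H = np.array([1, 0], dtype=complex)                 # horizontal
V = np.array([0, 1], dtype=complex)                 # vertical
D = np.array([1, 1], dtype=complex) / np.sqrt(2)    # 45 degrees diagonal
L = np.array([1, 1j], dtype=complex) / np.sqrt(2)   # left circular

# Each normalized state has unit intensity; H and V are orthogonal.
print(abs(np.vdot(H, V)))  # 0.0
print(abs(np.vdot(D, D)))  # 1.0
```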
Using quantum mechanics, we can describe light
as discrete packets called photons.
With this model, we can explain why different
types of light have different photon counting
statistics, and it tells us how we can overcome
the shot noise limit by using quantum light.
However, in this model we haven’t yet taken
into account the polarization of the photons.
In the following, we will look at how to model
the polarization of photons quantum mechanically,
and how the polarization could be used for
quantum computing.
In summary, we will see that the polarization
of a photon can be described as a two-level
quantum system.
The Jones vector that we know from the classical
description of polarization can be straightforwardly
interpreted as a two-dimensional quantum mechanical
wavefunction.
We can call the basis ket-vectors H and V,
for horizontally and vertically polarized
light.
This is useful if we want to keep track
of what is happening physically.
However, if we want to use photon polarization
for quantum computation, it is more convenient
to call the basis states 0 and 1.
Note that this is just a relabeling of the
vectors.
It doesn’t make any difference for the physics.
Once we know how the polarization of a single
photon can be described, we can move on to
describing the state of a system of multiple
photons.
Then, we will see how we can perform quantum
computations by manipulating multi-photon
states.
The main difference between classical computation
and quantum computation is that a classical
computer processes one input at a time,
and gives one output at a time.
A quantum computer on the other hand, can
take in a superposition of multiple states,
and apply an operation that affects all states
simultaneously.
Let’s see how we can go from the classical
description of polarization to a quantum mechanical
description.
To find information about the polarization
of a light beam, we introduce a polarizer.
In this case, let’s choose the transmission
axis along the horizontal direction.
This means that if we send in horizontally
polarized light, all the light is transmitted.
If we send in vertically polarized light,
all the light is blocked.
If light is polarized at a 45 degree angle,
then the amplitude is reduced by a factor
1/sqrt(2), which means the intensity is reduced
by a factor one half.
Now let’s think of what would happen in
the quantum mechanical case.
If we send in a single photon that is polarized
at a 45 degree angle, then the transmitted
intensity cannot drop by 50% as it would in
the classical case.
The photon has to be either completely transmitted
or completely blocked; we cannot have
half a photon transmitted, because by definition
a photon is the smallest unit in which light comes.
However, if we have many photons, so that
the situation becomes more similar to the
classical case, we can reduce the transmitted
intensity by 50% by saying that each photon
has a 50% probability of being transmitted.
We can capture this quantum mechanically
by describing the polarization state of a
photon with a two-dimensional wave function.
The basis states denote horizontal and vertical
polarization, just like in the classical description
with the Jones vector.
Therefore, the coefficients of our quantum
mechanical wave function are the same as the
coefficients of the Jones vector.
Recall that according to Born’s rule, the
squared modulus of a coefficient denotes the
probability of a measurement outcome if a
measurement is performed.
So to find the probability that a photon is
transmitted by a horizontally oriented polarizer,
we project the wave function on the state
of horizontal polarization, and we take the
squared modulus.
For 45 degrees diagonally polarized light,
we indeed find the expected transmission probability
of one half.
In general, if the Jones vector has coefficients
alpha and beta, then classically the transmitted
intensity is the squared modulus of alpha.
And indeed, in the quantum mechanical description
we find that the transmission probability
of a photon is also the squared modulus of
alpha.
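This projection rule is easy to sketch in code (NumPy; the helper name `transmission_probability` is our own):

```python
import numpy as np

H = np.array([1, 0], dtype=complex)  # horizontal basis state

def transmission_probability(psi, axis=H):
    # Born's rule: project on the transmission axis, take the squared modulus.
    return abs(np.vdot(axis, psi)) ** 2

# A photon polarized at 45 degrees: Jones coefficients alpha = beta = 1/sqrt(2).
diagonal = np.array([1, 1], dtype=complex) / np.sqrt(2)
print(transmission_probability(diagonal))  # 0.5, matching the classical intensity
```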
So to summarize, the coefficients of the classical
Jones vector that describe the state of polarization,
translate directly to the coefficients of
the quantum mechanical wave function.
The transmission probability for a given
orientation of the polarizer is found by projecting
the wave function on the appropriate basis
state, and taking the squared modulus.
So this quantum mechanical wavefunction has
a well-understood classical analogue, namely
the Jones vector.
Therefore it is convenient to use this analogy
to better understand in what ways quantum
mechanical wavefunctions are similar to or
different from classical descriptions.
When discussing quantum mechanics or quantum
computing, it is often said that the special
thing about quantum mechanics is that a particle
can be in a superposition of states.
We can write down the wavefunction of a quantum
bit, or qubit, as the superposition of a 0-state
and a 1-state.
It is often said that this means that the
qubit is 0 and 1 at the same time, because
if you measure the qubit, the measurement
result can be either 0 or 1, and there is
no way to tell up front what the outcome will
be.
However, we saw in the context of photon polarization
that this superposition simply corresponds
to a diagonally polarized photon, and that
it may or may not be transmitted by a horizontally
oriented polarizer.
Does it therefore make sense to say that the
photon is both horizontally and vertically
polarized at the same time?
Not really.
The photon is simply diagonally polarized,
and this could easily be verified by putting
the polarizer in a diagonal orientation.
In that case, the photon will always be transmitted,
and there is no uncertainty whatsoever in
the measurement.
‘Superposition’ is a concept that is well
known in classical wave mechanics as well.
It means that if two different waves overlap,
the total resulting wave is the sum of the
two separate waves.
Quantum mechanics parts ways with classical
wave mechanics when it comes to the act of
measurement.
In quantum mechanics, if you measure a wavefunction,
the outcome is fundamentally probabilistic,
and the wavefunction collapses in a way that
depends on the measurement outcome.
That is something that we don’t see in classical
wave mechanics.
Another subtlety is the phase of
the coefficients of a wave function.
Let’s say a qubit is in an equal superposition
of 0 and 1.
If we want to experimentally determine the
state of the qubit, we have to perform measurements
in one way or another.
So the only thing that we can physically access
are the measurement probabilities, which are
given by the squared modulus of the wave function
coefficients.
But if we can only physically access the modulus
of the coefficients, then what is the physical
meaning of the phase of a coefficient?
For example, if we change 1/sqrt(2) into minus
1/sqrt(2), or i times 1/sqrt(2), the measurement
probabilities stay the same.
So how are these states different physically?
The answer is straightforward if we interpret
these wavefunctions as photon polarization
states.
We see that the first state corresponds to
45 degrees diagonal polarization, the second
state to minus 45 degrees diagonal polarization,
and the third state to left circularly polarized
light.
So the physical difference between the states
is clear.
But how do we distinguish them using measurement
outcomes?
We can easily distinguish the states using
a polarizer in the 45 degrees diagonal orientation.
In this case, photons in the first state will always
be transmitted, photons in the second state will never
be transmitted, and photons in the third state will
be transmitted half of the time.
Mathematically, what we did was describe
the states in a different basis.
We first wrote them in a horizontal-vertical
basis, but by rotating the polarizer, we can
more conveniently describe the states in the
plus 45 degrees – minus 45 degrees basis.
To apply this basis transformation, we first
express the +/-45 degrees vectors in terms
of H and V vectors.
Then, to find the coefficients of the wavefunctions,
we take the inner product of the wavefunctions
with the new basis vectors.
Once we have expressed the wavefunctions in
the new basis, we see that the phases of the
coefficients in the old basis affect the moduli
of the coefficients in the new basis.
So in summary, the phases of the coefficients
become relevant when we change the measurement
basis.
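A small NumPy sketch of this basis change (the state and variable names are ours):

```python
import numpy as np

# The three states discussed above, written in the H/V basis.
state1 = np.array([1, 1], dtype=complex) / np.sqrt(2)    # +45 diagonal
state2 = np.array([1, -1], dtype=complex) / np.sqrt(2)   # -45 diagonal
state3 = np.array([1, 1j], dtype=complex) / np.sqrt(2)   # left circular

# New basis vector: transmission axis of a polarizer at +45 degrees.
plus45 = np.array([1, 1], dtype=complex) / np.sqrt(2)

# In the H/V basis all three states give 50/50 measurement statistics,
# but projecting on the +45 basis vector distinguishes them.
for s in (state1, state2, state3):
    print(abs(np.vdot(plus45, s)) ** 2)  # 1.0, then 0.0, then 0.5
```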
As a final comment, let’s look at pure and
mixed quantum states.
In quantum computing, one major problem is
quantum decoherence, which means that the
state of the system cannot be described by
a quantum wavefunction anymore.
This may seem odd at first: we’ve always
described quantum systems using wavefunctions,
why wouldn’t we be able to do so anymore
all of a sudden?
But when we interpret the wavefunction as
a polarization state, it actually makes sense.
We can describe polarized light using a Jones
vector, but not all light is fully polarized.
Light can also be partially polarized, or
unpolarized.
Classically, partial polarization is explained
by assuming there are rapid phase fluctuations
between the x- and y-components of the electric
field.
If these fluctuations are correlated, then
the phase difference is fixed, and the polarization
state is well defined, so we can describe
the polarization state with a Jones vector,
or a quantum wavefunction.
But the less correlated the fluctuations are,
the less well defined the polarization state
is.
So it is the correlation between E_x and E_y
that determines the degree of polarization.
We can describe this mathematically by introducing
an object called the coherency matrix.
The diagonal elements indicate the intensities
of the x- and y-components.
In case of fully polarized light, the off-diagonal
elements indicate the phase difference between
the two field components.
If the light is partially polarized, the off-diagonal
elements give the correlation between the
field components, which is a measure for the
degree of polarization.
For fully polarized light, the coherency matrix
is given by the outer product of the Jones
vector with itself.
Similarly, in quantum mechanics some systems
cannot be described with a wavefunction.
In that case, the system is in a mixed state,
and it has to be described with a density
matrix.
If the system can be described by a wavefunction,
it is in a pure state, and the density matrix
is then given by the outer product of the
wave function with itself.
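The pure-versus-mixed distinction can be illustrated numerically; here the purity Tr(ρ²) is used as a standard diagnostic (our choice of diagnostic, not introduced in the video):

```python
import numpy as np

def density_matrix(psi):
    # Pure-state density matrix: outer product of the wave function with itself.
    return np.outer(psi, np.conj(psi))

diagonal = np.array([1, 1], dtype=complex) / np.sqrt(2)   # fully polarized
rho_pure = density_matrix(diagonal)

# Unpolarized light: an equal, incoherent mixture of H and V.
rho_mixed = 0.5 * np.diag([1, 0]).astype(complex) \
          + 0.5 * np.diag([0, 1]).astype(complex)

# The purity Tr(rho^2) is 1 for a pure state and 1/2 for a fully mixed qubit.
print(np.trace(rho_pure @ rho_pure).real)    # 1.0
print(np.trace(rho_mixed @ rho_mixed).real)  # 0.5
```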
To summarize, in classical optics a polarization
state can be described with a Jones vector,
while a two-level quantum state is described
by a two-dimensional wave function.
What classically would be a well-defined polarization
state, we would call in quantum mechanics
a pure state.
A partially polarized state corresponds to
a mixed state.
The coherency matrix that in classical optics
can be used to describe a partially polarized
state corresponds in quantum mechanics to
a density matrix.
We have seen in a previous video that polarization
states can be described by points in the Poincaré
sphere.
Fully polarized states lie on the shell of
the sphere, and the less polarized the state
is, the closer it lies to the center of the
sphere.
The equivalent in quantum mechanics is called
the Bloch sphere.
We now know how to describe a two-level quantum
system, which can serve as a quantum bit of
information, or qubit.
But if we want to perform quantum computations,
we will need more than just a single qubit.
So let’s see how we can describe quantum
systems consisting of multiple qubits, which
we for now assume to be polarized photons.
Let’s say the first photon is in a state
described by coefficients alpha and beta,
and the second photon is in a state described
by coefficients gamma and delta.
We now want to describe the entire system
of two photons using a single quantum wavefunction.
This new wavefunction is given by what is
called the tensor product of the two single-photon
wavefunctions.
To see what this means, we can write down
the product of the two states, and expand
it.
What we now have is a four-dimensional vector,
where each basis vector describes a polarization
state of both photons.
Now let’s write the single-photon states
as column vectors.
From the expanded tensor product, we can directly
read what the column vector of the two-photon
state looks like.
Note that to order the entries of the vector,
we interpret the photon states as binary numbers,
and put them in ascending order.
We can remember how to take the tensor product
as follows: take the first entry of the first
vector, multiply it with the second vector,
and put it in the product vector.
Then take the next entry of the first vector,
multiply it with the second vector, and put
it in the product vector.
If we take the tensor product of higher dimensional
vectors, it is straightforward to see how
this procedure is generalized.
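The entry-by-entry recipe above is exactly what NumPy's `np.kron` computes; a minimal sketch with hypothetical amplitudes:

```python
import numpy as np

# Hypothetical amplitudes for the two photons (any normalized values work).
alpha, beta = 1 / np.sqrt(2), 1 / np.sqrt(2)
gamma, delta = 1.0, 0.0

photon1 = np.array([alpha, beta], dtype=complex)
photon2 = np.array([gamma, delta], dtype=complex)

# np.kron implements the recipe described above: take each entry of the
# first vector and multiply it with the whole second vector.
two_photon = np.kron(photon1, photon2)
print(two_photon)  # [alpha*gamma, alpha*delta, beta*gamma, beta*delta]
```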
So now we know how to take the tensor products
of vectors, and how we can use them to describe
multiple subsystems as one larger overall
system.
Now suppose we want to apply some operations
to the subsystems, and describe them as a
single operator acting on the overall system.
To do this, we need to look at the tensor
product of operators.
Let’s say we apply an operation U to the
first photon, and an operation W to the second
photon.
By writing out the tensor product of the two
vectors, we know what the final total state
should look like.
If we write out the expression, we see a pattern
emerge.
In the top left we have the entry U1 multiplied
by the matrix W. In the top right the entry
U2 is multiplied by W. In the bottom left
we have U3 times W. And in the bottom right
there’s U4 times W. So we can write down
the operator acting on the complete system
in a straightforward manner.
Just like the tensor product of two vectors,
the tensor product of two operators is found
by taking each entry of the first matrix,
multiplying it with the entire second matrix,
and placing the result in the product operator.
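The same function also gives the tensor product of operators; a sketch with example matrices of our own choosing:

```python
import numpy as np

# Example single-photon operations (chosen for illustration):
# U rotates the first photon's polarization by 45 degrees, W swaps H and V.
U = np.array([[1, -1], [1, 1]], dtype=complex) / np.sqrt(2)
W = np.array([[0, 1], [1, 0]], dtype=complex)

# np.kron places each entry of U times the whole matrix W in a 4x4 block pattern.
UW = np.kron(U, W)

psi1 = np.array([1, 0], dtype=complex)
psi2 = np.array([0, 1], dtype=complex)

# Acting on the joint state with the joint operator gives the same result
# as acting on each photon separately and then taking the tensor product.
print(np.allclose(UW @ np.kron(psi1, psi2), np.kron(U @ psi1, W @ psi2)))  # True
```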
Now let’s see what happens to the complete
system if we perform a measurement on a subsystem.
Let’s say we measure the first photon and
find it in state 0, that is, it’s horizontally
polarized.
Then the state of the second photon is found
by projecting the total state on the 0 state
of the first photon.
Similarly, if we measured the first photon
and find it in state 1, so vertically polarized,
then we find the state of the second photon
by projecting the total state on the 1 state
of the first photon.
In this case, we find that no matter the outcome
of the measurement on the first photon, the
second photon will always end up in the same
state.
This makes sense, because we defined the initial
state of the total system as the tensor product
of two completely independent subsystems.
Writing down a different mathematical
expression doesn't change the fact that
the two subsystems are independent.
But now let’s write down another state that
cannot be written as the tensor product of
two independent states.
Now if we measure the first photon to be 0,
the second photon must also be 0.
And if we measure the first photon to be 1,
the second photon must also be 1.
So if we were to measure the second photon,
the outcome depends on the measurement outcome
of the first photon.
In other words, one would observe a correlation
in the measurement outcomes.
When this is the case, we say the two photons
are not two independent objects with separate
states anymore, but rather, they share a single
entangled state.
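The projection rule for a measurement on one photon can be sketched as follows (the helper `measure_photon1` is our own construction):

```python
import numpy as np

# The entangled state (|00> + |11>)/sqrt(2) as a four-dimensional vector.
state = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
coeffs = state.reshape(2, 2)  # row index: photon 1; column index: photon 2

def measure_photon1(coeffs, outcome):
    # Project on the given outcome for photon 1; return the probability
    # and the normalized post-measurement state of photon 2.
    projected = coeffs[outcome]
    prob = float(np.sum(np.abs(projected) ** 2))
    return prob, projected / np.sqrt(prob)

p0, photon2_after = measure_photon1(coeffs, 0)
print(p0)             # 0.5
print(photon2_after)  # photon 2 has collapsed to |0>
```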
Unentangled states are known classically as
well: simply take two polarized light beams,
each with their own Jones vector, and take
the tensor product of the two vectors.
Then, you have an unentangled state that is
the superposition of four two-bit states.
But having an entangled state for two spatially
separated particles is a quantum phenomenon.
Let’s look at some other examples.
Consider the equal superposition of all two-bit
states.
This is an unentangled state, because you
can factor it into two equal superpositions
of one-bit states.
But now let’s change one plus into a minus.
It may look like a minor change, but if we
now measure the first photon, we see that
depending on the measurement outcome, the
second photon is either +45 degrees diagonally
polarized, or -45 degrees diagonally polarized.
So by introducing a minus sign, the state
has become entangled.
For the final example, consider once more
the equal superposition of all two-bit states.
We can rewrite the coefficients, and then
change two of them slightly in a way such
that their squares still add up to one.
Now we see once more that we have created
an entangled state, since the state of the
second photon depends on the measurement outcome
of the first photon.
However, the two states only differ slightly,
so you wouldn’t notice much of the entanglement.
What this shows is that entanglement isn't
a binary property, where two particles either
are or aren't entangled.
Rather, there is a continuous spectrum,
and one can speak of higher and lower degrees
of entanglement.
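One standard way to quantify this degree, not covered in the video, is the entanglement entropy of one photon's reduced state; a sketch:

```python
import numpy as np

def entanglement_entropy(state):
    # Von Neumann entropy (in bits) of one photon's reduced state,
    # computed from the Schmidt coefficients via an SVD.
    svals = np.linalg.svd(np.reshape(state, (2, 2)), compute_uv=False)
    p = svals ** 2
    p = p[p > 1e-12]
    return float(abs((p * np.log2(p)).sum()))  # abs avoids returning -0.0

product = np.kron([1, 1], [1, 1]) / 2         # unentangled: entropy 0
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)    # maximally entangled: entropy 1
slight = np.array([0.6, 0, 0, 0.8])           # entangled, but not maximally

print(entanglement_entropy(product))  # 0.0
print(entanglement_entropy(bell))     # 1.0
print(entanglement_entropy(slight))   # about 0.94
```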
For our first example of quantum computation,
we look at Grover’s algorithm.
Let’s first understand the problem the algorithm
tries to solve.
Suppose we have a list of names, and the names
are not sorted in any particular order, such
as alphabetically.
We can denote each entry with a number from
1 to 8, or, to more clearly see the relation
with the qubit states, we can denote each
entry with a binary number from 0 to 7.
The problem that we’re trying to solve is
to find the number corresponding to a particular
name on the list.
Classically, the best we can do is go through
the entries one by one.
So first we check entry 0 to see if that’s
the name we want.
If it isn’t, then we try the next entry
of the list, and we go on like this until
we find the right name.
So the number of function evaluations we need
to perform can range anywhere from 1 to 7,
depending on luck.
Now we want to see whether we can do any better
using quantum computation.
The first thing we need to do is to define
the following function: the input state is
multiplied by -1 if it corresponds to the
correct name.
Otherwise it remains unchanged.
So if our input state is 000, then the output
state is also 000.
The same goes for the other states, except
for state 100, for which the function returns
minus 100.
Now comes the trick: instead of trying one
state at a time, we create as input a superposition
of all states.
After evaluating the function only once, the
output state already contains the information
about which state is the correct state.
However, we cannot extract that information
from the output state as it is currently.
If we were to measure this state, the outcome
could be any random state with equal probability.
So we need to perform some extra steps.
More specifically, we need to manipulate the
wavefunction so that the coefficient for the
100 state has a high magnitude.
That would mean that if we perform a measurement,
we would find the correct state 100 with high
probability.
To do this, we introduce an operation that
involves inverting around the mean.
We can plot the coefficients of the wavefunction,
and indicate the mean value in the plot.
To invert around this mean, we subtract the
mean, flip the sign of all the coefficients,
and add back the mean.
We see that by applying this operation, the
coefficient for the 100 state now indeed has
a higher magnitude.
Just to verify whether this operation is indeed
a valid quantum operation, we can add the
squares of all coefficients to see that they
indeed still add up to 1.
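The sign flip and the inversion about the mean are easy to verify numerically for the 8-entry example:

```python
import numpy as np

n = 8        # list of 8 entries, as in the example
marked = 4   # entry 100 in binary

psi = np.full(n, 1 / np.sqrt(n))   # equal superposition of all entries
psi[marked] *= -1                  # the oracle flips the sign of the marked entry

# Inversion about the mean: subtracting the mean, flipping all signs, and
# adding the mean back maps each coefficient c to 2*mean - c.
psi = 2 * psi.mean() - psi

print(psi[marked] ** 2)   # probability of measuring 100 is now 0.78125
print(np.sum(psi ** 2))   # the state is still normalized to 1
```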
Now let’s repeat what we’ve done, but
now we also write down the mathematics.
The first thing we did was to generate an equal
superposition of all states.
We can write this as a column vector.
Then we flipped the sign of the entry we’re
looking for.
We can write this as a matrix operation, which
we call U omega.
This operator is also known as the oracle,
because it contains the answer to our problem,
but we don’t know a priori what this operator
actually does.
The next step was to invert around the mean.
First we compute the mean, which can be done
by taking an inner product.
By defining the vector s, we can write the
inner product more concisely.
To obtain a vector filled with the mean value
mu, we can multiply mu by the vector s.
Now that we have calculated the mean, we can
subtract it from the wavefunction.
Then we flip the sign, and then we add back
the mean again.
Now we can write the inversion around the
mean compactly as an operator U_s, which is
also called the Grover diffusion operator.
So now we have found two operators: one that
evaluates the list of names, and one that
inverts the wavefunction around the mean.
Grover’s algorithm can then be written as
follows: we start with a wave function that
is the equal superposition of all entries,
and then we repeatedly apply the Grover iteration,
which is composed of the oracle operator followed
by the diffusion operator.
We need to keep iterating until the magnitude
of the coefficient of the correct entry is
maximal.
If we then measure the wavefunction we will
find the correct entry with high probability.
The number of required iterations depends
on the length of the list.
If we have 3 qubits, then the list is 2^3=8
entries long.
If the list has N entries, then it can be
shown that the algorithm requires on the order
of square root N iterations.
So if the list has a million entries, then
using quantum computation we require only
a thousand evaluations of the list, whereas
classically we could require up to a million
evaluations.
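The whole algorithm can be simulated with state vectors (a sketch, not a gate-level circuit; the iteration count round(π/4·√N) is the standard choice):

```python
import numpy as np

def grover_search(n_entries, marked):
    # A state-vector sketch of Grover's algorithm.
    psi = np.full(n_entries, 1 / np.sqrt(n_entries))
    n_iterations = int(round(np.pi / 4 * np.sqrt(n_entries)))
    for _ in range(n_iterations):
        psi[marked] *= -1             # oracle U_omega
        psi = 2 * psi.mean() - psi    # diffusion U_s: invert about the mean
    return psi, n_iterations

final_state, n_iterations = grover_search(1024, marked=123)
print(n_iterations)           # 25 iterations, on the order of sqrt(1024) = 32
print(final_state[123] ** 2)  # probability of the correct entry is close to 1
```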
Now let's think a bit more about how quantum
mechanics helped speed up the solution
to the problem.
We started with a wavefunction that is a superposition
of all inputs.
While this is an important step in a quantum
algorithm, generating the superposition is
not something inherently quantum mechanical.
If you have two diagonally polarized classical
light beams and want to describe them using
a single vector, you also get an equal superposition
of all states.
The quantum mechanics comes in when we check
the list and flip the sign of one entry.
When we do this, we create entanglement between
the two photons.
As we invert around the mean and apply more
iterations, the state keeps being entangled.
For our second example of quantum computation,
we look at Shor’s algorithm.
The problem we’re trying to solve here is
factoring a large number N.
So if N is the product of two prime numbers
p and q, we need to find p and q while we
only know N. To solve the problem, we take
the following steps.
First we reformulate the problem.
This step is completely unrelated to physics
or quantum mechanics; it is purely mathematical.
By applying some mathematical theorems, we
can demonstrate that we can find the factors
of N by generating a series of numbers, and
then finding the period of that series.
Then, quantum computing is used to find the
period of the series more efficiently.
Classically, we only compute one number at
a time until we find some repetition.
But quantum mechanically, we can evaluate
the entire series in one go.
The problem then is that even though the relevant
information is present in the quantum wavefunction,
we cannot extract it yet by performing a measurement.
Rather, we first have to apply a Quantum Fourier
Transform to the wavefunction, and then we
can perform a measurement that reveals the
period of the series.
So let’s see how we can reformulate the
problem.
As an example, let’s take N=209, which has
as prime factors 11 and 19, which we aim to
find.
What we need to do is pick some random number
‘a’.
In our example, let’s pick a=12.
Now we’re going to create a series of numbers,
where the kth number is given by a^k modulo
N. What this means is that we raise ‘a’
to the power k, and calculate the remainder
if we divide it by N. Let’s calculate this
series for several k.
For k equals 0 we get 1.
Then we get 12, 144, 56, 45, 122, and then
1 again.
If we were to continue this series, we would
find the numbers repeat themselves, with a
period of 6.
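This series and its period can be checked directly (Python's three-argument `pow` computes a^k mod N efficiently):

```python
# The series s_k = a^k mod N from the example: a = 12, N = 209.
N, a = 209, 12

series = [pow(a, k, N) for k in range(7)]
print(series)  # [1, 12, 144, 56, 45, 122, 1]

# Brute-force period finding: the smallest r > 0 with a^r mod N == 1.
# (This is the step a quantum computer is meant to speed up.)
r = 1
while pow(a, r, N) != 1:
    r += 1
print(r)  # 6
```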
Let’s prove that indeed the series is periodic,
and the period is given by that k for which
a^k mod N equals 1.
The periodicity of the series follows from
the observation that the value of s_k determines
the value of s_(k+1), which we will now demonstrate.
If we know s_k, then s_(k+1) is given by 'a'
times s_k mod N.
This is because s_k is the remainder of a^k
divided by N, which means there is some integer
m such that a^k equals N times m plus s_k.
To calculate a^(k+1), we multiply both sides
by a.
By definition, s_(k+1) is given by the remainder
of a^(k+1) divided by N.
If we substitute our expression for a^(k+1),
we can eliminate any multiples of N, since
they don't affect the remainder.
We then find what we were trying to prove,
namely that s_(k+1) is directly determined by
s_k.
What this means is that if s_6 equals s_0,
then s_7 must equal s_1 because their previous
numbers are the same.
But if s_7 equals s_1, then s_8 must equal
s_2, because their previous numbers are the
same.
We can keep continuing this argument indefinitely,
from which it follows that if a certain number
appears twice in the series, it must appear
periodically.
In particular, we know that for any ‘a’
and any N, s_0 must always be 1.
So if r is such that s_r equals 1, then the
series must be periodic with period r.
Now the question is, how is this knowledge
going to help us find the prime factors of
N?
We now know that if the series is periodic
with period r, then a^r mod N must equal 1.
We can subtract 1 from both sides, and then
factorize the expression.
This equation says the number on left has
a remainder of 0 when divided by N, which
means it must equal some integer m times N.
We furthermore know that N equals p times
q, which are the two numbers we are trying
to find.
For the sake of notation, let's call a^(r/2) + 1
'A', and a^(r/2) - 1 'B'.
If r is an even number, then A and B are integers.
‘r’ will not always be an even number,
but let’s assume that we’re lucky enough
so that it is even.
In our example, r equals 6, so indeed it is
even.
Now we have to be lucky a second time, and
assume that neither A nor B are multiples
of N.
If indeed they are not, A must contain one
prime factor of N, and B must contain the
other prime factor of N.
If these expressions for A and B are valid,
then it follows that the largest number that
divides both A and N is p, and the largest number
that divides both B and N is q.
In other words, the greatest common divisor, or
gcd, of A and N is p, and the gcd of B and
N is q.
The gcd of a pair of numbers can be found
straightforwardly by using the Euclidean algorithm,
which works as follows.
If we want to find the greatest common divisor
of 1729 and 209, we first calculate that 1729
mod 209 equals 57.
Then we claim that the gcd of 1729 and 209
equals the gcd of 209 and 57.
Let’s demonstrate that this is indeed the
case.
The fact that 1729 mod 209 equals 57 means
that 1729 equals some integer times 209 plus
57.
Now let K be the greatest common divisor of
1729 and 209.
That means that 1729 equals some integer m times
K, and 209 equals some other integer n times
K, where m and n cannot share any
common factors.
By rewriting the equation, we see that 57 equals
(m - 8n) times K, so K must also be a divisor of 57.
So K divides both 57 and 209.
Furthermore, we can demonstrate that m-8n
shares no common factors with n.
To see that this is true, let’s assume the
opposite, namely that they do share a common
factor f.
That means that dividing the numbers by f
gives two integers, and because 8 times n/f
is an integer, then m/f must be integer as
well.
But that means that m and n share the common
factor f, which contradicts our earlier assumption
that m and n share no common factors.
Therefore, indeed m-8n shares no common factors
with n.
And from that it follows that the greatest
common divisor of 209 and 57 equals K, which
was assumed to be the greatest common divisor
of 1729 and 209.
So we see that indeed the gcd of 209 and 57
equals the gcd of 1729 and 209.
We can repeat this logic to find it is also
equal to the gcd of 57 and 38, which is equal
to the gcd of 38 and 19, which is 19.
And indeed, 19 was one of the prime factors
we were trying to find.
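The Euclidean algorithm is a few lines of code; the values below follow the worked example (B = 12^3 - 1 = 1727 gives the other factor):

```python
def gcd(x, y):
    # Euclidean algorithm: repeatedly replace (x, y) by (y, x mod y).
    while y:
        x, y = y, x % y
    return x

# The worked example: gcd(1729, 209) -> gcd(209, 57) -> gcd(57, 38)
# -> gcd(38, 19) -> gcd(19, 0) = 19.
print(gcd(1729, 209))  # 19
print(gcd(1727, 209))  # 11, the other prime factor
```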
So let’s summarize how we reformulated the
factorization problem.
We picked some random number ‘a’.
We calculated the series a^k mod N, and found
the period r of that series.
Then, using ‘a’ and r, we calculated a
new number a^r/2 plus or minus 1, and the
greatest common divisor of this new number
and N yields with decent probability a prime
factor of N.
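The classical part of this recipe can be put together as follows (a sketch with brute-force period finding, which assumes gcd(a, N) = 1; the function name is ours):

```python
from math import gcd

def factor_via_period(N, a):
    # The classical reduction summarized above: find the period r of
    # a^k mod N, then try gcd(a^(r/2) +/- 1, N). Returns a nontrivial
    # factor of N, or None in the unlucky cases.
    r = 1
    while pow(a, r, N) != 1:
        r += 1
    if r % 2 == 1:
        return None                      # unlucky: r is odd
    half = pow(a, r // 2, N)
    for candidate in (gcd(half + 1, N), gcd(half - 1, N)):
        if 1 < candidate < N:
            return candidate
    return None                          # unlucky: only trivial divisors

p = factor_via_period(209, 12)
print(p, 209 // p)  # 19 11
```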
So far, we have only done mathematics, and
no physics or quantum mechanics whatsoever.
The quantum mechanics comes in useful when
we want to find the period of the series.
Classically, we must compute the series one
number at a time.
But with quantum computing, we can do it more
efficiently.
So let’s see how we can efficiently find
the period of the series using quantum computing.
We need to introduce two registers of qubits.
For our example, let’s say that the registers
are ten qubits large.
We set the first register to be in a maximal
superposition of all 1024 states, and we set
the second register to be in the ground state.
Note that technically these quantum states
should be normalized so that the squares of
all coefficients add up to 1, but for the
ease of notation we will omit this normalization
constant.
We can relabel the states by converting the
binary numbers to decimal numbers.
So we have two registers, one in a state of
maximal superposition, the other in the ground
state.
We can write down the wavefunction of the
entire system by taking the tensor product
of the wavefunctions of the two registers.
Now we need to apply modular exponentiation
to all the states of the superposition.
When we apply this to one particular term
of the superposition, the state of the first
register remains unchanged, while the second
register is mapped to a^k mod N.
If we consider again our example of N equals
209, and ‘a’ equals 12, we find a wavefunction
where the labels of the states of the first
register go from 0 to 1023, and the labels
of the states of the second register repeat
with a period of 6.
So we see that the wavefunction already contains
the information we want to find, namely the
period of the series.
Now we need to manipulate the state such that
we can extract this information.
To do this, we first apply a measurement to
register 2.
We will get one of six possible outcomes,
and depending on the measurement outcome,
register 1 will collapse to a certain state.
For example, if we measure register 2 to be
in state 1, register 1 collapses to the series
0, 6, 12, etcetera.
If we measure register 2 to be in state 12,
register 1 collapses to the series 1, 7, 13
etcetera.
And the same goes for the other possible measurement
outcomes.
But whatever the measurement outcome for register
2, we know with certainty that register 1
will collapse to a superposition of states
whose labels have interval 6.
Let’s assume that we measured register 2
to be in state 144.
Then register 1 collapsed to the superposition
of 2, 8, 14, etcetera.
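A classical check of this collapse rule: the first-register labels that survive a measurement outcome of 144 are exactly the exponents k with a^k mod N = 144, and since a^2 = 144, they start at 2 and step by the period 6:

```python
# Labels of register 1 consistent with measuring 144 in register 2 (N = 209, a = 12).
N, a, K = 209, 12, 1024
collapsed = [k for k in range(K) if pow(a, k, N) == 144]
print(collapsed[:4])  # [2, 8, 14, 20]
```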
We see that register 1 contains information
about the period that we try to find, but
we still cannot extract it with a straightforward
measurement.
If we were to measure register 1, we would
get one random number from the superposition,
say 62, and then all we have calculated is
that a^62 mod N equals 144.
Instead, we have to do something extra: we
have to apply a Quantum Fourier Transform
to register 1.
For functions of a continuous variable x,
the Fourier transform is very effective in
finding periods.
For example, if a function f(x) consists of
a series of delta peaks that are a period
r apart from each other, then the Fourier
transform of f(x) yields a series of delta
peaks with a period 1/r apart from each other.
So the position of a delta peak in f-hat of
omega gives information about the period of
f(x).
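Written out as a formula, this is the standard Dirac-comb identity (stated here with the 2π convention in the exponent of the Fourier transform):

```latex
f(x) = \sum_{n=-\infty}^{\infty} \delta(x - n r)
\quad \Longrightarrow \quad
\hat{f}(\omega) = \frac{1}{r} \sum_{m=-\infty}^{\infty}
\delta\!\left(\omega - \frac{m}{r}\right)
```

so the peaks of f-hat sit exactly at the multiples of 1/r.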
So if we could apply some sort of Fourier
transform to the wavefunction of register
1, we would get a wavefunction that we could
measure to directly get information about
the period r.
So ideally, if we applied some sort of Fourier
transform to the wave function of register
1, we would get a wavefunction that is a superposition
of 0/6, 1/6, 2/6, etcetera.
But what would a state with label 5/6 physically
mean, in terms of the ten qubits that you measure?
In the end, you measure a sequence of ten
zeroes and ones, which you can interpret as
a binary number, and then convert to a decimal
integer number.
How would you interpret such an integer number
as a fraction like 5/6?
To find out, let’s look in more detail at
what it means to take a quantum Fourier transform
of the wavefunction of register 1.
Let’s denote the wavefunction as psi, and
its quantum Fourier transform as psi-hat.
The coefficients of psi and psi-hat are related
by a discrete Fourier transform.
If we compare this to the expression of the
Fourier transform of a function of a continuous
variable, we find that the continuous Fourier
variable omega is analogous to the discrete
Fourier variable k-prime over K.
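Written out (in the common unitary convention; signs and normalizations vary between texts), the relation between the coefficients is:

```latex
\hat{\psi}_{k'} = \frac{1}{\sqrt{K}} \sum_{k=0}^{K-1}
\psi_k \, e^{2\pi i \, k k' / K},
\qquad K = 1024,
```

so the discrete analogue of the continuous Fourier variable ω is k′/K.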
So if we apply a quantum Fourier transform
to register 1, measure the resulting wavefunction,
and find a value of 853, we can interpret
that value as the Fourier variable being approximately
853/1024.
So we know that ideally the quantum Fourier
transform of register 1 should peak at multiples
of 1/6, but in practice, it is only defined
at multiples of 1/1024.
Since 1/6, 2/6, 4/6, and 5/6 cannot be expressed
as integer multiples of 1/1024, the amplitude
leaks out to adjacent states.
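This leakage can be seen directly in a small numerical experiment: take the collapsed register-1 state (labels 2, 8, 14, … with K = 1024), apply a discrete Fourier transform, and look at the most probable measurement outcomes. Here NumPy’s FFT serves as a classical stand-in for the quantum Fourier transform; its sign convention doesn’t affect where the peaks lie.

```python
import numpy as np

K, r, offset = 1024, 6, 2          # register size, period, first surviving label
psi = np.zeros(K)
psi[offset::r] = 1.0               # collapsed state: labels 2, 8, 14, ...
psi /= np.linalg.norm(psi)
psi_hat = np.fft.fft(psi) / np.sqrt(K)   # unitary discrete Fourier transform
probs = np.abs(psi_hat) ** 2
top = sorted(np.argsort(probs)[-6:])     # the six most probable outcomes
print(top)  # [0, 171, 341, 512, 683, 853] -- each close to a multiple of 1024/6
```

Note that only 0 and 512 are exact multiples of 1024/6; the other four outcomes sit next to the ideal peak positions, with some probability leaked into their neighbors.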
If we now measure register 1, there is still
a probability that we obtain a useless outcome,
such as 0, or a number into which the amplitude
has leaked.
But let’s suppose we measure a useful outcome,
such as 853.
How do we find out that 853/1024 is supposed
to approximate 5/6?
The way to find out is by using continued
fractions to obtain increasingly more accurate
approximations of 853/1024.
We can check for each of these approximations
whether it solves our factorization problem.
So let’s see how we find approximations
from continued fractions.
We start with 1024/853, and split off the
integer part, which is 1.
We rewrite the remaining fraction as 1 divided
by its reciprocal.
Then we again split off the integer part,
which is 4, and rewrite the remaining fraction
as 1 divided by its reciprocal.
And we can repeat this step again.
To find approximations of 1024/853, we can
truncate this sequence at different points.
If we truncate it at the first step, we find
the approximation 1.
If we truncate it at the second step, we find
the approximation 5/4.
If we truncate it at the third step, we find
the approximation 6/5, and it is this approximation
that we were looking for.
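The expansion above can be automated with Python’s `fractions` module. This sketch expands 1024/853 as a continued fraction, exactly as in the steps just described, and rebuilds each truncation into an ordinary fraction:

```python
from fractions import Fraction

def convergents(x):
    """Continued-fraction expansion of x, returned as successive approximations."""
    terms, rem = [], Fraction(x)
    while True:
        a = rem.numerator // rem.denominator   # split off the integer part
        terms.append(a)
        frac = rem - a
        if frac == 0:
            break
        rem = 1 / frac                         # remaining fraction as 1 over its reciprocal
    approx = []
    for i in range(1, len(terms) + 1):         # rebuild each truncation into one fraction
        c = Fraction(terms[i - 1])
        for t in reversed(terms[:i - 1]):
            c = t + 1 / c
        approx.append(c)
    return approx

print(convergents(Fraction(1024, 853)))
# [Fraction(1, 1), Fraction(5, 4), Fraction(6, 5), Fraction(509, 424), Fraction(1024, 853)]
```

The third approximation is 6/5, whose reciprocal 5/6 is the multiple of 1/r we were after.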
This completes the explanation of all the
steps of Shor’s algorithm.
To summarize: we want to find the prime factors
of N. To do this, we pick a random number
‘a’, which we use to generate a series
of numbers.
We find the period of this series using quantum
computation.
This is done by introducing two qubit registers.
The first one is in a maximal superposition
of all states, while the second one is in
the ground state.
We apply modular exponentiation to the second
register, and then measure the second register,
so that the first register collapses to a
superposition of states whose interval is
the period r.
By applying a quantum Fourier transform, and
then measuring register 1, we find an approximation
of a multiple of 1/r.
To find r from this approximation, we use
continued fractions.
Once we successfully find r, we find the prime
factors of N by computing the greatest common
divisor of a^(r/2) plus or minus 1 and N,
using the Euclidean algorithm.
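For the running example (N = 209, a = 12, r = 6), this final classical step indeed recovers the prime factors:

```python
from math import gcd

N, a, r = 209, 12, 6
x = pow(a, r // 2, N)                  # a^(r/2) mod N = 12^3 mod 209 = 56
p, q = gcd(x - 1, N), gcd(x + 1, N)    # Euclidean algorithm, via math.gcd
print(p, q)                            # 11 19, and indeed 11 * 19 = 209
```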
It is important to note that Shor’s algorithm
isn’t guaranteed to return the correct prime
factors on every run.
But it does so with a decent probability,
and because the algorithm is computationally
efficient, it is acceptable to run the algorithm
several times until you get the correct outcome.
