- [Elisa] Now we're gonna continue
with amplitude amplification,
which is another kind of
straightforward way of thinking
to see how Grover works.
So, that's actually the general idea
behind Grover's algorithm.
So, to see that, we will have a look
at each of the three steps
that I defined before.
Let us have a look
at each of the amplitudes.
So, remember we have the first step
where we prepare the equal superposition.
Then the second step, where
we apply our oracle, UF.
And then the third step, where we apply V.
So, the first step,
where we create the equal
superposition state.
Now, okay, let me draw the amplitudes.
So, let's say we have the state zero
and then one
and so on, then we have the element W.
And then the other elements,
up to N minus one.
And here we're plotting the
amplitude for each of them.
So, now, in the beginning if we create
this superposition state,
we have an equal superposition,
so they're all the same here,
they all have the same amplitude.
And the amplitude is not the
probability to measure them,
but just the prefactor of the state,
it's actually the square root of the probability.
So, it will be one over square root of N,
where capital N is two to the small n,
for each of them.
Each element has the same probability,
and the same amplitude.
And we have no different phases.
The phase is always zero between them,
that's why the amplitude is,
you can just write the
amplitude like that.
So, the average amplitude
is also just 1 over square root of N.
Okay, then we have the second step.
The second step is
where we apply our oracle,
which is identity minus 2 times
the projection onto W,
applied to our superposition state.
Let's draw another diagram.
We have the amplitude here,
and we have the states, zero, one,
and so on, then W,
and then the other states.
So, now what does this operation do?
What we know is that
on any state, except for the state W,
it does not do anything,
it just acts as identity.
So we still have the
same amplitude as before.
While on the state W,
for that state, it flips the amplitude.
Because when
we defined the function,
we saw that
when we apply the
unitary UF to our state W,
we will get the state minus W.
So instead of plus, we
are here in the minus.
The absolute value is the
same, but we're down here now.
So, here we have 1 over square root of N.
And here we have minus
1 over square root of N.
Which means that the
average of all amplitudes
is now a bit lower than before.
It is one minus 2 over N,
times 1 over square root of N.
Not so important but yeah,
all of this matters,
but I think it's clear
that the average now drops
because we flipped one of the states.
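Steps one and two can be checked numerically. The following is a minimal NumPy sketch, not code from the lecture; the qubit count `n = 3` and the marked index `w = 5` are arbitrary choices made only for illustration.

```python
import numpy as np

n = 3                  # number of qubits (illustrative choice)
N = 2 ** n             # number of basis states, capital N = 2^n
w = 5                  # hypothetical index of the winning element

# Step 1: equal superposition, every amplitude is 1/sqrt(N).
amps = np.full(N, 1 / np.sqrt(N))
mean_step1 = amps.mean()

# Step 2: the oracle I - 2|w><w| flips the sign of the marked amplitude only.
amps[w] *= -1
mean_step2 = amps.mean()

print("average after step 1:", mean_step1)
print("average after step 2:", mean_step2)
print("(1 - 2/N) / sqrt(N): ", (1 - 2 / N) / np.sqrt(N))
```

The last two printed numbers agree, which is the drop in the average described in the lecture.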
So then we have the third step,
and what we do in the third step,
we apply our unitary V,
also called the diffuser,
which is two times the projection
onto S minus identity,
to the state that we had before.
Again we have the states, zero, one, and so on,
W,
then N minus one.
And now what I claim is that
the effect of the unitary V
is that it reflects
all amplitudes
about the average amplitude.
I want to show you why this holds.
So, assume we have an arbitrary state,
now we really don't need to
care about what our state is,
but just look at an arbitrary state.
So, psi, as an arbitrary pure state,
can be written as the
sum over all basis states i,
each with an amplitude alpha i.
Then, if I apply V to this psi,
this yields
2 times the projection
onto S minus identity,
applied to psi.
Now if I want to write
the projector onto S,
then I get this 1
over square root of N twice.
As normalization factor,
I can write one over N.
I get the sum over all states J,
that's the sum
over all 2 to the n different
basis states in the superposition, for the kets,
but then also for the bras.
So I have to do another
sum because both S's
are sums over all states.
So I take
two different indices
and sum over both of them.
And then this, I...
Let's move over to the next line.
This, I apply to my state, psi.
So, to the sum over all i,
alpha i, i,
minus, and then identity applied
to psi just gives me psi.
So just minus that state.
Now,
I will rewrite it a bit.
What I know is that
the prefactors alpha i,
I can just pull them out.
Just move them to the front.
Or, actually what I do is,
I can then see if I sum over all...
Sorry.
If I get the prefactors out,
I have the inner product
of K and i.
The bra and the ket, or the opposite way.
Sorry, yes. The bra of
K, times the ket of i,
which is the inner product of
states that are, in general,
all orthogonal, except
for when K equals i.
Whenever K is unequal to i,
then it means we have
different basis states,
so they're orthogonal
and this inner product gives me zero.
So in the end, I don't
need to sum over all i,
but the only i that
gives me a non-zero term
is the one where i equals K.
In this case, I will get one.
So, the inner product with the
sum of all i just disappears
and the inner product of K and
i is then one in this case.
So, what I'm left with
is the sum over all K,
alpha K,
over N.
And also I still have the sum over all J.
Okay? Hope this was clear now.
Don't worry if it's too math
heavy, then just skip this part
and just believe me that
it reflects the amplitude.
So you can also just skip this part,
believe me it reflects the amplitude,
I just wanted for those
who are interested,
I wanna very quickly show why
and how one sees that it
actually reflects the amplitude.
For those for whom this
is too mathematical,
just don't worry.
Just listen again in two minutes.
And then, for the second term here,
I just change the index
that I'm summing over to J,
so that, I can do that, right,
I'm just summing over it.
And now I put these two terms
together and I'm getting...
So this part is now two times the
average amplitude,
minus alpha J.
Let's just say it's J.
And taking two times the average minus
an individual alpha J
corresponds exactly to
a reflection of alpha J
about the average alpha.
Which one can see,
again very quickly,
so if alpha J equals,
I can write every alpha
J as some average alpha
plus some delta J.
And then if I look at two
times average alpha minus alpha J,
it would be exactly average alpha
minus delta J.
So instead of alpha plus delta J I'm
having alpha minus delta J,
so I'm having exactly the reflection
about the average alpha.
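Written out, the spoken derivation can be reconstructed as follows. This is a sketch assembled from the steps described in the lecture; the symbol \bar{\alpha} for the average amplitude is notation introduced here, \delta_{ki} is the Kronecker delta, and \delta_j is the deviation of \alpha_j from the average.

```latex
\begin{align*}
V|\psi\rangle
  &= \bigl(2|s\rangle\langle s| - I\bigr)\sum_i \alpha_i |i\rangle \\
  &= \frac{2}{N}\sum_j \sum_k |j\rangle\langle k| \sum_i \alpha_i |i\rangle
     - \sum_i \alpha_i |i\rangle \\
  &= 2\sum_j \Bigl(\frac{1}{N}\sum_k \alpha_k\Bigr)|j\rangle
     - \sum_j \alpha_j |j\rangle
     \qquad \text{using } \langle k|i\rangle = \delta_{ki} \\
  &= \sum_j \bigl(2\bar{\alpha} - \alpha_j\bigr)|j\rangle,
     \qquad \bar{\alpha} = \frac{1}{N}\sum_k \alpha_k . \\[4pt]
\alpha_j = \bar{\alpha} + \delta_j
  \;&\Longrightarrow\;
  2\bar{\alpha} - \alpha_j = \bar{\alpha} - \delta_j .
\end{align*}
```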
So, done with math part, sorry.
So, what I just showed is that applying V
corresponds to reflecting
about the average amplitude.
So we said our average amplitude was here,
now I'm reflecting all the states
that are not the winning state, W.
Then I'm going down somewhere here.
While if I'm reflecting the
one that is the state W,
I can see that it increases by a lot now
because we have to add all that up.
So actually we get somewhere here.
And I can calculate these
values here and here,
I don't think that matters too much.
You can see them in the notes
but I think it's clear that
during this you will decrease the amplitude
of all other elements by a lot,
or by a bit at least, and
then increase the amplitude
of the winning element by a lot.
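The claim that V reflects every amplitude about the average can also be verified with explicit matrices. A minimal NumPy sketch, assuming an illustrative size `n = 3` and a hypothetical marked index `w = 5`:

```python
import numpy as np

n = 3
N = 2 ** n
w = 5                                   # hypothetical marked index

s = np.full(N, 1 / np.sqrt(N))          # equal superposition |s>
V = 2 * np.outer(s, s) - np.eye(N)      # diffuser V = 2|s><s| - I

amps = s.copy()
amps[w] *= -1                           # state after the oracle (step 2)
mean = amps.mean()

reflected = V @ amps                    # step 3: apply V as a matrix

# Applying V is the same as mapping each amplitude a to 2*mean - a.
assert np.allclose(reflected, 2 * mean - amps)

print("unmarked amplitude before/after:", amps[0], reflected[0])
print("marked amplitude before/after:  ", amps[w], reflected[w])
```

The unmarked amplitudes shrink a little, while the marked one ends up well above 1/sqrt(N), as in the diagram described above.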
And then what we do in Grover's algorithm
is we repeat steps two and three.
And so, by repeating steps two and three
the amplitude
of W
will increase even further
which is why we call that
principle amplitude amplification.
'Cause the amplitude of this
element that we're looking for
it gets amplified by a lot,
and then of course
we learned that the probability
to measure the state is
given by the absolute value of
its amplitude, squared.
So if we have a high amplitude,
that means that we also
have a very high probability
to measure that state,
while other states will have
a lower and lower probability
to be measured.
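Repeating steps two and three can be sketched in a few lines. This NumPy example is an illustration, not the lecture's notebook; it assumes `n = 5` qubits, a hypothetical marked index `w = 2`, and four rounds, and it uses the "reflect about the average" rule in place of explicit gates.

```python
import numpy as np

n, w = 5, 2                 # illustrative size and hypothetical marked index
N = 2 ** n
rounds = 4

amps = np.full(N, 1 / np.sqrt(N))   # step 1: equal superposition
probs = []
for _ in range(rounds):
    amps[w] *= -1                   # step 2: oracle flips the marked amplitude
    amps = 2 * amps.mean() - amps   # step 3: reflection about the average
    probs.append(amps[w] ** 2)      # probability of measuring w after this round

for k, p in enumerate(probs, start=1):
    print(f"after round {k}: P(w) = {p:.4f}")
```

The probability of the marked state grows with every round in this range, while the total probability stays normalized.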
Okay, so, now I would again
have time to answer very quickly
maybe one question.
Afterwards I will just
quite quickly mention
how we treat the case
when we have more than one marked element,
and then how to implement that in Qiskit.
- [Bryan] Actually instead of
answering a question this time
Elisa, I'm gonna throw
you a curve ball here
'cause I've been watching
the chat pretty intently
and what I see is kind
of this back and forth
between some people that are saying
this is really difficult, a lot of math,
and other people saying, you know what?
It is, but it needs to be,
and I saw one person say,
and I'm sorry I've forgotten
what your username was,
but they said, you know,
this is a work of art,
and works of art are meant
to be understood over time.
And I thought that was a really
great way to say it,
so Elisa, I just want to give
you maybe a second or two here
or you know, as much time as you want,
to just encourage those people
that are finding this difficult to follow.
You know, give them some
type of encouraging
note, just to help them stay
focused and keep trying.
- [Elisa] Yes, so I'm very
happy to see so many people
attending and trying to follow
even though not all of you
have a strong mathematics
or physics background,
and I mean the stuff that I'm teaching now
is what you would learn, or
the stuff that we're teaching
during the whole summer
school is something
that you would learn maybe over the course
of a whole semester at university.
So it really covers a lot of content.
But yeah we thought that it's
better to cover more content
and then people can re-watch the videos
and look at the details at
home when they have more time
than to slow it down too much,
so it's on purpose a
lot of content, I know that.
And I'm really sorry for those
of you who are struggling
with following me now.
But I really would just encourage you
to look at it again
when you have more time.
Ask questions, discuss
it with your colleagues.
It is a lot of content,
so really don't feel bad if
you cannot follow everything,
and I do also assume that
people have some math,
a bit of math background at least,
so if you don't have that then of course
it's gonna be much harder for you as well.
And yeah, it is a lot.
- [Bryan] That's a reminder as well,
we're proud of all of
you for even being here.
I mean this is, many of you have families,
have jobs you're working full time,
have many other responsibilities
that you're going out
and dealing with on top
of being in this course,
we understand that.
And it really is inspiring
to have so many people
tune in everyday and just
try their best at this.
- [Elisa] Yes.
Thank you, Bryan.
- [Bryan] That's a great way--
Yeah thank you, Elisa.
I think that's a great way
to lead into the tail end of the lecture.
- [Elisa] Okay, so yeah now
it's the last bit of theory.
It's not gonna take long.
I'm just gonna give you a
very quick generalization
of Grover's algorithm for the case
where we have more than
one marked element.
So, so far we were saying
we have one element
that is the winning
element, but for example,
let's say we're looking at some
problem that we want to solve
that has different constraints,
there might be more than
one solution in the end.
So when we have M marked elements,
I mean the winning elements,
the ones that satisfy all constraints,
which I now call W i.
Then what we do is we
define the winning state
as the superposition of all these
marked elements.
So we have to normalize it now.
So this is just an equal
superposition of all of them,
and that also means that
our orthogonal state,
the W orthogonal,
now has a slightly different
normalization factor
because we have M elements
that are not included in it.
So it has one over
square root of N minus M,
instead of N minus one,
which we had before.
Apart from that it doesn't change,
and of course when
we now sum over all X,
we sum over all X except for all the W i.
Not those that
are one of the winning elements,
so just all the others.
Sorry, all that are not in there.
So, if we then write down S
as the sum of W and W
orthogonal as we did before,
all that changes is again the
normalization factors. We
have instead of N minus 1,
N minus M here.
And instead of one over
N, we have M over N
in the second term.
But again we just define it
by cosine of theta over two times W orthogonal,
plus sine of theta over two times W.
And then the whole algorithm
works exactly the same way
as described before.
We will also rotate by theta every time,
however, now the angle, theta, changed
because now sine of theta over two is
square root of M over N,
while before it was
square root of 1 over N.
So what that means is that our
angle, theta, becomes larger.
So, as we rotate every single
time by an angle, theta,
that means that if we have
the same number of elements N,
but now we have M marked elements,
we will actually be faster.
We can see that also when we determine R.
The number of rounds that we have to do
is this ugly formula, pi
over 4 times arcsin of,
and now we have square root
of not 1 over N, but M over N,
minus 1/2, which scales as
the square root of
N over M.
Which tells us that
if M is larger, R is smaller.
So if, for example,
every second element
would be marked, then this
would be a super small number.
But okay, maybe every second element is
not what we're looking
for, but if we increase M,
our algorithm would be even
faster than before.
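The dependence of R on M can be tabulated directly from the formula, reading "pi over 4 times arcsin" as pi divided by (4 times the arcsin). A small sketch in plain Python; the function name `grover_rounds` and the choice `n = 5` are made up for this illustration.

```python
import math

def grover_rounds(n, M):
    """Unrounded number of rounds R = pi / (4 * arcsin(sqrt(M/N))) - 1/2, with N = 2^n."""
    N = 2 ** n
    return math.pi / (4 * math.asin(math.sqrt(M / N))) - 0.5

n = 5
for M in (1, 2, 4, 8, 16):
    print(f"M = {M:2d}: R = {grover_rounds(n, M):.2f}, "
          f"sqrt(N/M) = {math.sqrt(2 ** n / M):.2f}")
```

For M = 16, where every second element is marked, the formula gives R = 0.5, the "super small number" mentioned above.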
Which we can also see
when we look at the
amplitude amplification.
So, in the first step
everything stays the same,
all amplitudes are 1
over square root of N.
There's state zero, state one,
then we have state omega one.
Then we have state omega two.
I'm not gonna plot more of them now
but you could of course do even more.
And then we have state N minus one.
And so they all have the same amplitude.
And if we go to the second step,
we'll now not only flip this one element
so the normal states, the
ones that are not marked,
will stay the same.
But then the state, omega
one, will be flipped,
and the state, omega two,
will also be flipped.
Actually all marked
elements will be flipped.
Which means that the average now
is even, okay maybe too low,
but the average now is even
lower than it was before.
So if you want, I can
give you the formula,
it's 1 minus 2M over N,
times 1 over square root of N.
So before, M was just one.
And we can see that if M is larger,
the average will be
lower, which makes sense
because we've flipped more amplitudes.
Which then implies that for the last step,
if we now reflect them all about
an even lower average,
then these will become
even smaller than before
while we have two that are increasing now.
They will not be as high as
the single element before
but since we have two of them,
we have a higher probability of getting
one of the marked elements now.
So, hold on.
What you can just see
is that it goes faster
because the average decreases by much more
every single time.
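The comparison between one and two marked elements can be made concrete for a single round. A minimal NumPy sketch, assuming `n = 4` qubits and hypothetical marked indices 1 and 2; the helper name `one_grover_round` is invented for this example.

```python
import numpy as np

def one_grover_round(n, marked):
    """Apply oracle + diffuser once; return the post-oracle average and final amplitudes."""
    N = 2 ** n
    amps = np.full(N, 1 / np.sqrt(N))
    amps[marked] *= -1                    # flip all marked amplitudes
    mean_after_oracle = amps.mean()       # equals (1 - 2M/N) / sqrt(N)
    amps = 2 * mean_after_oracle - amps   # reflect about that average
    return mean_after_oracle, amps

mean_one, amps_one = one_grover_round(4, [1])
mean_two, amps_two = one_grover_round(4, [1, 2])

p_one = amps_one[1] ** 2                      # success probability, one marked element
p_two = amps_two[1] ** 2 + amps_two[2] ** 2   # success probability, two marked elements

print("average after oracle (M=1, M=2):", mean_one, mean_two)
print("marked amplitude     (M=1, M=2):", amps_one[1], amps_two[1])
print("success probability  (M=1, M=2):", p_one, p_two)
```

Each marked amplitude is smaller with two marked elements than with one, but the combined probability of hitting a marked element is larger.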
Oh yeah, that's just
another way to see it.
Now I'm gonna show you how
to do that with Qiskit,
also for multiple marked elements.
I wanted to quickly have mentioned that.
And share screen.
Can you see my screen already?
Bryan.
- [Bryan] Yep, I'm sorry
let me focus that for you.
And, there you go, now we should see that.
Just zoom in a little bit
and we should be good to go.
- [Elisa] Sorry, what did you say?
- [Bryan] Sorry, just
zoom in a little bit.
- [Elisa] Oh, zoom in, yes, okay.
- [Bryan] Thank you so much.
- [Elisa] Like this? Is that fine?
- [Bryan] That's good.
- [Elisa] Okay, so, as for the
Deutsch-Jozsa algorithm, again we're gonna--
We start by importing a number of
functions, importing numpy,
importing QuantumCircuit, IBMQ,
it's all the standard
things that we always import
and we load our IBMQ account.
Then, now we have to
define again our oracles.
So this time, I'm not
explicitly defining the oracles
as I did for the Deutsch-Jozsa algorithm
because that gets too complicated now
if I increase my N and if
I have a specific number
of specific marked elements.
So, what I'm doing instead is, okay,
first we start by creating
again a quantum circuit,
where I can give it a name
and the name by default will be oracle.
That's just what you will see later on
in the quantum circuit again.
And then I'm creating a matrix.
I'm creating a matrix
that acts as the oracle,
and what the matrix does
is apply the minus sign to,
it applies a minus one to the,
or what it's supposed to
do is in the end it should
apply a minus one to the marked elements
and just plus one to every other element.
So in all other elements
it should act as identity.
So what we want in the end
is we want a diagonal matrix
that has ones
on all diagonal elements that are
not the marked ones.
And all marked ones we
want to have a minus one.
So we create the matrix, oracle_matrix,
which is just a 2 to the N
dimensional identity matrix.
And then we go through all indices to mark.
So these indices to mark
are something
that I feed the oracle with.
I will tell it which indices to mark,
and for these indices
that are to be marked,
it will change the diagonal element
to minus 1.
So it will put a minus 1 on those,
while the others stay plus 1.
So this is then a unitary that I have.
But now it's a matrix.
So what I need to do is I need
to transform this unitary,
the matrix, into an operator on N qubits.
Which I'm doing here, and
then I apply that unitary.
So, this unitary I have now
applied to my quantum circuit
that I created, and I will
return this quantum circuit.
This is just the phase
oracle that I'm creating.
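The matrix part of the oracle described above can be sketched with NumPy alone. This is not the notebook code shown in the lecture: the function name `phase_oracle_matrix` and its argument names are assumptions, and the step that wraps the matrix into a circuit operator on n qubits is left out here.

```python
import numpy as np

def phase_oracle_matrix(n, indices_to_mark):
    """Diagonal 2^n x 2^n matrix with -1 on marked entries and +1 elsewhere."""
    oracle_matrix = np.identity(2 ** n)      # start from the identity
    for index in indices_to_mark:
        oracle_matrix[index, index] = -1     # put a minus sign on each marked element
    return oracle_matrix

U = phase_oracle_matrix(3, [1, 2])           # hypothetical marked elements 1 and 2
print(np.diag(U))
```

A diagonal matrix with entries plus or minus one is automatically unitary, which is why it can be turned into a gate.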
Then, the other thing that I
wanna create is the diffuser,
which is what it's called in the
textbook, a diffuser,
and which I called V.
So we create another quantum circuit.
We call it diffuser or V.
And now the diffuser, what it does,
let's have a look at the
circuit here on this side.
The V matrix consists of
first applying Hadamard gates.
So we apply Hadamard
gates to all N qubits,
then we apply the phase oracle.
For the phase oracle now,
it's basically the oracle
which takes as its marked element
just the zero element.
So the element, zero, that I'm giving it
is the element zero, zero,
zero, zero on all N qubits
which is exactly the one that is marked
in our second oracle.
So we apply the oracle
on the zero state
and then we apply another,
again Hadamard gates on all N qubits.
And we return that quantum circuit.
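The Hadamard, oracle-on-zero, Hadamard construction can be checked against the projector form of the diffuser. A minimal NumPy sketch with invented helper names; note that this construction gives I - 2|s><s|, which is minus the V = 2|s><s| - I defined earlier, and that overall minus sign is a global phase that does not affect measurement probabilities.

```python
import numpy as np
from functools import reduce

def hadamard_all(n):
    """Matrix of Hadamard gates applied to all n qubits."""
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    return reduce(np.kron, [H] * n)

def diffuser_matrix(n):
    """H on all qubits, phase oracle marking the all-zero state, H on all qubits."""
    N = 2 ** n
    oracle_zero = np.identity(N)
    oracle_zero[0, 0] = -1               # mark only the element 00...0
    Hn = hadamard_all(n)
    return Hn @ oracle_zero @ Hn

n = 3
N = 2 ** n
s = np.full(N, 1 / np.sqrt(N))           # equal superposition |s>
D = diffuser_matrix(n)

print(np.allclose(D, np.identity(N) - 2 * np.outer(s, s)))
```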
And now we can, that's all we need,
now we can just construct
the Grover algorithm.
We start by creating a
quantum circuit with N qubits
and N classical bits so
that we can again in the end
do a measurement because
of course we wanna find
the elements so we need to
do measurements in the end.
We have our N classical bits as well.
Then we determine R,
the number of rounds
that we want to apply.
Now we have this ugly formula here.
Here we go.
So we have the general
one for arbitrary M here.
We have pi over 4 times arcsin
of square root of M over N.
So, M is the length of the marked elements
'cause the marked elements
I would give as an array
which can have any number of elements.
I can just give it this one element
but I can also put in five elements.
But the length of this array will tell me
how many marked elements I have,
and 2 to the power small n is
what gives us the capital N,
the number of possible bit strings,
so that's the square root of M over N,
taking the arcsin, pi
over 4 times that, minus 1/2.
And now of course I'm
probably getting something
that is not an integer, like I'm gonna get 5.3
but I need to have an integer number
so I'm rounding that number.
And I'm also taking integer
because if I'm just rounding it
it's still probably a float, or double,
and then I take the integer so it's...
I can do a for-loop later on that.
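The rounding step can be sketched as follows. The function name `number_of_rounds` is an assumption made for this example. In plain Python 3, `round` with one argument already returns an `int`; the extra `int(...)` mirrors the lecture and matters when the value is a NumPy float.

```python
import math

def number_of_rounds(n, marked):
    """Return the unrounded and the rounded number of Grover rounds."""
    N = 2 ** n                 # capital N: number of possible bit strings
    M = len(marked)            # number of marked elements
    r_exact = math.pi / (4 * math.asin(math.sqrt(M / N))) - 0.5
    r = int(round(r_exact))    # an integer, so it can be used in a for-loop
    return r_exact, r

r_exact, r = number_of_rounds(5, [2])     # five qubits, one hypothetical marked element
print(f"exact R = {r_exact:.2f}, rounded R = {r}")
for _ in range(r):
    pass                                   # placeholder for oracle + diffuser
```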
So, just to see how many
rounds I'm actually doing
and what my marked elements are,
I'm then just printing the information.
That's not important,
it's just for me to see
what I'm actually doing.
I'm trying to find these marked elements,
and I'm doing it by Grover's
algorithm with R rounds.
Then, in the end, I am applying--
No, sorry not in the end.
Then I start my quantum algorithm
by preparing the state, S,
the superposition state.
So, I just created the
quantum circuit with N qubits
so they're all in state zero and
I need to apply a Hadamard gate
to all of them to get the state, S.
Once I did that, we have R rounds,
so this is just a dummy index, which
means I don't need it.
I don't need to use whatever index I'm using
to go through all the R rounds.
I'm just doing the same thing R times.
So I have R rounds in
which I append, first,
the phase oracle, where I,
you can see it had N, as I need to get N,
and then I need to get the
indices that I want to mark.
The indices is just the marked elements
which I also have to feed
to my Grover algorithm.
It's the elements that I'm looking for.
So I apply the phase oracle
with N marked and I
apply it to my N qubits.
So I need to specify that
as well if I want to append
that quantum circuit to
my original, up to date,
qc quantum circuit.
After I appended the phase oracle,
I append also the diffuser
so the one that is H, UF zero, H,
and also on my N qubits.
And all of that I do R times.
In the end I do a
measurement on all N qubits
and store it in all N classical bits.
And then I return this quantum circuit.
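The whole procedure can be imitated on a state vector without any quantum SDK. This is a NumPy stand-in for the circuit described above, not the Qiskit code from the lecture; the function name `grover_probabilities` is invented, the diffuser is applied as a reflection about the average, and the final measurement is imitated by sampling 10,000 shots with a fixed seed.

```python
import numpy as np

def grover_probabilities(n, marked):
    """Simulate Grover's algorithm on a state vector; return rounds and outcome probabilities."""
    N = 2 ** n
    M = len(marked)
    # Number of rounds from the formula in the lecture, rounded to an integer.
    r = int(round(np.pi / (4 * np.arcsin(np.sqrt(M / N))) - 0.5))
    # Hadamards on |0...0> give the equal superposition |s>.
    state = np.full(N, 1 / np.sqrt(N))
    for _ in range(r):
        state[marked] *= -1                 # phase oracle on the marked elements
        state = 2 * state.mean() - state    # diffuser (up to a global sign)
    return r, state ** 2

n, marked = 5, [2]                          # five qubits, hypothetical marked element 2
r, probs = grover_probabilities(n, marked)

# Imitate the measurement: sample 10,000 shots from the outcome distribution.
rng = np.random.default_rng(seed=0)
counts = rng.multinomial(10_000, probs / probs.sum())

print("rounds:", r)
print("most frequent outcome:", format(int(counts.argmax()), f"0{n}b"))
```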
So now I can run that.
And let's choose, for
example, N equal to 5.
Then X is--
now my marked element, I just
want to choose a random number
between zero, and 2 to the N minus 1.
So, this is how I do that.
I don't need to give it a specific one
but I just choose a random one,
and then I run Grover
and I draw the circuit.
Okay, so now you can see the
information that I've printed
so I have five qubits.
The basis state that
I've marked is state two.
And this means that I
need to run four rounds
given five qubits and one marked element.
So, we see we have Hadamards on all qubits,
then oracle diffuser,
oracle diffuser, four times.
And then the measurement in the end
and then we will get the outcome.
So this is just as expected.
Now let's see what happens if
we run it on our simulator.
So, again,
as we did with the
Deutsch-Jozsa algorithm,
we choose the QASM simulator
to be our back end before
doing it on an actual device.
Then we...
Then we run the whole thing
by using the execute command,
we execute the quantum circuit
that we just constructed
using our simulator as back end,
and with 10,000 shots.
We'll see in a second why
I'm using 10,000 shots here.
Because then I'm getting
the results and the counts,
and as before I plot the counts.
I plot them, yeah I plot
the counts on a histogram.
So, let's run that and see what happens.
So, what we can see
here is the probability
to get the state, two,
which is the one that we wanted, right?
We saw here that the random element
that was chosen was element two.
And we can see this is
the binary description
of the number, two.
And it's 0.999%.
Oh, sorry, 99.9%.
By the way the reason why I
wanted to have 10,000 shots,
because if I would only
choose a thousand shots
it might have just showed
you that it's probability 1.
I wanted to show you that
it's not probability 1,
and this is not because we have any noise
because we're using the
simulator so there is no noise.
But, the reason that we
have not probability 1
but still a very high probability
is that if I compute
R without rounding it,
if I just look at the formula,
then I don't get exactly
an integer.
I'm getting 3.92
so then this is rounded to 4.
But 3.92 is already quite close to 4,
so in the worst case you
would get something like 3.5
and then the error would be
as large as 1 over 2 to the n,
which is the error that
we calculated before,
the failure probability
that we would have in the worst case.
But in our case, as we have
3.92 is already very close to 4
but it's not exactly 4.
If it's exactly 4, we
would get exactly 1 here.
But since it's 3.9 we get
something still very high.
So we can see that the success
probability is actually pretty high.
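The 99.9% figure can be reproduced from the rotation picture. This sketch assumes the standard result that after r rounds the success probability is sin squared of (2r + 1) times theta over two, with sine of theta over two equal to the square root of M over N; the variable names are chosen for this illustration.

```python
import math

n, M = 5, 1
N = 2 ** n
half_theta = math.asin(math.sqrt(M / N))     # sin(theta/2) = sqrt(M/N)

r_exact = math.pi / (4 * half_theta) - 0.5   # unrounded number of rounds
r = int(round(r_exact))                      # what the circuit actually uses

p_success = math.sin((2 * r + 1) * half_theta) ** 2
expected_misses = 10_000 * (1 - p_success)   # wrong outcomes expected in 10,000 shots

print(f"R exact = {r_exact:.2f}, R used = {r}")
print(f"success probability = {p_success:.4f}")
print(f"expected wrong outcomes in 10,000 shots = {expected_misses:.1f}")
```

With only a thousand shots, fewer than one wrong outcome would be expected on average, which is why the histogram could easily show probability 1.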
Now we can run the circuit
as well on our back end.
But actually I'm not sure
that I want to do that
because I already ran it before.
I can run it now.
I'm choosing
that we have to decrease
the number of qubits
so that the noise is not too high.
And we can now see here,
job is being validated.
But I also ran it actually
before to show you the results,
and the results that we got here was that,
and you can see there is quite some noise.
But you can still identify
what the two marked
elements were in this case.
And the case where I ran it before,
we got the two marked
elements, one and two.
So, the binary description of one and two.
While the others also
had some probabilities,
but first of all it's because
probably we had to round R a bit
but then also
because we have noise on
the (indistinct) computer
and we have to apply a
lot of CNOT gates.
Yeah.
Okay, that's actually
all I wanted to show you.
