- How does the bond length get measured?
- Well, I had a picture of this; let's go through it quickly. X-ray diffraction is a very common tool. And these are images that you'll see of junctions; this is an image from transmission electron microscopy. So you can estimate the lattice constant and related quantities from microscopy techniques like this.
- Okay. And is there any metric that can assist us in identifying whether the ansatz will be effective? That seems to be an upvoted question.
- Well, I assume you mean not the final answer, right? You mean something that can tell you beforehand.
- That's actually a very hard question. If someone had that ability, it would be very, very useful, but the short answer is no. You could have some intuition, based on the structure of the trial state, about whether it's going to get you even close to the ground state or not. But in general, there isn't an easy way to identify beforehand that a given trial state is going to give a great answer with low circuit depth and the small number of parameters you're asking for. (indistinct) Okay.
Let's get back into the next part of the talk.
Okay. So, putting it all together: there were also questions on whether we can see demonstrations of how this works. So let me just take you through an example of how all of this looks, right?
So this is an example of a six-qubit Hamiltonian. We don't need to get into the details of what the Hamiltonian is or what it means. And then this was a depth-one circuit that was used for the trial state, which meant there were a total of 30 variational parameters: all those different theta angles in the single-qubit rotations.
So that's what you see in this panel here: at every iteration along the optimization, you have all these different theta angles being updated, right? You start out at some initial point, and then you have these blue and red points, which basically represent your SPSA optimization; this is one blue point, and this is one red point. Okay? So at every iteration you're approximating the gradient using just two measurements.
There were also questions I saw on the choice of optimizer. Does it have to be gradient descent? It doesn't; it can be whatever, you know, your favorite optimizer, whatever works well for the experiment. This is an example of an experimental run where SPSA was particularly useful, because you have to deal with stochastic noise arising from the finite number of shots and so on.
So at every iteration you have one red and one blue point, you approximate the gradient using those measurements, and you choose a new direction to go, as I showed a few slides earlier.
And finally you're going to flatten out at some region; it won't go any lower, and that's going to be your final energy estimate. Okay?
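As a rough illustration of the SPSA update just described, here is a minimal sketch; the `energy` function is a classical stand-in for the measured energy, and the step sizes `a` and `c` are hypothetical choices.

```python
import numpy as np

def spsa_step(energy, theta, a=0.1, c=0.1, rng=np.random.default_rng(0)):
    """One SPSA iteration: approximate the gradient with just two
    energy evaluations (the 'blue' and 'red' points), then descend."""
    # Random +/-1 perturbation applied to every parameter at once
    delta = rng.choice([-1.0, 1.0], size=theta.shape)
    e_plus = energy(theta + c * delta)    # one noisy measurement
    e_minus = energy(theta - c * delta)   # the other noisy measurement
    # Simultaneous estimate of all partial derivatives
    grad = (e_plus - e_minus) / (2 * c) * delta
    return theta - a * grad

# Toy usage: 30 variational parameters, a classical stand-in objective
energy = lambda t: float(np.sum(np.cos(t)))
theta = np.ones(30)
for _ in range(200):
    theta = spsa_step(energy, theta)
```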
And then I'll just re-emphasize once again: these are all the different theta angles, which are being optimized simultaneously at every iteration.
Okay. So this is an example of VQE in action. Let's focus in specifically on the example of hydrogen, considering just the 1s orbitals. So for the two atoms in the molecule, you have two 1s orbitals, and they're spin degenerate, which gives you a total of four spin orbitals. However, you can identify some symmetries of the Hamiltonian and get rid of two, and that reduces the problem to just a two-qubit problem.
You then essentially work with Hamiltonians of this form after some choice of mapping; I believe it was the parity mapping that was employed for this example. And then you have all these different coefficients. Okay.
And this is stuff that Antonio explained yesterday: how these coefficients can be calculated efficiently, classically. They are basically functions of the interatomic distance, right? So as I go from an equilibrium distance of 0.73 Angstrom to four Angstrom, the Pauli strings stay the same; all that changes is these coefficients. So for every interatomic distance, you have a different Hamiltonian that you have to optimize.
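To make this concrete, here is a minimal sketch of such a Hamiltonian using Qiskit's `SparsePauliOp`; the five Pauli strings match the structure discussed here, but the coefficient values below are purely illustrative placeholders, not the actual computed ones.

```python
from qiskit.quantum_info import SparsePauliOp

# The Pauli strings stay fixed; only the coefficients change with the
# interatomic distance (they are computed classically beforehand).
PAULIS = ["II", "IZ", "ZI", "ZZ", "XX"]

def h2_hamiltonian(coeffs):
    return SparsePauliOp(PAULIS, coeffs)

# Illustrative placeholder coefficients for one distance
H = h2_hamiltonian([-1.05, 0.39, -0.39, -0.01, 0.18])
```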
So here is also an example of experimental data. This, in blue, is a run of the four-Angstrom Hamiltonian, and in red, the equilibrium Hamiltonian. Each of these is going to converge to some energy, and that's going to be the final energy.
I'll just also emphasize here that this Hamiltonian is the electronic Hamiltonian after the Born-Oppenheimer approximation. So this is only giving you the electronic energy; on top of this, you have to add a fixed nuclear energy that is also efficient to evaluate classically.
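For a diatomic molecule, that fixed nuclear term is just the Coulomb repulsion between the two nuclei; a minimal sketch in atomic units, assuming point charges:

```python
ANGSTROM_TO_BOHR = 1.8897259886

def nuclear_repulsion(z1, z2, r_angstrom):
    """Classical nuclear repulsion energy in Hartree:
    E_nn = Z1 * Z2 / R, with R in Bohr."""
    return z1 * z2 / (r_angstrom * ANGSTROM_TO_BOHR)

# H2 at the equilibrium distance: roughly 0.72 Hartree
print(nuclear_repulsion(1, 1, 0.73))
```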
All right? So then, basically, by doing this optimization for a bunch of different interatomic spacings, you can generate curves like this. The red dotted line is basically what the exact energies are. And this is also one of the exercises you have: how do you estimate the exact energy when you're given a Hamiltonian like this?
Essentially, all of these terms are basically Pauli matrices, so you have a matrix representation of the Hamiltonian. You can do an exact diagonalization and find its eigenvalues; the lowest eigenvalue is the ground-state energy.
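A minimal sketch of that exercise, reusing the illustrative H2-style operator from above: build the matrix representation and take the lowest eigenvalue.

```python
import numpy as np
from qiskit.quantum_info import SparsePauliOp

H = SparsePauliOp(["II", "IZ", "ZI", "ZZ", "XX"],
                  [-1.05, 0.39, -0.39, -0.01, 0.18])  # placeholder coefficients

# eigvalsh returns eigenvalues in ascending order, so the first
# entry is the exact ground-state energy of this qubit Hamiltonian.
eigenvalues = np.linalg.eigvalsh(H.to_matrix())
print("exact ground-state energy:", eigenvalues[0])
```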
So that can be done to generate the red curve, and these black circles basically represent the experimental data points, what came out from the experiment. And this experiment seemed to work quite well, which is very promising, although it's a very, very simple problem. Now, trying to go to the next step: what happens going from hydrogen to lithium hydride, which is now a four-qubit problem?
Things grow pretty quickly in the number of Pauli strings. What was just five Pauli terms, or five Pauli strings, now becomes about a hundred Pauli strings, which can mean many more circuits to run, with many more post-rotation settings, per energy evaluation. In the hydrogen case, you know, the II, ZI, IZ, ZZ can all be grouped into one set, for the reasons I mentioned before, so they only require one set of post-rotations, and then the XX is its own set. So you have only two sets. And for lithium hydride, you can similarly reduce the number of measurements a bit by grouping those hundred terms into 25 sets. Okay, this is for our chosen representation of a four-qubit problem. With that, you can then try and run the optimization for lithium hydride as well.
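A minimal sketch of the qubit-wise grouping idea (a simple greedy pass, with no claim of producing the smallest possible number of sets):

```python
def qubitwise_commute(p, q):
    """Two Pauli strings share one measurement setting if, on every
    qubit, they agree or at least one of them is the identity."""
    return all(a == b or "I" in (a, b) for a, b in zip(p, q))

def group_paulis(paulis):
    """Greedy grouping into sets that each need only one layer
    of post-rotations."""
    groups = []
    for p in paulis:
        for g in groups:
            if all(qubitwise_commute(p, q) for q in g):
                g.append(p)
                break
        else:
            groups.append([p])
    return groups

# The five hydrogen terms collapse into just two settings:
print(group_paulis(["II", "IZ", "ZI", "ZZ", "XX"]))
# -> [['II', 'IZ', 'ZI', 'ZZ'], ['XX']]
```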
And there's a lot going on here.
So let me walk through
this very, very slowly.
Okay. So on the left is the example of hydrogen, and this was an experiment run back then on a seven-qubit device. To connect this back to what I was saying about how you choose which subset of qubits, which pair, to use for the experiment or for your hardware run: you basically look at all the information you have about the device, the quality of the coherence times and the gates and so on.
Okay. So the hydrogen example, as we saw, worked quite well. The lithium hydride seems to deviate quite a lot from this dotted line, which is what the correct answer should be, right? And, you know, particularly interesting are points like these here, with pretty big error. That bump is pretty interesting right there.
Okay. And this comes back to the question: what's the point of running on noisy hardware, why not just wait for fault tolerance? It's because this gives you deep insights into what is limiting us. Understanding the sources of error, when you're addressing a very standard problem with a completely new paradigm of computing, is very instructive. These are two-qubit, four-qubit, six-qubit Hamiltonians; these are matrices that one can trivially diagonalize on a laptop, right? Knowing the answer is very easy for these problems. The point of running them on noisy quantum devices is to understand the limitations better. All right?
So what you see in these shaded regions here are basically simulations of the noisy experiment. This is also something you can do in Qiskit, where you have access to noise simulators. You have different models you can include, such as models that account for T1 and T2 errors and so on.
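For example, here is a sketch of a simple T1/T2 noise model with Qiskit Aer; the T1, T2, and gate-time values are hypothetical device numbers.

```python
from qiskit_aer import AerSimulator
from qiskit_aer.noise import NoiseModel, thermal_relaxation_error

t1, t2 = 100e-6, 80e-6          # hypothetical relaxation times (seconds)

# Thermal-relaxation errors for 50 ns single-qubit and 300 ns two-qubit gates
err_1q = thermal_relaxation_error(t1, t2, 50e-9)
err_2q = thermal_relaxation_error(t1, t2, 300e-9).tensor(
    thermal_relaxation_error(t1, t2, 300e-9))

noise_model = NoiseModel()
noise_model.add_all_qubit_quantum_error(err_1q, ["sx", "x"])
noise_model.add_all_qubit_quantum_error(err_2q, ["cx"])

backend = AerSimulator(noise_model=noise_model)  # run circuits against this
```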
So one thing you can see from these numerical simulations: by the way, the color density plots represent a histogram of outcomes from a hundred different simulation runs. The black circle was just one experiment on hardware, and then all of these different shaded regions represent a hundred different numerical simulations of the noisy experiment. Okay?
And it captures some of the features quite well. You know, it captures this difference here; it captures this bump between two and three Angstrom. And so we can really begin to identify what some of the limitations are. Okay?
And you can see that, even in simulation, you have almost a spread in results, which means that the classical optimization isn't getting you to the same point every time, even in simulation; it can take you to different points. In fact, if you look at the curve on the right, you have almost a bimodal distribution: you have something peaked here, something peaked around there, a whole distribution of results, right? And this is on the top. The top distribution is clearly worse than the bottom, and it basically indicates that your classical optimization is getting stuck in a local minimum. Okay. So this is stuff you can see. All right.
And then another nice thing to focus in on: even the simulations of the noisy experiment seem to have this offset from the dotted line. I mean, the hardware experiment itself has its errors, but even the simulations have this offset. This is also a consequence of decoherence, of T1 and T2 errors. And these are all the kinds of things that I strongly encourage you to play with. The tools are all there in Qiskit to simulate different noise models and see how the different sources of noise affect the accuracy of these kinds of computations.
Connecting back to the question of how much circuit depth is enough, right? One big reason why this point is so far off is that, at that point, you actually need much more circuit depth in your hardware-efficient trial state to be able to access this energy. All of these different experiments used only depth one, meaning only one entangling block. So you needed more entangling blocks. Okay.
And this raises the question: well, why didn't we go for more entangling blocks, for more circuit depth? This is a slightly researchy slide, but I hope I can walk you through it as well. It's basically showing you the error in energy as a function of noise strength. Suppose the noise was very low. Then you could keep increasing your circuit depth very nicely, and you could get extremely low errors, errors lower than ten to the minus four; this is at depth eight, meaning eight entangling blocks. So when the noise is extremely weak, you can go to much longer circuit depths and have better approximations to the ground state.
However, when the noise is very large, you don't have the benefit of going to much longer circuit depths; a long circuit depth is actually worse than a trivial circuit depth. And this is the interplay: to be able to benefit from more circuit depth, you need your noise to be much smaller, but if your noise is very large, you're better off operating at limited circuit depth. Okay. So understanding all of this is an extremely important game in this era of noisy quantum devices.
All right. Okay. Maybe I can
pause and see if there are
any questions related to
what we've discussed so far.
- Yeah, can you hear me now?
Alright that's right (indistinct)
- So I see a question on how lithium hydride is a four-qubit problem. That's a good question. You have the 1s for the hydrogen, and then you have the 1s, 2s, and 2p orbitals for lithium, and you also have a plethora of higher orbitals, but you can basically restrict yourself to how far you want to go. And then there are the ways I briefly mentioned to reduce the number of qubits.
So first, you can remove, say, two qubits based on some symmetries in the Hamiltonian. And then, after that, you can freeze some of the core orbitals, for instance the 1s of the lithium, because it isn't very strongly interacting with the other orbitals. This is an approximation, but for simplicity, we've made use of freezing core orbitals, and then of some symmetries, to reduce the problem to one with just four qubits.
- Awesome. Thank you.
(indistinct)Are you able to hear me now?
- How can we measure the Pauli strings in groups on IBM machines? Does that mean we perform gate operations after measurement? No, you apply those operations after the state preparation. Okay. So after the circuit that you use for the state preparation, you'd then have a layer of post-rotations that you have to apply, depending on what that Pauli string is.
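A minimal sketch of that post-rotation layer: rotate each qubit into the Z basis according to its letter in the Pauli string, then measure. The helper name and the qubit-ordering convention here are my own.

```python
from qiskit import QuantumCircuit

def add_post_rotations(circuit, pauli_string):
    """Append basis changes so a Z-basis measurement gives samples
    of the given Pauli string. Assumes the string's last character
    refers to qubit 0 (Qiskit's little-endian convention)."""
    for qubit, p in enumerate(reversed(pauli_string)):
        if p == "X":
            circuit.h(qubit)           # X basis -> Z basis
        elif p == "Y":
            circuit.sdg(qubit)         # Y basis -> Z basis
            circuit.h(qubit)
    circuit.measure_all()
    return circuit

qc = QuantumCircuit(2)
# ... trial-state preparation circuit goes here ...
add_post_rotations(qc, "XX")
```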
Okay. What defines the number of qubits required to simulate a molecule? Yeah, that's an excellent question; I think I briefly mentioned this. It relates to the number of orbitals that you consider in the molecular simulation. So in general, if you had N spin orbitals, you would need N qubits. In the case of hydrogen, for instance, we considered four spin orbitals, and in general that would mean four qubits, but you can get rid of at least two by identifying some properties of the (indistinct). And then you can also make more approximations, like what I was referring to: freezing of core electronic orbitals and so on.
How many qubits are required to simulate lithium sulfide, which is a promising material for batteries? Yes, there is actually some work on this that you can find on the arXiv, which you might find interesting. Once again, it really depends on the number of orbitals that you want to consider and the amount of approximation you make there. Okay.
Yes, the lithium hydride bump is indicative that the state you prepared with that limited circuit depth is not capturing all aspects of the exact ground state. Okay. I think that's hopefully good for now.
Let me move to the final part of my talk, which I will start with a short story: the longitude problem. Perhaps many of you aren't familiar with this, but it was one of the most iconic engineering problems of the 18th century. Back then, a lot of big business, trade, was done through ships, and these ships needed to know where they were; they needed to know their geographic location in terms of longitude and latitude.
Longitudes are the lines that run from the North Pole to the South Pole, right? And the way they determined longitude was with a reference clock. Say these ships were leaving England: they had a clock that was set to England time. Then, when they were at sea, they would figure out what the local noon time was, from, say, the position of the sun, compare that to the reference time they had from when they left shore in England, and use that time difference to evaluate what the longitude was. Okay.
And this was extremely important, because all these reference clocks failed. And why was that? Because they were big mechanical objects, like what you have in this figure, which were extremely sensitive to perturbations from the environment: changes in humidity, the rocking of the ship, changes in temperature, and so on. This was a big, big challenge, because imprecise knowledge of the longitude led to shipwrecks, failed business, and so on. It was a huge problem, with a great amount of prize money, even back in the day.
So this was the first iteration of the clock that this great inventor, Harrison, made in 1730. You can see it has a whole bunch of different mechanical moving parts. There were actually some funny solutions that people tried to come up with, including one involving barking dogs; look this up if you're ever interested. He worked through many different iterations over a few decades. This was the second iteration, and then his third. And in the third iteration, he came up with a device that offered robustness against temperature fluctuations, which basically led to the invention of the bimetallic strip. So the pursuit of this one big problem, the longitude problem, led to other technologies. Okay.
And finally, by understanding all the different error mechanisms affecting the performance of a clock that has to work at sea, he came up with this pocket clock, the marine chronometer. And this was hugely transformational. It was really a journey over many decades that finally paid off because of an understanding of the many different errors, and of figuring out methods to fix them.
Okay. And I find that the story connects quite well to what we're doing with quantum computing today. I must credit this to Professor Simon Benjamin from Oxford; I've kind of heard the story down the grapevine, but the analogy works really well: trying to understand the errors of noisy devices, and maybe then coming up with tricks to try and fix them.
Okay. And with that, let me get into error mitigation. The idea here is basically the following: we saw that these different expectation values, like the energy, for instance, are affected by noise; affected by, say, T1 and T2. So we can represent that noise by some parameter lambda. And for a reasonably small noise parameter lambda, you can essentially express this expectation value as a Taylor series in lambda around its zero-noise limit. So we're going to call this the zero-noise limit. And the question we're asking is: with a noisy device and noisy measurements, is it possible to extract what the result would have been in the absence of noise? Okay.
I mean, in the context of our regular day-to-day devices, this isn't a completely unfamiliar concept, right? Noise-canceling headphones, for instance, make use of noise in a slightly different way, but the idea that one could use noise to nullify its own effect is not that (indistinct) idea, right?
Specifically, in this context, assume that as an experimentalist I had the ability to control my T1 and T2 precisely. So what was a hundred microseconds, I've been able to make 50 microseconds. That means, for instance, I've been able to scale the strength of the noise by some factor c. Okay.
So, expressing this only to second order in lambda: you have E*, the noise-free value, which is what we want to access; then a linear term in lambda; and then higher-order terms. When you amplify the noise strength by a factor of c, you once again have E*, but now the linear term is c times lambda, and then again higher-order terms. In other words, E(lambda) = E* + a1*lambda + O(lambda^2), and E(c*lambda) = E* + a1*c*lambda + O(lambda^2). Now, by combining these two noisy measurements using a linear extrapolation, [c*E(lambda) - E(c*lambda)] / (c - 1), what you can see is that you now have an estimate where the leading-order noise term is of order lambda squared. Okay. So by two wrongs, you've been able to make something that's more right; you've been able to get closer access to E*.
And in general, this idea is called Richardson extrapolation, or zero-noise extrapolation: with n noisy measurements, you can suppress the noise terms up to order lambda to the n.
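A minimal sketch of that extrapolation with a simple polynomial fit; the noise-scale factors and energy values below are made-up numbers for illustration.

```python
import numpy as np

def zero_noise_extrapolate(scales, energies, order=1):
    """Fit the measured expectation value as a polynomial in the
    noise-scale factor c, then evaluate the fit at c = 0."""
    coeffs = np.polyfit(scales, energies, deg=order)
    return np.polyval(coeffs, 0.0)

# Two hypothetical noisy measurements: unscaled (c=1) and doubled (c=2)
print(zero_noise_extrapolate([1, 2], [-1.10, -1.05]))   # -> -1.15
```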
The question, though, is how you accurately scale the strength of the noise. This is not a trivial task. You can't open up Qiskit, or go up to the Quantum Experience, and say: let me change the T1 and T2 times. That's not possible.
So you can play one more trick if you have something like pulse-level control. Okay. Say you had a ten-nanosecond pi pulse, and say that's represented by this. Assuming that the noise does not change over time, let's stretch that same pi pulse out to a twenty-nanosecond pi pulse, and not just that pulse: let's stretch every other pulse in the circuit by the same factor. Then the result, the expectation value, the mean value that you measure after this circuit, is going to have its noise amplified by a factor of two.
So, basically, as I showed on the previous slide: you have one noisy estimate, you amplify the noise so it gets even worse, and then you can extrapolate down to what the expectation value would have been in the absence of noise, to some order. And so this can be a very powerful technique to extract meaningful computations even from noisy hardware.
So this is a really nice visualization of experimental data, by the way. This is experimental data where you try to operate a circuit that takes your qubit along a trajectory on the surface of the Bloch sphere, from the ground state to the excited state, right? You want this to stay on the Bloch sphere, but because of noise it doesn't quite get up to one, and you also see it compressed towards the center; that's also because of noise. Now, if you amplify the noise by a factor of two, which is what I'm doing here, then you can see that you're even further away; you only get about midway with this experiment. But by combining the red and the green, you can generate what the trajectory would have been in the absence of noise. And this now maps beautifully onto what the trajectory should have been: it's going right along the Bloch sphere, approaching one.
Okay, this is an example of error mitigation on one qubit. But you can also revisit some of the problems I spoke about before our short break, right? The lithium hydride simulation, for instance: what was preventing us from going to larger depths was the effect of noise on the circuit. But now we seem to have a trick up our sleeve that can nullify the effect of this noise to some extent, right? So that lets us go to a much longer circuit depth.
Okay? So these represent the raw experimental points, which are measured at four different noise factors. And if you compare these to the earlier data, the raw points are actually even worse. Why are they worse? Because the depth of the circuit is longer. But because you have these additional noisy measurements, you are able to extrapolate down to what the result would have been in the absence of noise, and now you see these white curves line up much better with the green exact energies. So this is an important flavor of techniques and experiments that can be useful for performing computations even on noisy hardware.
I'll just emphasize that these techniques really only benefit mean values, expectation values. This isn't something you can do for single-shot measurements, which are relevant for some other algorithms, like phase estimation, that you might have learned about in the past. But for problems where you rely on expectation values, and this is not a small set of problems that people are interested in: you might be interested in simulating energies of molecular systems or of systems in condensed-matter physics, or in simulating the magnetization of some complex frustrated magnet. All of these are examples of problems that rely on expectation values. In machine learning, people talk of kernel methods, and that also translates into an expectation value. So there is a whole class of problems that can benefit from techniques like this.
So with that, let me just finally summarize what we've covered today. I started out with a brief hardware recap and then discussed what kinds of noise one has to deal with in our hardware. The motivation behind this was to connect it to the performance of algorithms that are geared towards noisy hardware, like the variational quantum eigensolver. We looked at this in great detail in the context of molecular simulation, lithium hydride being one example, and we saw how noise in the hardware affects the performance of the molecular simulation. In practice you have these very large errors, but understanding those errors is an important stepping stone on the long road to fault tolerance and error correction. And in the meantime, this whole approach can help us devise techniques and tricks, like zero-noise extrapolation, which can enable access to accurate computations, at least for certain problems, even on noisy devices.
Okay. And over the last two days, while a lot of this has been focused on chemistry, the techniques in general have very wide applicability. You can address problems in nuclear physics, in condensed-matter physics and magnetism, and even beyond quantum simulation: in optimization, machine learning, finance, and so on.
It really starts with having to map your problem onto qubits. Once you have a qubit Hamiltonian, you can play this whole game of trying to find its ground-state energy using this variational approach. Okay?
With that, I want to thank you for your attention. I hope this was useful for all of you, and I hope it also motivates you to try and run experiments on the hardware, run experiments with noise simulators, get a sense of the limitations, try to develop tricks to mitigate the effect of these errors, and develop a general intuition for what it takes to run experiments on actual hardware. Okay?
