Hi everybody, so I will
start with the introduction.
I'm Dmitri Maslov.
I work at IBM as a research staff member.
Another name for my job
is chief software architect.
I work primarily in the area
of quantum circuit
synthesis and optimization.
This is something that I have been doing
for maybe 17 to 18 years,
something like that.
Meaning, I started before programmable
prototype quantum computers were available.
They have been
available since circa 2016,
which changed the landscape
quite significantly,
because now we have machines
that we can play around
with, program,
and see what it is that they can compute.
That's, I guess, all for the
introduction that I have,
and perhaps let us now go to the slides.
All right, perfect.
So I'm going to talk about quantum circuit
synthesis and optimization today.
And the scope of the talk is as follows.
Firstly, I would like to address
the minimal background
that is important to have
in order to be able to follow this talk.
I would like the audience to
be familiar with the basics
of quantum computation,
including the data structure,
knowing that the computation is done
by applying unitary transformations
to the data structure,
which is given by the state vectors,
and that the result is obtained
through measurement.
Then since this talk
concerns quantum circuits,
I would like the audience to be familiar
with the basic concepts of quantum
circuits, including gates
and the parameters of interest
in quantum circuits, such
as the number of gates,
the depth, the width of
the circuit, and so forth.
And maybe some basic
knowledge of algorithms,
to put things in scope,
because the circuits
should implement quantum algorithms,
useful quantum algorithms.
What my talk covers is
a selection of topics
relating to how to implement
quantum algorithms efficiently.
Specifically, suppose you want to run
a certain computation
on a quantum computer,
how would you go about creating a circuit
that implements it?
And along with it,
I will describe the mathematics
and the algorithms that
let us systematically
construct efficient quantum circuits.
And the target audience of this talk
is anyone interested in computing
using a quantum computer,
including graduate students
and even quantum computing experts,
because hopefully we'll present
some of the material from an angle
that some quantum computing
experts can benefit from.
All right, so let's start with the gates.
In classical computation,
we have only four gates.
And they're very simple gates.
There is a single-bit gate, the NOT gate,
then there are three two-input,
single-output gates:
Boolean AND, Boolean OR, and Boolean XOR.
And using these four gates,
we can implement
any classical computation whatsoever.
We don't need anything else
from the logic point of view.
However, in quantum computation,
when you look at quantum computers,
you may be overwhelmed
by the number of quantum gates there is.
For instance, there are the
Pauli-X, Y, and Z gates,
already three right there;
then there are Rx, Ry, and Rz,
with real-valued parameters.
So those are three
times infinitely many.
Then there is the phase gate,
the square root of NOT gate, the V gate,
and there is the T gate, the pi/8 gate.
There are Rz gates with angle
pi divided by two to the
n, where n is any integer.
Those are the gates used
to construct the quantum
Fourier transform.
And these are just the single-qubit gates
of which we only had one
in classical computing.
And then there are two-qubit gates
of which there are plenty,
and three-qubit gates,
including Toffoli, Fredkin,
Peres, and so forth.
So the number quickly gets overwhelming.
However, it is not as overwhelming as it seems.
So let me explain how to
construct the gate libraries
and hopefully shine some
intuition behind them.
We start with the NOT gate.
The NOT gate is a quantum gate
that literally inverts
the value of the variable.
It is analogous to the classical NOT gate,
if not to say it's identical
in its workings to the classical gate.
So let's next introduce a gate
that exists only quantumly,
but not classically.
That would be the Z
gate, the Pauli-Z gate.
So if you prepare a state
in the computational basis
and you apply the Z gate and
you measure it immediately,
you would think that there
was no computation done
because you would not be able
to detect what has happened.
You will get the identity out.
However, this gate does
not implement the identity.
What it does is it creates a phase,
and it is, in quantum computation,
it's the ability to compute
with phases that separates
quantum computation from
classical computation.
This is what gives you the advantage
over classical computation.
So next gate I would like to
introduce is the Hadamard gate.
And the purpose this Hadamard gate serves
is that of transforming
the X gate into the Z gate,
by conjugation on both sides,
and the other way around,
transforming the Z gate into the X gate.
It is a very, very important gate,
and I will use it a lot
in this presentation.
As far as the rest of
the gates are concerned,
we can obtain all of them
by applying the square root operation.
So given a single-qubit unitary U,
we can take its square root
and obtain a new operation.
For instance, the V gate
is the square root of X.
The P gate, the phase gate,
is the square root of Z.
And because of that,
we can transform the
phase gate into the V gate
and the V gate into the phase gate
by the Hadamard gate, because
what the Hadamard gate does
is transform the X
basis into the Z basis.
Again, very useful gate.
The T gate, for instance,
is the square root of the P gate,
meaning it is the fourth root of the Z gate,
or, quote-unquote, the eighth
root of the identity.
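These single-qubit relations can be checked numerically. Below is a minimal NumPy sketch (standard matrix conventions assumed) verifying that V squares to X, P to Z, T to P, and that the Hadamard conjugates X into Z and, consequently, the square root of X into the square root of Z:

```python
# Numerical check of the single-qubit relations above: V = sqrt(X),
# P = sqrt(Z), T = sqrt(P) (so T^8 = I), and the Hadamard conjugates the
# X basis into the Z basis, mapping X to Z and sqrt(X) to sqrt(Z).
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
V = np.array([[1 + 1j, 1 - 1j], [1 - 1j, 1 + 1j]], dtype=complex) / 2
P = np.diag([1, 1j])
T = np.diag([1, np.exp(1j * np.pi / 4)])

assert np.allclose(V @ V, X)      # V is the square root of NOT
assert np.allclose(P @ P, Z)      # the phase gate is the square root of Z
assert np.allclose(T @ T, P)      # T is the square root of P
assert np.allclose(H @ X @ H, Z)  # Hadamard maps X to Z...
assert np.allclose(H @ Z @ H, X)  # ...and Z to X
assert np.allclose(H @ V @ H, P)  # hence V and P transform into each other
```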
All right, suppose we're done
with the single-qubit gates.
The way to obtain a
two-qubit gate is to create
a block-diagonal matrix with
the entries identity and U,
where U is the single-qubit unitary.
For instance, this way,
we can create the CNOT
gate out of the NOT gate.
And we can create the Toffoli gate.
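This block-diagonal construction can be sketched in a few lines of NumPy (an illustrative sketch, with the control taken as the more significant qubit):

```python
# The block-diagonal construction diag(I, U) = controlled-U, sketched in
# NumPy (control taken as the more significant qubit). Applying it twice
# yields CNOT from NOT, and Toffoli from CNOT.
import numpy as np

def controlled(U):
    """Block-diagonal matrix diag(I, U), i.e. controlled-U."""
    d = U.shape[0]
    M = np.eye(2 * d, dtype=complex)
    M[d:, d:] = U
    return M

X = np.array([[0, 1], [1, 0]], dtype=complex)
CNOT = controlled(X)        # 4x4: flips the target when the control is 1
TOFFOLI = controlled(CNOT)  # 8x8: flips the target when both controls are 1

# As a permutation, Toffoli only swaps |110> and |111>:
perm = [int(np.argmax(np.abs(TOFFOLI[:, col]))) for col in range(8)]
assert perm == [0, 1, 2, 3, 4, 5, 7, 6]
```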
The Toffoli gate is important
because if we consider quantum
single- and two-qubit gates,
then in terms of Boolean logic,
we have only two gates that are Boolean:
the NOT and the CNOT.
And using these gates,
we're only able to compute
linear Boolean functions.
And this is not enough
for the purpose of classical computation.
So we would like to be able
to have a complete library,
and the way to have the complete library
is to add a Toffoli gate.
And the Toffoli gate
is a gate that computes
the product of two variables
and XORs it into the third variable.
It is important to note
that the Toffoli gate itself
is not an elementary gate,
it's a composite gate.
So whenever you talk
about quantum computation,
it is never the case, as far as I know,
that the Toffoli gate
is implemented directly.
It is implemented as a surrogate
with the elementary gates.
And the elementary gates are
single and two-qubit gates.
So let's talk about the libraries.
What constitutes the libraries?
First, in this slide here is the CNOT,
the circuits with the CNOT gates,
linear reversible circuits.
The second is a very important library
that consists of CNOT and
single-qubit rotations,
such as Rx, Ry, Rz,
and sometimes the roots of the CNOT.
This is the library of gates
used for the physical implementation
of quantum circuits on
currently existing near-term
quantum computers, such as, for instance,
superconducting quantum information
processing (QIP) circuits,
like IBM's quantum computers,
or trapped-ion quantum computers.
And these are the examples of
prototype quantum computers
that exist today, that
you can actually program.
Using the Hadamard, phase, and CNOT gates,
we obtain the Clifford
library, or Clifford circuits.
And it is an important subset
of the set of full unitaries.
Next, with T and CNOT,
we get modified linear reversible circuits
that play an important role,
and I will talk about them
later on in this talk.
If you consider a single-qubit
Hadamard and T gates,
you get arbitrary single-qubit
unitary by approximation.
And finally, if you add
the CNOT gate to H and T,
you get universal
fault-tolerant computers.
So there is a difference
between the physical level
computers and fault-tolerant computers.
So there are two important libraries.
One is a physical level library,
and the other is a universal
fault-tolerant library.
The fault-tolerant library is used
for fault-tolerant computations,
and fault tolerance
imposes limits
on the kinds of gates
that can be used, which, for example,
can give the library Clifford plus T,
but there may be other libraries.
But Clifford plus T is very popular.
Let's talk about the controls.
So in quantum computation,
controls can often be thought
of as classical controls.
There is the same intuition.
However, this intuition has flaws.
For instance, if you want to apply
a controlled unitary U to a qubit
that resides in an eigenstate of U,
then this gives a phase
on the controlling qubit.
And this phase,
even though immediately undetectable,
can later turn into a bit flip,
if for instance, affected
by the Hadamard gate.
So therefore, there is a good reason
to think of controls
as classical controls,
but you have to be careful.
So another example showing
that controls cannot be fully interpreted
as classical controls
is the circuit identity
that I have on this slide,
where a controlled-Z equals Z-controlled.
So this is indeed the case
if you consider the
matrices on the bottom.
The one on the left hand
side implements controlled-U.
The one on the right hand
side implements U-controlled.
And you can see that the two are equal
if and only if U(1,1) equals one,
U(1,2) and U(2,1) are equal to zero,
and U(2,2) is anything,
so long as, of course,
the matrix is unitary,
meaning it can be an arbitrary
complex number of length one,
of which there are infinitely many.
And negative one is an
example of such a number.
The negative one gives the Z gate.
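A quick numerical check of this identity (with the usual matrix conventions): conjugating by SWAP exchanges the roles of control and target, and this leaves controlled-Z unchanged, but not CNOT:

```python
# Numerical check that controlled-Z equals Z-controlled: conjugation by
# SWAP exchanges control and target; CZ is invariant under it, CNOT is not.
import numpy as np

SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)
CZ = np.diag([1, 1, 1, -1]).astype(complex)
CX = np.array([[1, 0, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 0, 1],
               [0, 0, 1, 0]], dtype=complex)

assert np.allclose(SWAP @ CZ @ SWAP, CZ)      # controlled-Z = Z-controlled
assert not np.allclose(SWAP @ CX @ SWAP, CX)  # not true for CNOT
```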
Okay, let's talk more about the controls,
but this time, let's start
with something simple
that we know from Boolean logic.
In particular, this
expression from Boolean logic
is very popular in Boolean
logic optimization.
where we can, for instance, take x-bar
and replace it with x exclusive-OR one,
or the other way around; or
if we have x plus x-bar,
we replace it with the constant one.
The way we can write this
circuit identity in quantum
is we use the top qubit and
say that it is the variable x
and the bottom wire is
a qubit that resides in the value zero.
And then what happens
is, these three gates,
they compute x-bar, this gate computes x,
and this gate computes the constant one.
So that's all that this circuit does.
It computes the identity
and it computes the identity
based on a very well-known
classical logic rule.
Using this circuit, we can add a control.
And this circuit that you
see on this slide right now,
it also computes the identity.
We can even add more controls like this,
and we still compute the identity.
So let's take a look at
this circuit a little more.
Let's create a circuit identity
where this circuit that we
know computes the identity,
and then the one on the right hand side
is the doubly-controlled-X squared.
By the very definition of
what a controlled gate is
and what the X gate is,
and since X squared is the identity,
the controlled-controlled-X
squared is, well,
the controlled-controlled
identity, which is the identity.
So we have identity on the left hand side,
we have the identity
on the right hand side.
The next thing we do is take
the square root of the gates on the bottom qubit.
And when we do that, the X
turns into the square root of X,
the X squared turns into X,
and we have this circuit identity,
so we obtain the Toffoli.
Well, this is not the correct
way of obtaining the Toffoli,
but it is almost correct.
And perhaps it illustrates
how to really obtain the Toffoli gate
using the gate library of
physical-level, near-term computers.
So specifically, to obtain it formally,
we use the mixed arithmetic expression
that expresses the product
of two variables, a and b,
as the sum of a divided by two,
plus b divided by two, minus
a-exclusive-OR-b divided by two.
So the division by two here
means that rather than the negation,
we have the root of the negation.
The exclusive OR, of course,
is the operation that
applies to Boolean numbers.
It is a Boolean operation.
That's why this expression is
a mixed arithmetic expression.
So the rest are operations over,
let's call it, real numbers,
or maybe integers, if we
multiply everything by two
and get the expression: two times a times b
equals a plus b minus a-exclusive-OR-b.
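This identity is easy to verify exhaustively over Boolean inputs:

```python
# Exhaustive check of the mixed arithmetic identity 2ab = a + b - (a XOR b)
# over all Boolean inputs.
for a in (0, 1):
    for b in (0, 1):
        assert 2 * a * b == a + b - (a ^ b)
```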
So using this expression,
we can actually construct
this very circuit
that implements the Toffoli gate.
Indeed, to construct this
term, a divided by two,
suppose that our controls are a and b.
What we do is apply this
gate, the controlled-V.
What it does is apply
the square root of NOT if a equals one.
That is, it applies a divided
by two, half the NOT,
in case a equals one.
Then b divided by two is
implemented by this gate.
And finally, negative
a-exclusive-OR-b divided by two
is implemented by the
set of these three gates.
Why?
Because the control of
it is a exclusive OR b.
So a exclusive OR b.
The dagger here tells us
that the sign is negative,
and divided by two means V,
because V is the square root.
So this way, we get the Toffoli gate.
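This construction can be verified numerically. The sketch below builds the five-gate controlled-V circuit as dense matrices (qubit 0 taken as the most significant bit; the gate ordering shown is one consistent choice, since the controlled-V gates commute with one another) and checks that the product equals the Toffoli gate:

```python
# Dense-matrix check of the five-gate controlled-V construction of Toffoli.
# The target accumulates V^(a + b - (a XOR b)) = V^(2ab) = X^(ab).
import numpy as np

V = np.array([[1 + 1j, 1 - 1j], [1 - 1j, 1 + 1j]], dtype=complex) / 2  # sqrt(X)
X = np.array([[0, 1], [1, 0]], dtype=complex)

def ctrl(n, c, t, U):
    """Controlled-U on n qubits, control c, target t (qubit 0 = MSB)."""
    dim = 2 ** n
    M = np.zeros((dim, dim), dtype=complex)
    for col in range(dim):
        bits = [(col >> (n - 1 - q)) & 1 for q in range(n)]
        if bits[c] == 0:
            M[col, col] = 1
        else:
            for new in (0, 1):
                row_bits = list(bits)
                row_bits[t] = new
                row = sum(b << (n - 1 - q) for q, b in enumerate(row_bits))
                M[row, col] += U[new, bits[t]]
    return M

CVac = ctrl(3, 0, 2, V)              # controlled-V, control a, target c
CVbc = ctrl(3, 1, 2, V)              # controlled-V, control b, target c
CVbc_dag = ctrl(3, 1, 2, V.conj().T)
CXab = ctrl(3, 0, 1, X)              # CNOT, control a, target b

# Gates applied left to right: CV(b;c), CNOT(a;b), CV-dagger(b;c),
# CNOT(a;b), CV(a;c); as matrices, they multiply right to left.
circuit = CVac @ CXab @ CVbc_dag @ CXab @ CVbc

TOFFOLI = np.eye(8, dtype=complex)
TOFFOLI[[6, 7]] = TOFFOLI[[7, 6]]    # swaps |110> and |111>
assert np.allclose(circuit, TOFFOLI)
```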
It is important to note
and to stress at this point
that the depth of this
implementation is five.
And it has recently,
maybe five years ago,
been shown that this depth is minimal.
It was proved either mathematically
or by a computer search,
I don't exactly remember which.
But it is important to know
that this has been
proved, and the depth is five.
Let's just remember this number for now.
So next, let's consider how to construct
the Clifford plus T implementation
of the Toffoli gate.
Clifford plus T, recall, is a library
that we use for fault-tolerant
quantum computations.
And it's a different library,
so we will have a different circuit.
But the question is,
is the intuition and the
mathematics that we use
to construct an implementation
of the Toffoli gate
in the fault-tolerant library different
from the one we just used
in order to construct
the implementation of the Toffoli
for physical level computations?
I claim it isn't, I claim it
is the same exact intuition.
And here is why.
In order to construct the Toffoli gate,
we first use the magical Hadamard gates
to map the X into the Z.
Now, what we need to
implement and what we focus on
is the controlled-controlled-Z.
And the controlled-controlled-Z,
it applies the transformation
that multiplies
the ket by negative one to
the power a times b times c.
And remember how previously we used
a mixed arithmetic expression
for a times b times c?
We will do it again.
Before we do it, however,
let me say that if you consider
the CNOT plus T circuits,
and these are the gates that we will use
to implement the
controlled-controlled-Z operation,
then a circuit with
CNOT and T gates can be
described as follows.
So firstly, it computes a
linear reversible function G.
And secondly, alongside
this linear reversible function,
it computes a set of phases,
as described by w to the power t,
where w is the eighth root of unity,
a complex-valued number,
and t is a mixed arithmetic expression.
Again, notice this mixed arithmetic
that I'm trying to stress.
And the mixed arithmetic
is: we take linear functions
and multiply them by constants a_i,
where those constants a_i
are integers modulo eight.
All right, let us now construct
the CNOT plus T circuit
that implements controlled-controlled-Z.
The way we do it is we know
that using the CNOT gates,
we can obtain a Boolean
variables, a, b, c,
exclusive OR of a and b,
exclusive OR of b and c,
exclusive OR of a and c,
and exclusive OR of a, b, c.
So these are the things we can
compute using the CNOT gates.
And using the T gates, we can
apply the T to variable a,
we can apply T to variable b,
to variable c, and so forth.
So this is what we can compute
using CNOT and T circuits.
What we can do is notice
that four times a times b times c
can be expressed by
the same mixed
arithmetic decomposition,
but applied to three variables
rather than two.
And you can see that it is indeed true.
So for instance, let's verify.
Let's take some row in the table
and convince ourselves that
taking the linear combination
of T of a with a positive sign,
T of b with a positive sign,
T of c with the positive sign,
then T of a plus b with a negative sign,
T of b plus c with a negative sign,
T of a plus c with a negative sign,
and T of a plus b plus
c with a positive sign,
indeed gives us the four times a b c.
In other words, it gives us this column.
And the reason why we want
to have four times a b c
is because T is the eighth root of unity.
So to get to the square root of unity,
we have to take the eighth
root of unity four times.
That's why it's four.
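Both the three-variable decomposition and the resulting phase schedule can be verified exhaustively:

```python
# Exhaustive check of the three-variable decomposition
# 4abc = a + b + c - (a XOR b) - (b XOR c) - (a XOR c) + (a XOR b XOR c),
# and of the fact that the corresponding T/T-dagger phase schedule
# implements controlled-controlled-Z: w = exp(i*pi/4) is the T-gate
# phase, and w^(4abc) = (-1)^(abc).
import cmath

w = cmath.exp(1j * cmath.pi / 4)  # eighth root of unity

for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            t = a + b + c - (a ^ b) - (b ^ c) - (a ^ c) + (a ^ b ^ c)
            assert t == 4 * a * b * c
            # phase attached to basis state |abc> by the schedule:
            assert abs(w ** t - (-1) ** (a * b * c)) < 1e-12
```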
All right, so let's take
a look at a random row.
Say this one.
So we take T of a with a
positive sign, so it's plus one;
we take T of c with a
positive sign, plus one;
and then we take T of a plus b
with a negative sign,
so it's minus one, and
here we have minus one.
So one plus one minus one
minus one is, well, zero.
So yes, we got it correct.
And you can verify that
this holds for every row.
In other words, we showed
that we can construct
controlled-controlled-Z using
the T and the CNOT gates.
So we convince ourselves
that we can construct the Toffoli gate.
Let's make the construction
of the Toffoli gate a bit more complex.
So specifically, let's
recall that the T gates
are the most expensive resource
in fault-tolerant quantum computations,
and therefore, we may choose to focus
on the optimization of the T-depth.
And in fact, it so happens
that for CNOT and T circuits,
it is possible to construct
a depth-optimal circuit every time.
And the algorithm that
accomplishes this is motivated
by the matroid partitioning algorithm.
So specifically, the way it works is this:
a bucket is a certain time slice
in a quantum circuit
where we can apply the T gates,
because we have prepared the linear functions
that we can apply the T gates to.
Given this set of buckets,
we apply the following procedure,
which amounts to a greedy algorithm,
to lay out the CNOT plus T circuit
in an optimal number of T-gate stages.
So since it's a greedy algorithm,
suppose we have an optimal layout
for some k linear functions,
and we want to place another one.
To do that, we construct
the following graph.
The graph has two types of vertices.
Round vertices and square vertices.
The round vertices contain functions,
linear functions of variables
that we want to execute
and schedule somehow in the circuit.
The square boxes contain sets of functions
already partitioned.
We originally start with empty boxes
and we fill them in with functions.
And we try to pack them as
densely as we possibly can.
And the Matroid partitioning algorithm
guarantees that we can
actually pack them optimally.
These are the vertices.
Let's discuss the edges.
We draw a directed edge
from a round vertex g
to a round vertex h
if h is contained in the partition P
and h can be substituted
with g in it.
And we draw a directed
edge from a round vertex
to a square vertex
if the linear function in the round vertex
can be added to the given partition P.
And if we explore a directed path
from a linear function
that we want to place
to a square bucket, then
this gives us the schedule
of transformations that we need to perform
in order to place this linear
function in the bucket.
So let's illustrate this algorithm
in the case of the Toffoli gate.
And in the case of the Toffoli gate,
we need to compute the
following seven phases.
And suppose we work with three qubits.
So every box
here has space for three
linear functions and no
more than that.
And obviously, we need three
boxes, because seven things,
if a box can contain only
three, cannot be placed in two:
the content of two boxes
is limited to six functions,
and we have seven that we need to place.
So we need three boxes.
And it turns out that we can
place them all in three boxes.
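The placement procedure can be sketched in code. The following is a toy simplification (greedy placement with a single swap-based augmentation step, not the full matroid partitioning algorithm), applied to the seven linear functions from the Toffoli example, written as bitmasks over (x, y, z):

```python
# Toy sketch of the bucket placement: a bucket holds at most n functions,
# which must be linearly independent over GF(2).

def gf2_rank(vectors):
    """Rank over GF(2) of a list of bitmask vectors."""
    pivots = {}  # highest set bit -> reduced vector
    for v in vectors:
        while v:
            h = v.bit_length() - 1
            if h not in pivots:
                pivots[h] = v
                break
            v ^= pivots[h]
    return len(pivots)

def independent(bucket, f):
    return gf2_rank(bucket + [f]) == len(bucket) + 1

def place(funcs, n):
    buckets = []
    for f in funcs:
        placed = False
        for B in buckets:                       # try direct placement
            if len(B) < n and independent(B, f):
                B.append(f)
                placed = True
                break
        if not placed:                          # one-step augmentation:
            for B in buckets:                   # swap some g out of B...
                for i, g in enumerate(B):
                    if not independent(B[:i] + B[i + 1:], f):
                        continue
                    for B2 in buckets:          # ...into another bucket
                        if B2 is not B and len(B2) < n and independent(B2, g):
                            B[i] = f
                            B2.append(g)
                            placed = True
                            break
                    if placed:
                        break
                if placed:
                    break
        if not placed:
            buckets.append([f])                 # open a new bucket
    return buckets

x, y, z = 0b100, 0b010, 0b001
funcs = [x, y, x ^ y, z, x ^ z, y ^ z, x ^ y ^ z]  # the seven phases
buckets = place(funcs, 3)
assert len(buckets) == 3            # T-depth 3, as in the talk
assert buckets[2] == [x ^ y ^ z]    # the first six functions fit in depth 2
```

Running this reproduces the schedule discussed on the slides: placing y XOR z triggers the swap of y out of the first box, and only x XOR y XOR z needs a third box.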
So let's do it, let's apply the algorithm.
So first we want to place the x.
So we just put it in the box.
We want to place the y, y is
linearly independent with x,
we place it in the box.
Continue: we want to place x plus y.
We cannot place it in
the box with x and y,
because it is linearly
dependent with x and y,
so we have to place it in a different box.
All right, let's put
it in a different box.
Continue, the z we can
place in the first box,
then x plus z, that is,
x exclusive-OR z, of course,
we can add to the second box.
Now, y plus z.
And this is where the algorithm,
the Matroid partitioning
algorithm shows itself.
And this is where we will actually use it.
But we'll use it to place this function.
Because it actually happens
that you can execute
the first six functions in depth two.
And Matroid partitioning
algorithm shows how to do that.
So to do that, we need
to construct the graph.
So we have all other
functions here in the graph,
the ones that we already placed.
We make round nodes out of
them, and we draw the edges.
So we cannot draw the
edge from y plus z to x,
because x here is in this partition,
and we cannot replace
x with y exclusive-OR z,
because this leads to linear dependence.
So this edge we cannot draw.
But we can draw edge to
y, we can draw edge to z,
to x plus y, and x plus z.
So now, let's continue
expanding and drawing more edges
from those vertices, y,
z, x plus y, and x plus z.
So for instance, from y,
we can draw an edge to this square box.
Indeed, y can be placed in this box,
or it can be placed right here, y.
All right, so at this point,
we found a path that we can explore.
So let us now execute it.
The first thing is taking y
and replacing it with y plus z.
So y is here.
Let's replace it.
Okay, we replaced it.
Then the next edge of this graph,
this edge, tells us that
we have to add y here.
So let's add it here.
Okay, so we've done that.
And this completes
the execution of the path.
Finally, we need to place x plus y plus z.
We place it in this box.
And this gives us the schedule
of the application of the T gates.
And everything that
happens between the boxes
is computable by the CNOT gates.
So for instance, we can take this circuit.
It prepares the right linear combinations
of variables x, y, and z,
such that the corresponding
T gates can be executed.
All right, so we can
implement Toffoli like that.
It has minimal depth, however,
the number of CNOTs appears to be high.
There is a better way to do it.
That does not guarantee, well,
obviously it's not T-depth optimal,
but it is a very nice circuit
because it uses the minimal
number of the CNOT gates.
And it has very many
interesting properties.
For instance, we can take these five gates
that I just highlighted and
replace them with controlled-P.
Now, in order to
implement the Toffoli gate,
we need only five entangling gates,
and we know that five is the minimum.
The other thing we can do
is we can take a look
at this transformation.
So what does it do?
The linear function that it
computes is the identity.
And the reason is this gate
cancels with this gate,
and this gate cancels
whatever this gate does.
So the linear function is the identity.
So everything that this circuit does
is a certain schedule of T gates,
or phases, that it applies to various kets.
And in fact, it can be
rewritten as this circuit.
So this is the blue box.
You can verify that the schedule
of the T gates is exactly the same.
And that this rewriting
is therefore correct.
So the next thing we do is
we flip these two wires,
leaving the Hadamards inside.
This can indeed be done because
what is inside the Hadamards
is a controlled-controlled-Z,
and the controlled-controlled-Z
does not care
about the order of the qubits
that it applies to, right?
wires, we get to this circuit.
So next, recall that the Hadamard
is something that helps us to go
from the X-bases to Z-bases.
So what we can do, we can
take this Hadamard gate,
push it through this phase gate.
And what this does is
it turns this phase gate
into the V gate, because
if the phase gate is the square root of Z,
then V is the square root of X.
So then, we can push the Hadamard here.
And this turns this phase
gate into the V gate.
And then we can push it here.
And it turns this phase
gate into the V gate, right?
And then the two Hadamards are
together, so they cancel out.
So what I have right now on the screen
is hopefully something
that you already recognize.
It's this circuit.
In other words, this circuit
for Clifford plus T library
is quote-unquote equivalent or reducible
to the one for physical level gates,
which is an interesting property.
But I guess it was obvious in retrospect,
because we use the same mathematics
to obtain both circuits,
the mixed arithmetic expressions.
Let's take a deeper look
at this implementation
of the Toffoli gate.
So specifically, let's take this gate
and commute it through the Hadamard.
Then the X turns into the Z.
In other words, we get the
controlled-Z out of here.
So what we have inside is
known as the Margolus gate.
This gate was known since 1995.
So what this means is, in a sense,
the circuit that I have
on top of this slide,
it's a quote-unquote implementation
of both the fault-tolerant Toffoli,
a physical level Toffoli,
as well as the Margolus gate, all in one.
So there is really one gate,
not three different gates
or three different circuits,
because they're all related.
Here is an interesting trick
you can play with the
Clifford plus T library
and the T gates.
Consider the following circuit
that implements the Toffoli
gate and the T gate.
We know that the Toffoli
gate needs seven T gates,
well, it actually does need seven T gates,
even though I did not explicitly prove it,
but let's just assume
that this is the case.
And then there is another T gate
that we want to apply to a new variable.
The question is,
how many T gates do we need
to implement the circuit?
A trivial answer to this question
would be given by the number eight,
seven plus one is eight.
However, a better method exists.
Specifically, you can
create the phase polynomial
with 15 terms that looks like
this, and it computes zero,
it computes nothing, it
computes the identity.
Seven terms in this polynomial
compute the Toffoli gate.
And one term computes the T
gate applied to the qubit d.
This means that eight out of 15 terms
is what is being implemented.
However, rather than
implementing these eight terms,
we can instead implement
the remaining seven.
And the transformation
will still be the same,
because, well, because this
polynomial computes zero.
This means that you can apply
the Toffoli gate and the T gate
at the total cost of seven gates.
In other words,
if you implement a Toffoli
gate fault-tolerantly,
you get another T gate for free,
just because you implemented Toffoli.
All right, let's consider something
that is important in both classical
and quantum computations:
the adder function.
Suppose you
wanted to use a quantum computer
for something simple, like the adder.
How would you go about synthesizing it?
The first step would be writing it down
as a Boolean formula.
And this Boolean formula
should use the product and exclusive OR,
because these are natural
to quantum computers,
thanks to the CNOT and the Toffoli gate.
So first, I quickly notice
that the second component,
a plus b plus c, can be easily
obtained by a pair of CNOTs,
one like this, and one like this.
And then this expression,
the first component, can be
obtained by three Toffolis.
Indeed, a times b is
computed by this Toffoli,
b times c is computed by this Toffoli,
and a times c is computed by
this Toffoli, all right?
So this is a circuit that
implements the adder.
However, we can notice
an interesting property
of this circuit: if we exchange
the order of these two gates,
then the middle one disappears.
So, something like this.
In other words, to implement an adder,
we need two Toffolis and two CNOTs.
This circuit has been known
to Richard Feynman since 1985.
And I personally believe
this is how he obtained it.
Just looking at the Boolean formulas,
and perhaps not doing this
circuit transformation,
but noticing that you can implement
the product of b and c, exclusive-ORed
with the product of a
and c, by one Toffoli gate,
if that Toffoli gate is
used after the CNOT of a into b.
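The two-Toffoli, two-CNOT adder can be checked by classical simulation. The sketch below assumes one consistent gate ordering, Toffoli(a,b;d), CNOT(a;b), Toffoli(b,c;d), CNOT(c;b), on a fourth wire d initialized to zero:

```python
# Classical simulation of the two-Toffoli, two-CNOT full adder on wires
# (a, b, c, d) with d initialized to 0 (the gate ordering is an assumed,
# consistent one). Afterwards, b holds the sum a XOR b XOR c and d holds
# the carry ab XOR bc XOR ac.
for a0 in (0, 1):
    for b0 in (0, 1):
        for c0 in (0, 1):
            a, b, c, d = a0, b0, c0, 0
            d ^= a & b    # Toffoli(a, b; d): d = ab
            b ^= a        # CNOT(a; b):       b = a XOR b
            d ^= b & c    # Toffoli(b, c; d): d = ab XOR (a XOR b)c
            b ^= c        # CNOT(c; b):       b = a XOR b XOR c
            assert b == a0 ^ b0 ^ c0                        # sum bit
            assert d == (a0 & b0) ^ (b0 & c0) ^ (a0 & c0)   # carry bit
```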
All right, however, the Toffoli
gate is a composite gate.
If we want to implement
it on a quantum computer,
we need to break it
down into smaller gates.
And specifically, on this picture,
I broke down the Toffoli gates
into the two-qubit gates,
the ones that are directly implementable
on physical level quantum computers.
So here, the two Toffolis are,
I highlighted them in green on this slide.
So let's take a look at this circuit.
In this circuit, we have two
CNOTs that come together.
They can be canceled out.
We don't have to apply them,
because two CNOTs
applied in a row
compute the identity.
Then similar thing happens here.
Delete these two CNOTs.
Next, this gate and the next
are inverses of each other.
We delete them.
So now this circuit has
six entangling gates left.
The last thing that we
will do is we will move
this gate past these three gates like so.
The reason we do that is
because now we can draw
vertical red guidelines to illustrate
that the depth of this
implementation is four.
So recall how the implementation
of Toffoli was five,
but of the adder, it is four.
This is important and interesting,
because it illustrates a difference
between quantum and classical computers.
So specifically, let's consider
classical implementation
of the Boolean AND function
and the full adder,
as well as quantum
implementations of the AND,
which is given by the Toffoli
gate, and the full adder,
which is given by the
circuit we just developed.
So the depth of AND in
classical computing is one.
It is given by the Boolean AND function.
The depth of adder is three.
So here is the critical
path I highlighted.
So it touches three gates, and
therefore the depth is three.
So the depth of adder is
greater than the depth of AND,
which makes total sense.
Because had it been the other way around,
then to compute Boolean AND,
we would have used the adder.
We would have used the
adder with inputs a and b,
and we would set the third input to zero.
And when we set it to zero,
this output
computes a times b,
and the adder becomes the Boolean AND.
And therefore, it can never happen
in classical computation
that the depth of AND
is higher than that of the depth of adder.
However, in quantum computation,
the depth of Boolean AND
as given by the Toffoli gate is five.
However, the depth of adder is four,
and therefore, adder has
smaller depth than the AND,
which would not make sense classically,
but that's just life
in quantum computation.
Quantum computation is different.
This is one of the illustrations
of how quantum computation is different.
So the circuit that we just developed,
the one that implements
the full adder quantumly in depth four,
we constructed by hand, using logic.
In reality, the first time
this circuit was found,
it was found by the application
of a template optimizer.
In other words, this circuit,
however small it is and easy to construct,
originally was found by a computer.
So let me describe what the templates are,
because templates give a simple
quantum circuit optimization algorithm
that is very useful.
So to describe the templates,
I rely on the following
three observations.
Observation number one:
given a circuit identity,
we can create a circuit that
computes the identity function out of it.
Observation number two:
for each circuit with m gates
that computes the identity,
I can create a rewriting rule
where I take some p of its gates
and replace them with the remaining m minus p gates.
And in terms of quantum circuit optimization,
this is beneficial so long as p is greater than m minus p.
And observation number three
is that whenever I have a circuit
computing the identity,
I can take its first gate,
move it to the last position,
and it still computes the identity function.
In other words,
circuits computing the identity function
are not strings,
but rather loops of gates,
where you can cut the loop at any place
and unroll it into a string,
and the string computes the identity.
That's the intuition behind it.
Not surprisingly, a template
is a circuit that implements
the identity function.
And we will use those templates
to optimize quantum circuits.
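That loop intuition can be verified numerically on a small identity circuit. A sketch with NumPy, assuming the standard H, Z, X matrices and left-to-right gate application: since HZH = X, the circuit H, Z, H, X computes the identity, and so does its cyclic rotation Z, H, X, H:

```python
import numpy as np

# Standard single-qubit gate matrices.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])

def circuit_matrix(gates):
    """Gates are applied left to right, so the matrix product is reversed."""
    m = np.eye(2)
    for g in gates:
        m = g @ m
    return m

template = [H, Z, H, X]                  # H Z H X computes the identity
rotated = template[1:] + template[:1]    # cut the loop one gate later

assert np.allclose(circuit_matrix(template), np.eye(2))
assert np.allclose(circuit_matrix(rotated), np.eye(2))
```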
Let me show how.
Oh, you don't have to
read off this screen.
So first, given a template,
we need to be able to
create an optimization rule.
Suppose that we have a
template with seven gates,
A, B, C, D, E, F, G,
and since it has seven gates,
we choose the parameter p equals four.
We want to replace at least four gates
with the remaining ones
in order to see a reduction
in the number of gates.
Then the starting gate is B;
well, we can choose any,
but let's say we chose B.
And there are two directions
we can walk the circle:
clockwise or counterclockwise.
So here is backward,
meaning backwards with respect
to the alphabetic order.
So we start with B, and it is its inverse
because we read backward,
and it must be noted that
in quantum computation,
every time a gate is available,
its inverse is available at the same cost.
It's not a solid rule,
but it so happens to be the case
in every quantum information
processing (QIP) system
that I have studied,
including liquid-state NMR,
trapped ions, and
superconducting circuits.
So we read four gates going in this direction,
the inverses of B, A, G, F,
and then we can replace them
with the remaining ones
read in the other direction,
so we would read the gates as C, D, E.
So this is how you create
a replacement rule.
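The rule-creation procedure can be sketched symbolically. Here gates are plain strings, inversion toggles a dagger suffix, and the function names and indexing conventions are mine, chosen to reproduce the walkthrough above:

```python
def inv(g):
    # Inverting a gate toggles a dagger suffix.
    return g[:-1] if g.endswith("†") else g + "†"

def make_rule(template, start, p):
    """From a template (a loop of m gates composing to the identity),
    read p gates backward starting at index `start`, each inverted;
    together they equal the remaining m - p gates read forward,
    which yields a rewriting rule (pattern -> replacement)."""
    m = len(template)
    pattern = [inv(template[(start - k) % m]) for k in range(p)]
    replacement = [template[(start + 1 + k) % m] for k in range(m - p)]
    return pattern, replacement

# Seven-gate template A B C D E F G = identity, p = 4, starting at B:
pattern, replacement = make_rule(list("ABCDEFG"), 1, 4)
# pattern = ['B†', 'A†', 'G†', 'F†'], replacement = ['C', 'D', 'E']
```

Each choice of starting gate, direction, and p gives a different rule from the same template, which is what makes a single template so reusable.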
Now, how do you find a
replacement rule in a circuit?
So let's take a look at this picture,
and the algorithm that accomplishes that
is motivated by, if not
to say directly replicates,
string matching algorithms
from classical computer science.
So when we look at a certain gate,
everything here is color coded.
So the different shades of green
correspond to the gates that are identical
and the blue is just any kind of gates.
So we start matching by
matching this gate to this,
they're the same shade of green.
We look towards the
beginning of the circuit.
We try to match more gates.
This doesn't match, doesn't
match, doesn't match.
This one matches.
So we know that we matched it.
At this point, we verify
that these two gates
can be moved to the same location.
Then we continue the matching,
because two gates we could
only replace with three gates,
and what would be the purpose
of doing that?
We would just make the
circuit more complex.
We try to match more, and here
is one more that we match.
And now we have, we accumulated
the set of three gates.
We verify that they can all be commuted
to the same location.
And once we verify that,
we prepare the replacing circuit
by inverting the order of the other
two gates in the template,
erase these three gates,
and insert the two gates into a position
where the original three
can be commuted to.
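A much-simplified sketch of this matching loop, with gates as abstract labels and commutation supplied as a predicate. The names, and the sufficient condition used here (each skipped gate must commute with every gate of the pattern), are my simplifications, not the optimizer's exact logic:

```python
def find_match(circuit, pattern, commutes):
    """Scan from the end of `circuit` toward its beginning, matching the
    gates of `pattern` in order as a subsequence.  A match is kept only
    if every skipped gate lying inside it commutes with all pattern
    gates, so the matched gates can be brought into one contiguous block."""
    positions = []
    want = len(pattern) - 1
    for i in range(len(circuit) - 1, -1, -1):
        if circuit[i] == pattern[want]:
            positions.append(i)
            want -= 1
            if want < 0:
                return list(reversed(positions))
        elif positions and not all(commutes(circuit[i], g) for g in pattern):
            return None  # a blocking gate prevents commuting the match together
    return None

# Toy commutation relation: 'b' commutes only with itself.
def commutes(g, h):
    return g == h or "b" not in (g, h)

assert find_match(["x", "a", "c", "a"], ["a", "a"], commutes) == [1, 3]
assert find_match(["a", "b", "a"], ["a", "a"], commutes) is None  # 'b' blocks
```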
So here is a specific
example of the templates
that operate with the CNOT
and controlled root of NOT,
where V0 and V1 denote gates
that can be either V or V-dagger,
but they have to be different.
And let's optimize this
circuit using the templates.
So what we do is we start from
the beginning of the circuit
and we always look towards
the beginning of the circuit,
but we increment the
gate that we'll look at.
So starting with the first gate,
well, ending with the first gate,
we don't match anything.
Ending with the second,
we don't match anything,
nothing, nothing, nothing, nothing.
Ending with this gate,
we match these two gates.
And they can be commuted together,
because these two gates commute.
So we perform the commutation.
These two gates are now
inverses of each other.
They can be both erased from the circuit.
The circuit has become shorter,
and we continue the matching.
Ending with this gate, nothing,
ending with this gate, nothing,
ending with this gate, we
match these three gates.
And again, we need to
check for the commutation.
So if these three gates can be commuted,
they can be replaced with these two gates.
So let's check if they can be commuted.
So this can commute with this,
and this can commute with this.
So they can be commuted to the center.
Let's perform the commutation.
So one part of the
commutation, the second part,
and the three gates are now together,
and they can be replaced with these two
that I have on the right hand side.
So now the circuit is shorter.
And this concludes the optimization
of this particular circuit,
because we reached its end.
And incidentally, it also
concludes my presentation.
All right, thank you for your attention.
I'll take questions now.
Yes, please, yeah,
please help me with that.
Right, so let's start
with the single-qubit gates.
And if I'm interpreting
the question incorrectly,
please correct me so that
I'm answering it correctly.
So we first have to decide
if we work with physical-level
transformations
or fault-tolerant transformations.
If we work with physical
level transformations,
then frequently, we have access to a gate
that is parametrized by
a real-valued parameter,
such as Rx, Ry, or Rz.
And in order to implement
arbitrary single-qubit transformation
using those physical level gates,
what we can do is we can
rely on Euler's decomposition,
which expresses an arbitrary unitary U
as a product of axial rotations,
such as, for example, X-rotations,
Y-rotations, or Z-rotations.
They can come in various orders.
For instance, Y-Z-Y is a
perfectly valid option.
But there have to be at least
two different rotation axes,
and no two rotations about
the same axis can be neighboring.
So this is how you implement it
using the physical level computations.
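As a concrete sketch of that Euler decomposition, here is one way to extract Z-Y-Z angles from a 2x2 unitary with NumPy; the function names are mine, and the claim being checked is that, up to a global phase, U = Rz(alpha) Ry(beta) Rz(gamma):

```python
import numpy as np

def rz(t):
    return np.diag([np.exp(-1j * t / 2), np.exp(1j * t / 2)])

def ry(t):
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -s], [s, c]])

def zyz_angles(u):
    """Z-Y-Z Euler angles of a 2x2 unitary, up to a global phase."""
    su = u / np.sqrt(complex(np.linalg.det(u)))  # normalize into SU(2)
    beta = 2 * np.arctan2(abs(su[1, 0]), abs(su[0, 0]))
    alpha = np.angle(su[1, 0]) - np.angle(su[0, 0])
    gamma = -np.angle(su[1, 0]) - np.angle(su[0, 0])
    return alpha, beta, gamma

# Example: decompose the Hadamard gate.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
a, b, g = zyz_angles(H)
v = rz(a) @ ry(b) @ rz(g)
phase = H[0, 0] / v[0, 0]     # v equals H up to this global phase
assert np.allclose(v * phase, H)
```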
In terms of the
fault-tolerant computations,
the way to compute single-qubit unitaries
is to decompose them into
Hadamard and T gates.
And it is known how to do that optimally.
And in a nutshell,
just to describe how this algorithm works,
I can maybe find a link and share,
it actually happens to be
a paper that I co-authored
with Vadym Kliuchnikov and Michele Mosca.
So the way you do it is
you look at the matrix
and first approximate it
by a matrix over the ring
that extends the integers
by the complex value i
and one divided by the square root of two.
And then everything can be expressed
over that algebraic structure.
Well, it's not a field, it's a ring.
But what you do is you apply
the sequence of Hadamard and T gates
to progressively reduce the power
of the square root of
two in the denominator.
Until such time as
this power becomes zero,
at which point you have
pretty much a Pauli gate.
In terms of the two-qubit gates
in the physical level implementation,
what you need to do depends
on your physical-level implementation,
on the kinds of transformations
that you can implement physically.
If the interaction you have
is locally equivalent to a CNOT gate,
in other words, equivalent
up to single-qubit gates,
then you transform the
gate that you want
into the physical-level
interaction that you have.
If that is not the case,
you can use an algorithm to decompose
an arbitrary two-qubit unitary
into a circuit using no more
than three CNOT gates,
and then each of the three CNOTs
is substituted with its
physical-level implementation.
So that is two-qubit gates
using physical level circuits.
And finally, two-qubit gates
using fault-tolerant circuits
is pretty much the same,
except you decompose into,
let's say, three CNOTs
and single-qubit rotations,
but the single-qubit rotations
you approximate by Hadamard and T circuits.
So hopefully that answers the question.
- [Moderator] Thanks, Dmitri.
I'm gonna turn my microphone on
so the audience can hear the questions.
Brenda asks, how do you know
if two gates can commute?
If you have two gates,
let's say a gate A and a gate B,
you verify by matrix computation
whether A times B equals B times A.
And if that happens to be
the case, the gates commute.
And if it happens to be not the case,
then they don't commute.
So it's a simple matrix multiplication.
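That check translates directly into a few lines of NumPy; the gate matrices below are the standard Pauli X, Z and the phase gate S:

```python
import numpy as np

def commute(a, b, tol=1e-9):
    """Two gates commute exactly when their matrices satisfy AB == BA."""
    return np.allclose(a @ b, b @ a, atol=tol)

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
S = np.array([[1, 0], [0, 1j]])  # phase gate, diagonal like Z

assert commute(Z, S)        # two diagonal gates always commute
assert not commute(X, Z)    # X and Z anticommute, so they do not commute
```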
- [Moderator] Great, and
the followup, she asks,
can you explain more about why T gates
are the most expensive resource?
- [Dmitri] So in a nutshell,
this is because the Clifford
gates are transversal,
meaning that in order to
implement them fault-tolerantly,
you basically implement
them on the physical level,
but using a larger number of qubits.
And the T gate cannot
be obtained this way.
In fact, non-Clifford gates
cannot be obtained this way.
However, the T gate can be purified,
through what is known as
magic state distillation.
And there is a paper by Sergei
Bravyi and Alexei Kitaev
that talks precisely about that.
So I highly recommend that paper.
It's a very good result.
- [Moderator] Okay, thank you.
A couple of questions on,
if you have any recommendations
on books or papers
about optimization or
anything around this topic.
- [Dmitri] To the extent that I am aware,
there is currently no book available
on the topic of quantum circuit
synthesis and optimization.
And there is pretty much
just Nielsen and Chuang,
which reports a few circuits
and maybe dedicates 15 to 20 pages
to the topic of quantum circuits
and quantum circuit optimization.
But, at least to me,
it doesn't really explain how
the circuits that it reports
were obtained.
So a short answer is, sorry,
there is currently no recognized book
that I could recommend.
- [Moderator] Thank you.
We do have the Qiskit textbook.
- [Dmitri] Yes, so yeah, we
can recommend that, yeah.
- [Moderator] I'll drop a
link to that in the chat.
Any other questions
before we wrap up today?
Let me get the chat a second.
Okay, here, I'll drop the
textbook in the chat.
All right, it doesn't
look like any other talks,
or excuse me, any other questions, so-
- [Dmitri] I can probably
like, after this is finished,
add a few links to the papers,
to various papers that constitute,
that I mentioned during
some of the answers.
- [Moderator] Great, yeah.
Send me those, and I'll
put them in the description
for the video.
- Okay, I will do that, yeah.
- [Moderator] Follow up on that.
We do have one last question from Amir.
Any thoughts on ZX-calculus
as it relates to optimization?
- Oh, all right.
So ZX-calculus,
it performs pretty much
the same operations
as the phase polynomials,
but it is visual.
So for the purpose of optimization,
I don't really see myself a big difference
between the phase
polynomials and ZX-calculus.
I find the phase
polynomials more convenient
because I personally have
more intuition about them.
That's why I rely on phase polynomials.
- [Moderator] Awesome.
- But I don't think there is anything
that one paradigm can do
that the other cannot.
- [Moderator] Great, thank you, Dmitri.
Where can people find you online?
- They can write me an email
at dimitri.maslov@ibm.com.
Maybe I can add it in the chat window.
Yeah, please add it in the chat window
so that I don't have to type.
I'm using the laptop for the presentation,
and the chat window is on the desktop
in the background, so it's
kind of hard to reach.
- [Moderator] I'll put that
into the description as well.
So if you wanna get in touch with Dmitri,
send him an email.
Thank you again, Dmitri,
and thank you everybody for tuning in.
We've got another seminar
session on Friday.
So tune in at noon Eastern.
Thanks, Dmitri.
- All right.
Thank you everybody for joining.
