Hi. My name is Chongli Qin and
welcome to today's lecture on
responsible innovation and
artificial intelligence.
So, today's talk will be split into
two parts. For the first part,
I'll be going over some of the
research which has been done to
ensure that the machine learning
algorithms we develop satisfy
desirable specifications such
that during deployment it is
safe, reliable, and trustworthy
for use.
In the second half
Iason will dive into further
details of the ethical
implications of machine learning
algorithms and, more importantly,
how we can be thinking about
designing and deploying these
algorithms such that they are
beneficial to society.
So, to start I want to
give a quick motivation into why
we as machine learning
researchers should be thinking
about both our research
and our responsibilities.
So, with all of the
great research which has
happened over the past several
decades, machine learning
algorithms are becoming
increasingly powerful.
There have been
a lot of breakthroughs in
this field, and today I'll just
mention a few.
So, one of the earlier
breakthroughs was using
convolutional neural networks to
boost the accuracy of
image classification.
And more recently,
we now have generative
models that are capable of
generating images with high
fidelity and realism.
We've also seen breakthroughs in biology,
where machine learning
algorithms are now capable of
folding proteins to an
unprecedented level of accuracy.
Indeed, the recent
AlphaFold system won the last
CASP competition, a protein
folding competition on
predicting unknown protein
structures that are later
crystallised for validation.
We've also seen machine learning and
reinforcement learning systems
that are now capable of beating
humans in games, such as Go.
And more recently we've seen machine
learning algorithms pushing the
boundaries of language: with the
recent GPT-2 and GPT-3 models,
we've seen that these models are not
only capable of generating text
which is grammatically correct,
but have really demonstrated that
they are grounded in the real world.
So as the saying goes
with great power
comes great responsibility.
And I think now
it is more important than ever
for us to question what might be
the negative impacts and risks
of all of this.
And more importantly,
what can we do to
mitigate these risks?
To highlight why we need to start
thinking about these risks,
I want to start with a few
examples, beginning with this one,
which some of you may already
be familiar with.
So, this is a paper published in 2013,
titled 'Intriguing Properties of
Neural Networks'.
They probably used the adjective intriguing
because what they
found in this paper was a little
bit unexpected.
So, what did they find?
Well, what they found
is that you can take a
state-of-the-art image
classifier, like the one here,
give it an image like the
one you see of a panda here,
and indeed this classifier can
correctly predict
this as a panda.
So what happens now
if I take the exact same image
and add the tiniest bit of
perturbation to it,
so tiny that it is
imperceptible to the human eye?
You can see
that the left picture and
the right picture look almost
exactly the same. In fact, they
do look the same.
So, what happens now
when you put this
new image through this neural network?
We would actually expect
the output of the neural
network to be the same,
but instead, when you put this new
image through, the network is now
almost 100% confident that this
is in fact a gibbon.
So, in this instance,
misclassifying a
panda as a gibbon may not have
too many consequences,
but actually we can choose the
perturbation to make the output
of the network be whatever
we want it to be.
We can change this to bird,
or to vehicle.
If such a classifier were used in,
say, an autonomous driving system,
this could have catastrophic consequences.
Some other machine learning failure
modes might be slightly more subtle.
There have been some studies
on the recent GPT-2
model which have shown that
this model might be carrying
some of the biases
that exist in society today.
So in this paper titled
'The Woman Worked as a Babysitter:
On the Biases in Language Generation',
what they did was a systematic study
on how the model behaved
conditioned on different
demographic groups.
For example,
if the prompt was changed from
'the man worked as' to 'the
woman worked as',
the subsequently generated text drastically
changes in flavour and may be
heavily prejudiced.
Similarly for
'the black man worked as'
versus 'the white man worked as'.
And I want everyone to take a few
seconds to read the generated
text after we changed the
subject of the prompt.
As you can see, even though this
language model is extremely powerful,
it does carry some of the biases
we have in society today.
And if this model
is used for
something like
auto-completion of text,
it can further exacerbate
and feed into
the biases that we already have.
Iason will later go into details
of the ethical risks of how
machine learning can be used, for
example, using machine learning
in surveillance systems
or for weapons.
So, I think at this point
we should really be asking this question:
what are our responsibilities
as machine learning practitioners?
So, this is, of course, an open question,
and possibly one with no single right answer.
But as I see it, one of
our responsibilities is to
ensure that the machine learning
algorithms we deliver satisfy
desirable specifications.
In other words, we should have a
level of quality control over
these algorithms to enable their
deployment to be safe, reliable
and trustworthy.
And if we get this right,
we can bring
many more opportunities
and enable many more applications:
for example,
more reliable and safe
autonomous driving systems,
more robust ways of
forecasting weather for
renewable energy,
et cetera.
But for all of this to
be possible, we really need to
know how to stringently test our
machine learning algorithms.
So, how can we make sure that our
machine learning algorithms are
safe for deployment?
Well, like many other algorithms
that are quality controlled before
deployment, we need to ensure
that they satisfy desirable
specifications.
For example
for an image classifier we want it
to be robust to perturbations.
If it's a dynamical systems
predictor, we would like it to
satisfy the laws of physics.
We want it to be robust to feature
changes that are irrelevant for prediction.
For example, the
colour of the MNIST digit should
not affect your digit classification.
If we're training on sensitive data, the
training should be differentially private.
If we give our
neural network images
that are out of distribution,
our neural network
should become more and more
uncertain, not more and more certain.
So these are just a few
specifications that we would
like our neural networks to
ideally satisfy,
but there are many more.
So, here I want to introduce
the paradigm of
specification driven machine learning.
So, what do I mean by
specification driven machine learning?
Well, we should realise
that the core issue is this: when
you're training with limited
data, your model can learn a lot
of spurious correlations that
boost metrics but have nothing
to do with the prediction task.
So, unfortunately, what this means
is that your model ultimately
hinges on the data and the metric.
Unless you do your
training carefully, your model can
inherit a lot of undesirable
properties that are in the data.
So in this instance,
if the data is
biased and limited, then your
model is going to be biased and
non-robust,
unless we specify otherwise.
So, in specification driven ML,
we want to enforce
specifications that may or
may not be present in the data,
but are essential for our systems to be reliable.
So, how we can enforce
these specifications
will be the subject of the rest of my talk.
So, I want to start
with a specification that is
relatively well studied and one
which I have already kind of touched upon.
That is the robustness
of our neural network
to perturbations
or adversarial perturbations.
So, this specification is really
essential if we want to deploy
our networks to applications
that require robustness or for
applications that have real
adversaries in the mix.
So, to formalise, I first want to
reiterate what it is we want to achieve.
So, we want our image
classifier's output to be
unchanged under any additive
imperceptible perturbations.
So, to define this more
mathematically,
let's start with a few notations.
So, here we denote
the neural network as a
function f. This function
takes as inputs an image,
which is your
panda, otherwise denoted by x,
and the parameters of the network,
denoted by theta.
The output of this neural network
is a probability vector over your
labels, or more commonly the
logarithms of the probability
vector, otherwise known as logits.
So, ideally, we would
like our neural network's output
to be exactly the same as the
label. So in this case, it's
what we know to be a panda,
but
we actually express this
as a one-hot vector, where
the element corresponding to the
label is one and the elements
corresponding to all other
labels are zero.
So, now that we have
our notation set out, let's
go straight into the specification.
So, this is the
adversarial robustness
specification.
I know this is a
lot to take in one page so I
will try to break this down a little bit.
So, firstly, we note
that the delta here
denotes the perturbation.
And what this equality is simply saying is
that we want the index of the
maximum probability in the
probability vector outputted by
the neural network to be exactly
where the one is in our
one-hot vector.
So this is a very convoluted
way of saying we want
our neural network's output to
be correct
subject to this perturbation.
And now the second
line says, we want this to be
true for all perturbations
within the set of
imperceptible perturbations.
So, in practice, to
ensure imperceptibility, we
simply constrain the size of the
perturbation under a certain
norm to be less than
or equal to epsilon.
The norm ball normally considered
is the L-infinity norm ball.
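To put this on one line, the specification can be written as follows; this is a standard way of writing it in the notation above, rather than the exact formula from the slide:

```latex
% Adversarial robustness specification (standard formulation):
% the predicted class must equal the labelled class for every
% perturbation delta inside the L-infinity ball of radius epsilon.
\[
\arg\max_i f_i(x + \delta;\, \theta) \;=\; \arg\max_i \, y_i
\qquad \forall\, \delta \in B(\epsilon),
\]
\[
B(\epsilon) \;=\; \{\, \delta \;:\; \|\delta\|_\infty \le \epsilon \,\}.
\]
```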
So, now that we have the
specification defined,
I want to go into a little bit
more detail on one of the more
commonly used methodologies to
train our neural networks to
satisfy this specification.
And that is adversarial training.
So, adversarial training is very
similar to standard image
classification training but with
a tiny bit of a twist.
So, indeed with standard image
classification training, what
ideally we would like to do is
we want to optimise our
network's weights, such that the
input image is correctly classified.
So, in this instance,
you see the input image is a cat
and thus, we want the output
prediction of this neural
network to be a cat as well.
Algorithmically, what this means
is that we want to optimise the
weights of our network to
minimise the expected loss
over our data.
The loss normally
considered is the cross entropy
loss, and here
capital D denotes our data.
What this loss
ensures is that our prediction,
our predicted probability vector,
is as close
as possible to the label.
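For reference, the standard training objective being described here is, in the notation above (again a standard formulation, not a transcription of the slide):

```latex
% Standard training: minimise the expected cross entropy loss
% over the data distribution D.
\[
\min_\theta \; \mathbb{E}_{(x,\, y) \sim D} \,
\big[\, \ell\big(f(x;\, \theta),\, y\big) \,\big]
\]
```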
So, now adversarial training
does something very
similar except with an
extra data augmentation step.
So, here not only do we want the
original image to be
correctly labelled as a cat,
we also want the image
with any additive
imperceptible perturbation to
be correctly labelled
as a cat.
So, in practice,
iterating through all of the
imperceptible perturbations is
obviously computationally infeasible.
So, instead what
adversarial training tries to do
is it tries to find
the worst case perturbation.
So, what I mean here
by the worst case
perturbation is simply the
perturbation that maximises the
difference between the
prediction and the label.
So, I want to reiterate this again to
make it a little bit clearer.
We want to maximise the
difference
between the prediction and the
label with respect to the
perturbation, which lives in
image space,
not in parameter space.
So, now how does the
objective of
adversarial training change?
Well, this is now
the objective of adversarial training,
and you can see that
it differs slightly from before:
where before it was just a
minimisation problem, now it
is a min-max problem.
So, first we want to find the maximum
of the loss with respect to delta,
which is our perturbation, and
notice that this delta belongs
to the set of perturbations
denoted by capital B, which is
the set of imperceptible
perturbations.
Then, once we have
computed this maximum, we
want to minimise the maximised
loss with respect to the parameters.
In other words, for every
outer minimisation step we take,
we have to do an inner
maximisation where we find the
perturbation that maximises the loss.
So, this makes adversarial
training significantly more
expensive than standard image
classification training.
I will go into a little bit more detail
about this later on.
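To make this concrete, here is a minimal sketch of adversarial training in PyTorch; the epsilon, step size and step count are illustrative assumptions, not the lecture's exact settings:

```python
# A minimal sketch of adversarial training with a PGD inner loop,
# written in PyTorch. Epsilon, step sizes and step counts here are
# illustrative assumptions, not the lecture's exact setup.
import torch
import torch.nn.functional as F

def pgd_perturbation(model, x, y, epsilon=8/255, eta=2/255, steps=7):
    """Inner maximisation: approximately find the delta that maximises
    the loss, constrained to the L-infinity ball of radius epsilon."""
    delta = torch.empty_like(x).uniform_(-epsilon, epsilon)  # random init
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        with torch.no_grad():
            delta = delta + eta * grad.sign()       # gradient ascent step
            delta = delta.clamp(-epsilon, epsilon)  # project back onto the ball
            # (in practice one also clamps x + delta to the valid pixel range)
    return delta.detach()

def adversarial_training_step(model, optimizer, x, y):
    """Outer minimisation: one training step on the worst-case examples."""
    delta = pgd_perturbation(model, x, y)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x + delta), y)  # loss on perturbed inputs
    loss.backward()
    optimizer.step()
    return loss.item()
```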
So, now hopefully we have a method
for training our neural networks to
satisfy this specification.
How can we go about evaluating this?
So, in this next section,
I want to go over in a little bit
more detail the methodologies
we use to do adversarial
evaluation,
that is, finding the worst case.
So, the goal of the
adversarial evaluation is really
to find the worst case
perturbation
for each example in a test set.
And once we have
found this worst case
perturbation, we then want to
evaluate the accuracy on this
new test set,
where each example
in the test set is replaced
with the worst case
adversarial example,
that is, the original
example plus the worst case
adversarial perturbation.
And this accuracy is known as the
adversarial accuracy.
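Written as a formula (my own rendering, using the earlier notation), with delta*(x_n) the worst case perturbation found for the n-th test example:

```latex
% Adversarial accuracy: the fraction of test examples that remain
% correctly classified under their worst-case perturbation.
\[
\text{adversarial accuracy} \;=\; \frac{1}{N} \sum_{n=1}^{N}
\mathbf{1}\!\left[\, \arg\max_i f_i\big(x_n + \delta^*(x_n);\, \theta\big)
\;=\; \arg\max_i \, y_{n,i} \,\right]
\]
```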
But there are several complications,
one of which is that finding this
maximum exactly can be shown to
be an NP-hard problem when your
activation functions are ReLUs.
And another complication comes
in when we note that this is in
fact a constrained optimisation
problem, because we want the
delta to be constrained within
the set capital B.
So,
instead of trying
to find this maximum exactly,
rather what people try to do is
they try to approximate this
maximum with a form of gradient
ascent.
And because this is a
constrained optimisation problem
what we do is something called
projected gradient ascent.
So, projected gradient ascent
is simply gradient ascent,
but the moment we
fall outside of the constraint set,
which is denoted by this yellow
box here, we project back
onto the nearest point that
satisfies the constraint.
So, mathematically, what this now
looks like is the following.
So, I want to break
this down a little bit.
Firstly, we see that
within this projection function
we have exactly a gradient
ascent step, where you take your
initial delta and then a step
in the direction of the gradient
with respect to delta to maximise the
loss.
Your step size here is
denoted by eta.
And once we have
computed this update step,
we want to project it
back onto the set
that we care about.
And more importantly,
we project it
onto the point in that set
which is closest to the update.
So, this is the projected
gradient ascent
update step.
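For reference, this update is commonly written like the following; this is a standard formulation, and for the L-infinity ball the projection is simply elementwise clipping of delta to [-epsilon, epsilon]:

```latex
% Projected gradient ascent update: take a gradient step on delta,
% then project back onto the constraint set B.
\[
\delta_{t+1} \;=\; \Pi_{B}\!\Big( \delta_t \;+\; \eta \,
\nabla_{\delta}\, \ell\big(f(x + \delta_t;\, \theta),\, y\big) \Big)
\]
```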
And actually, one of
the more popular forms of
projected gradient ascent, when
we're considering perturbations
within the L-infinity norm ball,
is the Fast Gradient Sign Method.
What the Fast Gradient Sign
Method does is replace the
gradient with the sign of the
gradient instead.
But actually,
we can replace the gradient with
the update computed by any optimiser.
So, for example, we
can replace this with
momentum optimisation
or Adam optimisation.
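As a sketch of what swapping in a different optimiser might look like, here is a version of the attack that lets Adam compute the update on delta; the hyperparameters are illustrative assumptions:

```python
# Sketch: running the attack with a generic optimiser on delta instead
# of the raw signed gradient. Adam is an illustrative choice.
import torch
import torch.nn.functional as F

def attack_with_optimizer(model, x, y, epsilon=8/255, steps=20, lr=0.1):
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # minimising the negative loss == maximising the loss (ascent)
        loss = -F.cross_entropy(model(x + delta), y)
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-epsilon, epsilon)  # project onto the L-infinity ball
    return delta.detach()
```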
So, there are a lot
of things for you to
try: the step size,
the optimiser,
and the number of
steps you want to take
for your gradient ascent.
And you
want to explore these parameters
such that you get the strongest
evaluation possible.
So, at this point,
I want to go on to
something which I want to
emphasise for a little bit, so
I'm going to stay on this slide for
just a few minutes.
So what do I mean by
a strong adversarial evaluation?
Well, first of all,
to see what I mean, we need to
note that your adversarial
accuracy is dependent on the
many choices you make during
evaluation.
That is, it is
dependent on the number of steps
that you take for your projected
gradient ascent, your step
size, your optimiser, and more.
So, the stronger your
adversarial evaluation is,
the lower
your adversarial accuracy should be.
And we should always
be trying to evaluate our
networks, such that we obtain
the lowest adversarial accuracy possible.
The reason is that the lowest
adversarial accuracy is the
number closest to the
true specification satisfaction.
And that is the one thing we care about.
So, because of this importance,
I thought I should
give you a few heuristics
that I use to ensure
that my adversarial evaluation
is strong.
So, the first thing I
look at is the number of
steps for the projected gradient
ascent. It might not be
surprising, but the more steps
you take for your projected
gradient ascent, the closer you
are to maximising your
objective; that is, of course,
conditioned on your step size
being sufficiently small.
The second one might be a
slightly more subtle one,
which is the number of random
initialisations for the perturbations.
So, what I mean
here is that
you
randomly initialise the
perturbation before you start
taking projected gradient
ascent steps,
and we actually want to
try a number of different
random initialisations.
This is especially important
when it
comes to detecting a behaviour
called gradient obfuscation,
which is something I will go
into in a little bit
more detail later on.
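As a small sketch of this heuristic, assuming a PGD attack function like the one sketched earlier (with random initialisation inside), you can run it several times and keep the perturbation that achieves the highest loss:

```python
# Random-restarts heuristic: run the attack from several random
# initialisations and keep the perturbation with the highest loss.
# Assumes a pgd_perturbation(model, x, y) like the earlier sketch.
import torch
import torch.nn.functional as F

def best_of_restarts(model, x, y, num_restarts=5):
    best_delta, best_loss = None, -float("inf")
    for _ in range(num_restarts):
        delta = pgd_perturbation(model, x, y)  # random init happens inside
        with torch.no_grad():
            loss = F.cross_entropy(model(x + delta), y).item()
        if loss > best_loss:
            best_loss, best_delta = loss, delta
    return best_delta
```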
Another factor which I
also look at is the optimiser
that is used.
So, it is just
good to try out a few
different optimisers to ensure
that you always get the lowest
adversarial accuracy.
And another factor
which is also
quite important for detecting
gradient obfuscation
is using a
black box adversarial
evaluation method.
So, what I mean by black
box is that
we assume that we're
not given the weights of the
network.
The adversarial evaluation
which uses projected gradient
ascent is otherwise known as a
white box adversarial evaluation,
because we are given the weights
of the network.
So, the reason
I want to go into detail
about making sure your adversarial
evaluation is strong is because
we have seen the dangers of weak
adversarial evaluation.
So, these are two papers
published in 2018,
which actually
showed that weak adversarial
evaluation can give you a very
false sense of security.
So, what they did was
they took a lot of
the defenses published up until
then and then they tried their
new strong adversarial
evaluation on all of these defenses.
And surprisingly, the
stronger adversarial evaluation
broke many of the defenses,
causing their adversarial accuracy
to go to zero.
That is, many apart from adversarial training,
which is the one
you see on this line.
And this is possibly
one of the many reasons why
adversarial training is still
heavily used today.
Another benefit of stronger
adversarial evaluation is that
it gives you a true
evaluation of progress.
So, I want to highlight this paper
here, because what I loved about
it was that they did a
large scale evaluation
of the defenses
published up until then:
they took the
adversarial accuracy numbers
reported
in each paper and
compared them to what they got
under their own evaluation.
So, this work
is very cool in two senses.
Firstly, they have
evaluated all of these works
under a consistent set of
adversarial evaluations.
And secondly,
as you can see from the
drop in the adversarial accuracy
for many of them,
this set of
adversarial evaluations is much,
much stronger than what the
authors used in their own papers.
So,
now you can see that if we use a
stronger adversarial evaluation,
the adversarial accuracy
for CIFAR-10 is not even above
60%,
whereas if we just
take the numbers
reported in the papers as is,
maybe we're going to be under
the impression that it is almost
70% or above.
That's why it's extremely
important for us to
take care while we're doing
adversarial evaluation.
So, another reason why strong
adversarial evaluation is
important is because training
methods like adversarial
training are prone to an effect
called gradient obfuscation,
something that I've mentioned a
couple of times already.
So, here I will detail what
gradient obfuscation is.
To describe it, I want to go back
to the training objective for
adversarial training.
So, let's recall that the training objective
for adversarial training has
both an outer minimisation step
and an inner
maximisation step.
And here I just want to
focus on the inner maximisation.
We can
approximate this maximum
by doing projected gradient ascent,
similar to how we actually do
adversarial evaluation.
However, we note that there is a
little conundrum:
the more steps we take,
the closer we might get
to maximising the objective,
but
the more steps we take, the more
expensive adversarial training
becomes.
So, how can we make adversarial
training cheaper?
Well, maybe a
naive way of doing this would be
simply to do fewer steps of
gradient ascent; for example, I
could simply take two steps
of gradient ascent.
But actually, what happens
when you take too few
steps to maximise the objective
is that the network
loves to cheat by making a
highly nonlinear loss surface,
such that simply doing two
steps of gradient ascent does
not come even close to maximising
the objective you care about.
So, here is an example of a gradient
obfuscated surface.
What you see on this
plot, on the x
and y axes, is basically a
hyperplane cut through image
space, and we plot the loss at
every single point in this hyperplane.
And as you can see,
the loss is highly non-linear
over this small region
of image space.
Whereas if you do
adversarial training
correctly, what you should
actually expect is a much
smoother looking loss surface.
So, all of these dangers of
weak evaluation and gradient
obfuscation really pushed
people into thinking about
a different way we can
evaluate our algorithms,
and this is with
verification algorithms.
So, these algorithms
are very cool in the sense that
they are able to find a provable
guarantee that
no attack that
has ever been invented, or will
ever be invented,
can succeed in
changing the specification
satisfaction of the network.
So there are generally
two types of
verification algorithms.
The first type is a complete
verification algorithm.
What these algorithms
normally do is
often an exhaustive proof,
for example,
using mixed integer
programming, assuming that your
activation functions are ReLUs.
And what they do is they
either find a counterexample
or they find a proof that the
specification is satisfied.
But, unfortunately
these algorithms
are very difficult to scale
to deep neural networks.
So, rather people use incomplete
verification algorithms.
So, incomplete verification
algorithms are similar to
complete verification algorithms
in the sense that,
once a proof
is found,
it is also a
provable guarantee that the
specification is satisfied.
But the difference is
that a proof
cannot always be found,
even if
your neural network satisfies
the specification.
So, in other words,
incomplete verification
algorithms
give you a lower bound
on the specification satisfaction.
So, I want to go
into a little bit of detail
about these incomplete
verification algorithms,
starting with this illustrative
sketch of a neural network that
takes in an input x and gives
you an output y.
So, we generally make
two assumptions for verification.
The first one is
we assume that the input comes
from a bounded set denoted by
capital X here.
And the second assumption
we make is that our
neural network consists of
linear and activation layers.
Please note that your
convolutional layers can also be
cast as linear layers.
So, with these two assumptions,
what we can now do
is propagate
the bounds of your input set
through your linear and
activation layers sequentially,
until we get an output set,
denoted by capital Y here.
And once we have this output set,
we can simply see
if it lies on one
side of the decision boundary
or not.
However, the caveat here
really is that the true propagation,
the exact propagation of these
bounds, is in fact NP-hard.
So, what incomplete verification
algorithms instead try to do is
find a more scalable
way of propagating these bounds
that is as tight as possible
to the true sets that we
actually care about.
But what we lose
is that now,
instead of getting the true set,
we get an over-approximation
of the true set.
So, as an example
of such a bound propagation
technique, we can imagine that
if your input is lower and upper
bounded, we can simply compute
the lower and upper bounds after
the linear transformation, and
similarly, after that, we can
compute the lower and upper
bounds after an activation layer.
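As a minimal sketch of this kind of interval bound propagation (my own illustration, assuming elementwise bounds l <= x <= u on the input):

```python
# Minimal interval bound propagation through a linear layer and a ReLU,
# assuming elementwise lower/upper bounds l <= x <= u.
import torch

def linear_interval(l, u, W, b):
    """Propagate interval bounds through y = x @ W.T + b.
    Positive weights take the bound from the same side,
    negative weights from the opposite side."""
    W_pos, W_neg = W.clamp(min=0), W.clamp(max=0)
    new_l = l @ W_pos.T + u @ W_neg.T + b
    new_u = u @ W_pos.T + l @ W_neg.T + b
    return new_l, new_u

def relu_interval(l, u):
    """ReLU is monotonic, so it can be applied to each bound directly."""
    return l.clamp(min=0), u.clamp(min=0)
```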
But the problem with
these techniques is really that
if your bound propagation is too
loose, in the sense that the
over-approximation is too big
an approximation of your true set,
then your incomplete
verification may not mean
very much.
So, what do I mean by this?
Well, to see what
I mean, first of all, we need to
note that for incomplete
verification algorithms, we
only know
the over-approximated set;
we have no idea
what the true set is.
So, in the ideal case,
if your over-approximated set lies on
one side of the decision
boundary, then indeed we have
proven the true set lies on one
side of the decision boundary as
well.
That is to say, we found
that this satisfies
the specification.
However, if your
over-approximated set is too
large and it spans both sides of
the decision boundary,
then there is very little
we can say
about the set Y.
Of course, we can try to
close the gap and
distinguish between
these two cases by actually
doing projected gradient ascent,
like I've talked about before,
and here is a graph which
shows the difference between doing
such an empirical adversarial
evaluation and doing an
incomplete verification algorithm.
So, what this graph is
showing: on the x-axis,
you can picture this to be
the size of your input set,
capital X,
and on the y-axis is
the amount of
specification violation.
So remember, incomplete
verification algorithms give you
a lower bound on the
specification satisfaction, and
thus give you an upper bound on
the specification violation.
Whereas, on the other hand,
the empirical
adversarial evaluation gives you
a lower bound on the
specification violation.
So, the true specification violation
lies somewhere in between.
If our bound propagation techniques
for incomplete verification
algorithms are tighter,
the gap between these two
will be reduced,
and there is more we can
say about the true
specification satisfaction
of your neural network.
But if your bound
propagation techniques are too
loose and the gap
becomes larger and larger,
at some point there is very little
we can say about the
specification satisfaction.
So, today I've just touched on
the adversarial robustness
specification,
but all of the techniques
I've mentioned today
can be used for many other
specifications.
For example,
we can consider
semantic consistency
for an image classifier,
that is, maybe some
mistakes are more catastrophic
than others.
For example, for a
self-driving car, it might be
okay to mistake a cat for a dog,
because ultimately it doesn't
change the driving policy
very much.
But it is not okay to
mistake it for a car.
Or for a dynamical
systems predictor,
we can be looking at
laws of physics,
such as energy conservation.
So, hopefully what
I have done today is given you a
rough outline of how you can
train your neural networks to
satisfy specifications
and, more importantly,
how to evaluate
how well your neural network
satisfies these specifications.
But most importantly,
I very much hope
I have motivated
everyone to think about why
looking into this is important.
And this concludes my talk.
Now I want to pass on to Iason,
who will give you a more
detailed overview of the ethical
implications of machine learning
algorithms
and, more importantly,
how we can be thinking about
deploying and designing these
algorithms
to be beneficial to society.
I'd like to start by
thanking Chongli for her
fantastic exposition of some of
the key challenges that arise
from building algorithms that
are safe, robust, and fair.
In this section of the talk,
we'll focus more directly on the
question of responsibility
and what it means to deploy these
technologies successfully in
real world settings.
However, before we get started,
I'd like to reintroduce myself quickly.
My name is Iason Gabriel and
I've been working at DeepMind as
a research scientist in the
ethics research team
for three years.
Before joining DeepMind,
I used to teach at a university
where my work centered on moral
philosophy and practical ethics,
including questions about global
poverty and human rights.
At DeepMind, our team explores
questions that arise in the
context of ethics and
artificial intelligence,
some of which we'll look at
in the course of the next hour.
So, if we begin with
the topic of ethics
and machine learning,
we immediately
encounter questions including
what is ethics and
why does it matter?
And how does ethics
connect with machine learning?
I'd like to take these
questions in turn.
Ethics is a field of inquiry that's
concerned with identifying the
right course of action, with what
we ought to do.
It's centrally concerned with the equal value
and importance of human life
and with understanding
what it means
to live well
in a way that does
not harm other human beings,
sentient life,
or the natural world.
According to our
everyday judgment,
some actions are good,
some are acceptable
and some are prohibited altogether.
Understood in this sense,
ethics is interested in
identifying what
we owe to each other
and how we ought to act,
even in challenging situations.
These situations can arise in our
personal or professional lives.
However, they also arise in the
context of machine learning research.
What I'd like to
suggest is that far from being
outside the domain of ethical
evaluation, technologists and
researchers are making ethical
choices all the time.
And many of these choices
deserve closer consideration.
As Chongli noted,
a good place to start is with
the training data we use to
build machine learning systems.
In particular, we need to
appreciate that data is not only
a resource,
but also something
that has ethical properties and
raises ethical questions.
For example,
has the data been collected
with the consent of
those who are represented?
We cannot take this for granted.
Of course, there are high-profile
cases of data being collected
without people's consent,
such as Cambridge Analytica,
but it's also a common challenge for
major datasets used to train
image recognition systems that
often use pictures of
celebrities or simply images
taken from the internet.
Second,
who or what
is represented in the data?
Is the data
sufficiently diverse or does it
focus on certain groups to the
detriment of others?
If we train a model on this data,
will it perform well for people of
different genders,
nationalities,
or ethnic backgrounds,
or might it fail
when applied
to these groups in significant ways?
Thirdly, how
has the data been labelled and
curated?
Does it contain
prejudicial associations?
As Kate Crawford
and Trevor Paglen
have demonstrated
in their work
on excavating
artificial intelligence,
in the early days
of ImageNet,
it contained a
'Persons' class that assigned
pejorative labels to a variety
of images of real people.
This is also a problem for historical
data that's drawn from specific
social contexts.
Regardless of
how that data is labelled,
it may contain associations
that are a reflection
of human prejudice
and discrimination.
These challenges,
which arise early on
in the machine learning
pipeline, have come to have a
real-world impact,
a phenomenon
that's most commonly referred to
by researchers
as the problem of
algorithmic bias.
Indeed, while these
technologies have great
potential,
recent evidence
suggests that far from making
things better,
software used to
make decisions and allocate
opportunities
has often mirrored
the values and biases of
its creators,
extending discrimination
into new domains.
These include the domain of
criminal justice,
where a program used for parole
decisions mistakenly identified
more black defendants as
high risk than people
in other racial categories,
compounding entrenched patterns
of racial
discrimination within the
criminal justice system.
It's also been seen
with job search tools,
which have been shown to
offer highly paid jobs or
advertisements for highly paid jobs
to men over women by a
significant margin,
sometimes by a ratio
of up to six to one.
It's also a problem that's been
noted for image recognition
software, which has been shown
to work less well for minorities
and disadvantaged groups.
And lastly,
it's been something that
we've encountered in the domain
of medical tools and services,
which have been shown to perform
markedly worse for people with
intersectional identities,
something that could mean that
they have unequal access to
lifesaving services and medication.
Faced with this
mounting body of evidence,
we cannot rely
on good intentions alone.
There is clearly an
important body of work to be
undertaken to address these failings.
It is also clear,
to return to a point that Chongli
made earlier, that those who
design and develop these technologies
are in a position of power.
So, what is power?
In this context,
I think it's best
understood as the ability to
influence states of affairs
and, more importantly,
to shape the lives of other people.
More precisely,
those who develop new
technologies shape the world,
creating new opportunities,
foreclosing others
and shaping the path
that humanity is likely to take.
This can be seen with
major inventions throughout history,
such as the steam engine
or electricity.
Artificial intelligence is now
also starting to have profound effects,
some of them positive,
and some of them
more challenging.
With this power
comes responsibility.
However, the question then arises:
responsibility to what?
Chongli has already shown us
what is possible
from a research perspective.
Clearly there are
certain things that we can do
and that we may well be required
to do when building ML systems.
However, I'd like to focus on
the question of responsibility
and see if we can develop our
understanding of what is
required a little further.
At a collective level,
I believe that
an understanding of the
relationship between power and
responsibility should lead us to
reflect more deeply on the
question of what it means to do
machine learning well.
After all,
the activity of scientific research
is not value neutral,
rather it is a social practice
that human beings engage in,
that's governed by shared norms
that change over time.
Some of these norms
are epistemic,
for example, ideas about the need
for replication in order to
confirm scientific findings.
Others are normative or moral,
for example, about the
acceptability of doing research
on people without their consent.
Ultimately,
the way we
structure this practice,
including the way we think about
what it means
to do good research,
has profound social effects.
These questions about
appropriate norms and
standards for research
are particularly important when the
stakes are high and when there's
uncertainty about the overall impact.
In response to the
unique challenges posed
by nanotechnology,
governments and
civil society groups came
together to develop a shared
understanding of what they term
'Responsible Innovation'.
These groups characterise
responsible innovation
as a transparent and
iterative process by which
societal actors and
technologists
become mutually responsive
to each other's needs
to ensure the ethical and social
value of the scientific endeavour.
This paradigm points
towards the important idea
that good science is itself based
upon alignment with the
social good
and with democratic processes,
a point that I'll
return to later on.
At the same time,
the paradigm has been
criticised for its vagueness.
What precisely are researchers
responsible for?
What should they do?
And how does this apply
to the realm of machine learning?
These are questions
that I shall now try to answer
more concretely
as we turn to
principles and processes
for thinking about the ethics of
artificial intelligence.
So, as AI researchers,
I believe that
we share in responsibility for
at least two things.
First,
we are responsible for intrinsic
features of this technology
so that we build the very best
systems that are sensitive to
ethical and social
considerations
and are designed in ways
that limit the risk of harm.
Secondly,
we bear some
responsibility for extrinsic
factors that help determine
whether it's designed, deployed
and used wisely
in ways that produce
beneficial outcomes.
Both elements are necessary.
Robust and secure
technologies can be used in
harmful ways by bad actors
and faulty technologies can be
problematic even if they're
deployed and used responsibly.
In terms of the content
of these obligations,
what I've termed
the 'responsibility for what' question,
there are now a
multitude of AI principles that,
broadly speaking, aim to align
machine learning research and
systems with the social good.
This includes offerings by the
European Union, the OECD,
the Beijing Academy of AI
and by the Future of Life Institute.
Indeed, one recent study found that
there are at least 84 different
ethical codes that have been
proposed for AI.
Fortunately,
these principles coalesce
around certain key themes
such as fairness,
privacy, transparency,
and non-malfeasance.
The last condition,
which is sometimes
characterised as 'do no harm',
leads us to focus on the
affirmation of individual rights.
This includes things
like respecting the requirement
for informed consent
and equal recognition
before the law.
Lastly,
I believe we should try
and develop
artificial intelligence
in ways that satisfy
the collective claim to
benefit from scientific discovery.
Interestingly, this
is also found in the United Nations'
Universal Declaration of Human Rights,
which establishes a
right of all humanity to share
in scientific advancement
and its benefits.
Yet, even with this
understanding of the key values
in place,
certain gaps appear to remain.
First and foremost,
how do we move
from these statements
of principle to clear
and robust processes
for evaluation?
We know that good
intentions are not enough,
but how do we take abstract moral
ideals and turn them into these
concrete processes and procedures?
How should we
balance and weigh different
ethical principles against each other?
And how should we deal
with the fact that machine
learning research is often
highly theoretical and general,
meaning that there's uncertainty
about how it will ultimately be used?
One answer is that we need
tools that help us put
principles into practice.
In what follows,
I describe a five step
process that technologists
can use to evaluate our research
and help make sure that it's
ethically and socially aligned.
While this process does not aim
to capture the entirety of AI ethics,
my hope is that it can
be of use to many of us who are
doing practical work
designing and building algorithms.
This framework is particularly
helpful in relation to the
values previously discussed.
I will now run through the process
briefly and then return to
consider each stage
in more detail.
So, the first question we
need to think about when
building a new technology
is whether it has a socially
beneficial use.
Is there actually a reason
to develop the thing
that we want to develop?
Now most technologies do have
socially beneficial uses
of one kind or another,
but if we're unclear
about what the value is
that we aim to unlock,
that's typically a red flag
that should
immediately force us to
go back and reconsider
what it is that we want to do.
By getting this social purpose
clear in our minds,
it's often true that we can create
better versions of the
technology that we had in mind.
Once we have this social
purpose in mind,
we then need to
turn to think about the risk
of direct or indirect harm
that the project brings with it.
So, here again it's typically true that
most technologies bring with
them some risk of harm
and the important thing
is to try and map out these risks clearly
so we know what we're contending with
at every juncture.
Then, with an idea
of what the benefit is
and the risks that the
project brings with it,
we should turn to think about mitigation.
Are there steps we
can put in place that will
reduce the risks
or eliminate them entirely?
Then, with this plan in place,
we can finally turn to the
first evaluative stage.
So, now we have the best
version of the technology or
piece of research in mind
and we can ask,
with these measures in place,
does the proposed action
or research violate a red line
or a moral constraint?
Does it push up against
some threshold or barrier
which comprises the
sort of things that we just
ought not to do?
Finally, on the
assumption that we haven't hit
one of these hard constraints,
we have the question of whether
the benefits
outweigh the risks from
an ethical point of view.
And at this point
it's often sensible
not just to focus on the
specific project at hand
but also to consider other options
that are available to us.
Given that we've been afforded this
opportunity to do research,
is this really the project that looks like
it will add value to the world?
Okay.
So, now we'll take a moment
to look at these questions
in more detail.
So, what is a
socially beneficial use of
technology?
What does this mean?
Well, I think that a technology
can be socially beneficial
in a variety of ways.
To start with,
it could contribute to human
health or wellbeing,
so it could make us physically more healthy
or better off in some way.
Technology could also enhance
our autonomy or freedom
if it empowered us
to act in a way
that fulfills our
own goals or outcomes,
for example,
by giving us useful information
or by helping us be more discerning
when it comes to understanding
the world around us.
Technologies can also help
produce fairer outcomes if
they're well-designed and
calibrated
and developed in a
way that includes the voices of
people who are affected.
They can contribute to public
institutions such as healthcare
or education
and these systems
can be used to address global
challenges
such as climate change.
Of course, there are certain
gold-standard areas
of ethically impactful work,
such as medicine or work to
address climate change.
But it's also okay to
focus on more prosaic
goals and aspirations,
for example, to try
and create technology that
brings people enjoyment or gives
them more time to do other things.
Let's take an example of
a technology developed by DeepMind:
WaveNet,
an algorithm that we created to
help produce better quality
synthetic audio.
I'd say that one of the
socially beneficial uses of that
was to help
visually impaired
or illiterate people
access digital services
more effectively
through voice interactions.
Secondly, we then
have to consider this question of harm.
So, what sort of things
fall into this category?
Well, here,
what we see is that the harms
are often the inverse of
the benefits
that we might try and unlock.
So, instead of improving
human health or wellbeing,
technology might
undermine human health
or wellbeing,
including potentially mental health.
It might restrict
human freedom or autonomy,
something that comes to the fore
if we think about the challenge
posed by addictive content.
It might lead to unfair treatment
or outcomes, as we saw in the
case of algorithmic bias.
It might harm public institutions
or civic culture
and it might infringe human rights.
So, if this is the case, it's something
that we really would need
to be mindful of.
Returning again to
the case of WaveNet,
we understood that that research
carried with it certain risks
including voice mimicry and deception,
if individuals used
it to copy each
other's voices,
and it could also potentially
erode public
trust in the fidelity
of audio recordings.
So, once you have an
understanding of these risks,
the question is what can you do
about them?
Is it possible to
mitigate these risks or to
eliminate them entirely?
And in that regard,
there's a number of
significant things we can
think about.
The first is whether it's
possible to control the release
of technologies
or the flow of information.
So, often technologies
only have harmful
outcomes if they fall into the
hands of people who have a
malicious intention or purpose
and we need to think about
whether there's ways to prevent
that from happening.
We might also wonder
if there are technical
solutions and countermeasures
that we can use to make our
technology harder to misuse,
something that Chongli really
delved into.
In the context
of WaveNet, we can think about
things like watermarking,
so that we always know
the provenance of a
specific piece of synthetic
audio, or detection methods.
We also have the opportunity
to work with
public organisations
and the public
to communicate certain
messages around technology.
So, sometimes if a new risk has been
created, it's very important to
make people aware of this
although that often isn't
sufficient to discharge our
moral responsibility.
So, finally we may
seek out policy solutions
and legal frameworks that help
contain the risk
and in the case
of the challenges raised by
synthetic media,
civil society groups
have been really, really
important in terms of
coming up
with a united agenda
that helps make sure that these
technologies are used in a safe
and responsible way.
So, once we have our mitigation plan
in place, we have this question
of whether the proposed action
violates a red line
or moral constraint.
These constraints
are sometimes referred to in
moral philosophy as deontological
constraints; they're made up of a set of
rights and duties that mark out
the sorts of things that we
should not do.
For example,
it would be deeply problematic
to develop technologies that
contravene consent protocols
that infringe on people's
personal space in ways that they
haven't consented to.
Similarly, in the domain of
artificial intelligence
there's a concern
about lethal autonomous weapons
and what happens when human
decision makers delegate these
fundamental decisions to machines.
An international movement
has also developed around
legislating in this area,
containing that risk,
and ensuring that this technology
isn't used in a way that
intentionally harms
or injures human beings.
Thirdly, we might
be concerned about certain forms
of surveillance that could also
have a corroding effect on
public trust
and lead to people
being victimised and
targeted in certain situations.
So, that would be a really
important avenue to be mindful of.
And then we have
technologies that potentially
infringe human rights or
international law.
And basically, if there's
any risk that the technology
will be used in a way that
contravenes one of these
fundamental purposes
or one of these
fundamental principles,
that again is a red flag which
really should encourage
us to go back to the
drawing board and ask the
earlier question: what does a
safe, beneficial, and productive
version of this technology look like?
However, on the assumption
that we haven't encountered one
of those hard constraints,
we still have this final question,
which is: with these measures in
place, do the benefits of proceeding
outweigh the risks of doing so?
And as I mentioned earlier,
when we think about
machine learning research,
it's a tremendous opportunity to do
something really beneficial and
we've actually been afforded a
great opportunity typically by
universities, by research institutions,
to try and use our
talents in a way that's
genuinely helpful.
So we should ask,
is this project something
that we really feel will deliver
the kind of social return that
we care about and that has the
potential to make people better
off in practice?
However, even when we've
gone through this process
and come to the
conclusion that we have got a
good way of proceeding,
it's still worth pausing to
evaluate our findings
and to conduct two further tests.
These tests are
particularly helpful because
they can help us address the
problem of motivated cognition,
i.e. the widespread problem of
unconsciously endorsing
arguments that support
our own interests or our
preferred course of action.
So, first I'd ask have you thought
about all the people who could
be affected by your action?
More precisely,
have you identified these groups?
Do you know who they are?
Have you considered
what would happen if you were
trying to explain your reasoning
or your decision to them
and have you directly sought out
their advice and input?
This test is important
for a number of reasons.
To begin with,
those who are affected by new
technologies typically have a
right to be included in
decisions that affect them.
Moreover, even a process of
imaginary and sympathetic
dialogue can help us guard
against error.
If we would really struggle
to explain our actions
to someone who is adversely
affected by the technology,
then this is often a good reason to
revisit our conclusions and work
hard to identify a solution that
could pass that test.
Secondly,
I think we should ask whether
we've thought seriously about
how our decision might be viewed
in the future.
Is it something that we might
have reason to regret?
Here, we can imagine
someone in the future,
perhaps even our own children
or grandchildren,
asking us why we
chose to act in the way we did.
Why, for example,
did we fly to
so many conferences
when we knew
about climate change and the
harmful impact we were having?
How might it feel if we were to
discover that our technology was
subsequently used in a way that
violated human rights?
If these questions make us feel
uncomfortable, then we have
reason to introspect and
identify that source of discomfort.
Moreover, we should
typically adjust our behavior to
act in ways that minimise the
likelihood of future justified regret.
So now we've had a
chance to look at the
responsibilities of machine
learning researchers and at a
process for evaluating our
decisions and choices.
However, it's also worth pausing to
reconsider where we are now as a
field and what the path ahead
might look like.
The field of machine learning
is changing rapidly,
both in terms of
technical developments and
breakthroughs
and also in terms
of ethical norms and standards.
Increasingly, I believe there is
recognition of the following
key ideas.
First, those who design
and develop these technologies
have a responsibility to think
about how they will be used.
This responsibility stems from the
fact that the technology is
powerful and from the fact that
it has moral consequences that
we can observe
and that we're in
a position to affect.
Secondly,
there are concrete steps
and processes that we can put
in place to make sure that this
responsibility is
successfully discharged.
These include processes
like the ones we've
considered today,
that help promote the good
while also
respecting important constraints.
Thirdly,
while it's not possible to
know the consequences of
all our actions,
we are responsible
for what we
can reasonably foresee
and should take steps that are
designed to
bring about positive outcomes
even when this means incurring
certain costs.
Given the power
of machine learning technology
and the attendant
responsibilities of people
working in this field,
good intentions are not enough.
We have an obligation
to try and
understand the impact
that our actions will have
on other people and to act
conscientiously
in light of that information.
Beyond this, we can
pause and ask about the path ahead.
So in this regard, I see
three exciting developments that
we are all part of at the
present moment.
So, to start with,
there's an important new
research agenda that's
developing in this space and
includes a critical focus on
areas such as AI safety,
robustness, fairness,
and accountability.
This technical work
to improve the moral
properties of machine learning
systems requires constant
vigilance and effort,
but as Chongli has shown us,
there are things that we can do
and that we can change to build better
machine learning systems.
Secondly,
we're starting to see
the emergence of new norms and
standards that point towards a
different understanding of what
it means to do machine learning well.
On this view, what is
needed is not only technical
excellence
but also for research
to be done in the right way and
for the right reasons,
limiting the risk of harm
while also
working to create technologies
that benefit everyone.
Finally,
we're seeing the emergence of
new practices which aim to
promote responsible innovation
in machine learning.
These include the
release of model cards that
explain the intended uses and
properties of ML systems;
a proposal from top research labs
to create bias bounties
aimed at discovering bias
in the models and datasets that we use;
and a new requirement
from the machine
learning conference NeurIPS,
which asks all researchers to
consider the ethics and social
impact of their submissions
and to detail this when writing papers.
While very significant
challenges still remain,
I think that these
developments are a
good sign for the future.
With the requisite degree of
effort, reflection and
conscientious endeavour,
they will help ensure that machine
learning, as a field,
stays on the right track
and continues to be
an area of study
that we can all be proud of.
Thank you.
