[MUSIC PLAYING]
[APPLAUSE]
JUAN SILVEIRA: Hello, everyone.
My name is Juan Silveira.
I'm a software engineer in
the DeepMind Health team,
and I'm here to tell you a
little bit about DeepMind
and the work that
we've been doing.
So let me start by
telling you who we are.
So DeepMind is a British
artificial intelligence company
that was founded in
2010 and joined forces
with Google in 2014.
We're a bit of an unusual
company in that we
have a two-part mission.
The first part of
that mission is
to create general
artificial intelligence.
But what do we mean by general
artificial intelligence?
And the key here is being able
to create agents, algorithms
that can work and do well
in a variety of tasks
and environments.
And the key to
that is to get them
to learn from their
own experience,
so not to preprogram
them with behaviors
but to actually allow them
to learn from what they
can see in their environment.
So that's the first
part of our mission.
But we also have a
second part, which
is to use artificial general
intelligence to address
pressing social challenges.
So we recognize the
potential of AGI
to really make a
change in the world.
We want to make sure that the
benefits of that reach everyone
and that we can use
it to address things
that we all, as a society,
have to deal with.
So how are we doing that?
So that's our mission,
but how are we actually organized?
We're at heart a
research organization.
We're based in London,
in Kings Cross,
and we have more than 300
of the best AI research
scientists in the world,
all here in Europe.
Like any research organization,
we do our research in the open.
So we've actually published over
100 papers in different research
venues.
And we joined forces
with Google in 2014,
which meant that we could
really take our pace up
to the next level.
And the mixture of
these two things
means that we're a bit
of a hybrid organization.
On the one hand, we have
the very long-term focus
of academia because AGI
is not something that's
going to happen anytime soon,
but we're very, very focused
on making that happen.
But at the same time, we
have the pace, the scale,
and the agility of a
very well-funded startup.
And because of the second
part of our mission,
we also have a lot of
people in our organization
that have the kind of
social impact values
that you only find generally
in the public sector.
Now, I talked about
AGI, artificial general
intelligence.
But how is it that we're
trying to achieve this?
And we're trying to achieve it
by building general-purpose
learning systems.
And this, just to
give you an idea,
involves creating agents that
interact with an environment.
So these agents get signals
and make observations
of the environment
and decide on actions
to take that then get
reflected in that environment.
And once they've
taken those actions,
they can actually observe again
how it affects the environment.
But the key thing is that they
have a goal, so a function that
tells them whether
they're closer
to actually achieving what it
is that they're trying to do.
And this whole process is called
Deep Reinforcement Learning.
I'm sure some of you
have heard about it.
And it's key to what
we're doing at DeepMind.
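That observe-act-goal loop can be sketched in code. Everything below, the toy environment, the placeholder agent, and the reward rule, is an illustrative stand-in, not DeepMind's actual system:

```python
import random

class ToyEnvironment:
    """A stand-in environment: the agent tries to move a cursor to position 3."""

    def __init__(self):
        self.position = 0

    def observe(self):
        # The signal the agent receives from the environment.
        return self.position

    def step(self, action):
        # The agent's action is reflected in the environment...
        self.position += 1 if action == "right" else -1
        # ...and a goal function (the reward) tells the agent whether it is
        # closer to achieving what it is trying to do.
        reward = 1.0 if self.position == 3 else 0.0
        return self.observe(), reward

def agent_policy(observation):
    # Placeholder decision-making; in deep reinforcement learning this
    # would be a neural network mapping observations to actions.
    return random.choice(["left", "right"])

env = ToyEnvironment()
total_reward = 0.0
for _ in range(100):
    action = agent_policy(env.observe())
    _, reward = env.step(action)
    total_reward += reward
```

The agent never sees the environment's internals, only observations, actions, and the reward signal.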
The environment that
we use for this is key,
and we've actually
had quite a bit
of success using the old
Atari games as a testbed,
as an environment in
which to train agents.
So we've taken over a hundred
of the classic Atari games,
so this includes things like
Pong, Frogger, Montezuma's
Revenge, Breakout,
some games that I'm sure
many of you have played.
And we've created an environment
in which what the agent sees,
the inputs are just the
pixels on the screen.
So it just gets that
matrix of pixel RGB values,
exactly what you would
see on the screen.
And the outputs are the
controllers in the Atari game,
which are very simple controls.
So you could go up, down, left,
right, and that trigger button.
But the really important
thing is that the agent is not
told what to do.
So it's not told the rules
of the different games.
It's not told how it works,
the internal functioning
of the Atari game.
So any knowledge needs to
be learned from scratch,
with zero preprogramming,
no strategies for how
to play the different games.
What we give it is the goal, and
that is to maximize the score.
But the key thing that
I think is fantastic
when I first learned about
this was that one agent plays
all of the games.
So you don't train one
agent for each game.
You train one agent
in all of the games
and have it play
all of the games.
So a good analogy
of how this works
is that you could have a
little robot in an arcade
sitting in front of the
game, looking at the screen,
and moving the controller,
and that's all.
It doesn't have any knowledge of
the internals of how it works.
So let me show
you what this looks
like as the agent is being
trained and how it evolves.
So this is the game of Breakout.
I'm sure many of you have
played-- how many of you
have played Breakout before?
Great.
I won't explain it then.
So the agent starts training.
After half an hour, it doesn't
really know what's happening,
so it just loses quite quickly.
Sometimes it manages to
bounce the ball back,
but mostly just
loses quite quickly.
After an hour of training,
it gets a little bit better.
It maybe survives two
bounces, but that's about it.
It actually still
loses quite quickly.
But after two hours,
it's now playing about
how you would probably do if
you started playing Breakout
right now.
So it runs around,
bounces a ball back,
and manages a basic
level of competence.
But the great thing is
what happened unexpectedly
after we let it
play for four hours,
and the agent
started doing this.
So for those of you
who have actually
played Breakout
for some time, you
know that this is
the winning strategy.
You break the sides and you
push the ball at the top,
and it just does
the work for you.
But this is not
something that was
preprogrammed on the system.
It's not something that
we told the agent about.
It's something that the agent
discovered by playing the game
thousands and thousands of
times with this goal function
of improving the score.
And this is the power of
Deep Reinforcement Learning,
that it allows this kind
of behavior to emerge.
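That training progression, random flailing at first, basic competence after a few hours, and eventually a discovered strategy, can be reproduced in miniature with tabular Q-learning, a simple relative of the deep RL used on Atari. The corridor task below is invented purely for illustration:

```python
import random

random.seed(0)

# A corridor of 6 states; the agent starts at 0 and is rewarded at state 5.
N_STATES, GOAL = 6, 5
ACTIONS = [-1, +1]  # move left or right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def env_step(state, action):
    next_state = min(max(state + action, 0), N_STATES - 1)
    return next_state, (1.0 if next_state == GOAL else 0.0)

def greedy(state):
    # Pick the highest-valued action, breaking ties randomly.
    return max(ACTIONS, key=lambda a: (Q[(state, a)], random.random()))

# Training: explore occasionally, update Q toward the observed returns.
for episode in range(200):
    state = 0
    while state != GOAL:
        action = random.choice(ACTIONS) if random.random() < 0.1 else greedy(state)
        next_state, reward = env_step(state, action)
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += 0.5 * (reward + 0.9 * best_next - Q[(state, action)])
        state = next_state

# After training, the learned policy walks straight to the goal in 5 steps,
# a "strategy" that nobody preprogrammed.
state, steps = 0, 0
while state != GOAL:
    state, _ = env_step(state, greedy(state))
    steps += 1
```

Early episodes wander at random; later episodes head straight for the reward, the same shape of improvement the Breakout agent shows over hours of play.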
So we used the Atari games
for a long time for research
and we're still
using them today.
And we published papers
about this research,
and it was quite successful.
But a few years ago,
a team at DeepMind
started looking
for a new challenge.
And that challenge
was the game of Go.
Let's see if I can get it.
So for those of you
that don't know Go,
it's a very ancient
game, originated in China
about 3,000 years ago.
And it's considered by many
to be more than just a game.
People think about it as
having a spiritual dimension,
as poetry or art.
It was considered to
be one of the four arts
that a true scholar was
expected to master.
It is hugely popular in many
places, especially in Asia.
There's 40 million players,
2,000 professional players,
and there's Go schools
throughout Japan, South Korea,
and China.
And the thing about Go,
why it's a great challenge
is because it has
very simple rules.
You could learn the rules of
Go in a couple of minutes,
but from those rules
actually arises
a huge, profound complexity.
Just to give you an
idea of the scale,
it's estimated
that there are about 10
to the 80 atoms
in the universe,
but there are 10 to
the 170 possible board
configurations in Go.
So it's impossible to
play Go well by brute force,
by trying to explore all
of that possibility space
without a good strategy.
And that means that
Go has been one
of the great challenges
for AI because no computer
program could beat a human
professional Go player.
If you compare this to
chess, which is something
that computer programs have been
doing very well at for a while,
the branching
factor, the number
of possible moves at any
given time, is 20 in chess,
and in Go it's 200.
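Those numbers are easy to check with a quick back-of-the-envelope calculation. Each of the 361 points on a 19-by-19 board can be empty, black, or white, which gives a simple upper bound; the roughly 10 to the 170 figure quoted above counts only the legal positions among these:

```python
# Upper bound on Go board configurations: 3 states per point, 361 points.
upper_bound = 3 ** 361
digits = len(str(upper_bound))  # number of decimal digits, i.e. ~10^(digits - 1)

# Estimated atoms in the observable universe: about 10^80.
ATOMS_EXPONENT = 80

# The bound is roughly 10^172 -- vastly more than atoms in the universe.
exponent_gap = (digits - 1) - ATOMS_EXPONENT
```

Even this loose bound exceeds the number of atoms in the universe by over 90 orders of magnitude, which is why exhaustive search is hopeless.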
So a team at DeepMind
started creating
an artificial intelligence
program called AlphaGo.
And AlphaGo is actually
composed of two neural networks.
So you have the Policy
Network and the Value Network.
And the Policy
Network was initially
trained using human games,
so we took thousands of games
from an online Go Server
that we had access to,
and trained this network
by watching those games
and then trained it further
by playing against itself.
And the objective of
the Policy Network
is, given any
particular position,
a particular
configuration of the board,
to say which moves are
most likely to be played next,
and that allows you to narrow
down that branching factor.
The other network, the Value
Network, takes the configuration
of the board, and what
it tries to estimate
is how likely white
or black is to win.
So instead of having
to actually play out
the entirety of the game
to try to figure out
who is most likely
to win, this allows
us to cut down how deep you
need to go in order to explore
the possibilities of the game.
So a combination of
these two networks
is what made
AlphaGo so powerful.
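The division of labor between the two networks can be sketched with a toy search. Here the "policy" keeps only the top two candidate moves (cutting the breadth) and the "value" scores positions at the search horizon (cutting the depth). The number game, the move ranking, and the value function are all invented stand-ins for the real neural networks:

```python
TARGET = 21  # toy game: reach a running total as close to 21 as possible

def policy(state):
    # Stand-in policy network: rank the candidate moves (add 1..5) and keep
    # only the two most promising, narrowing the branching factor.
    return sorted(range(1, 6), key=lambda m: abs(TARGET - (state + m)))[:2]

def value(state):
    # Stand-in value network: estimate how good a position is without
    # playing the game out to the end.
    return -abs(TARGET - state)

def search(state, depth):
    if depth == 0:
        return value(state)  # cut the depth: ask the value estimate
    return max(search(state + m, depth - 1)
               for m in policy(state))  # cut the breadth: top moves only

def best_move(state, depth=3):
    return max(policy(state), key=lambda m: search(state + m, depth - 1))
```

From a total of 15 with three moves of lookahead, the search prefers adding 4 (then 1 and 1 lands exactly on 21) over the greedy-looking 5 — the two estimates together find a line that neither breadth nor depth alone would make tractable at Go's scale.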
But let me tell you a little
bit how the progression of this
went.
After creating this
network and training it,
the AlphaGo team
actually managed
to get a 100% success rate
against the best computer
programs, but we knew
that those programs were not
as strong as even a level-one
professional Go player.
So we got a professional
Go player, Fan Hui,
who is the three-time European
Go champion and a 2-dan
professional player.
And he came to the
DeepMind office
to play a five-game
match against AlphaGo.
And AlphaGo actually won 5-0,
which was completely unexpected
because it was the first
time a computer program could
beat a professional player.
And after that, we wanted
to see how much further it
could actually go.
So we actually scheduled a
match in Korea, in Seoul,
against Lee Sedol.
And Lee Sedol, for those
that don't know Go,
is actually a legend
in the world of Go.
He has won more
international titles than,
I think, all but
one other player
and is considered by
many to be the best
Go player of the last decade.
So the AlphaGo
team went to Korea
and played in this match,
which was a huge event.
It was everywhere in the
newspapers and the TV.
It was watched by more people
than actually watched the Super
Bowl in the US.
And the fantastic thing to
me about these matches--
once again, five matches--
it's not just the end
result, but the fact
that in the second game, AlphaGo
made a move, Move 37, that
had a 1 in 10,000
chance of being played,
according to the Policy Network.
So it was a very, very
unlikely move, and it took
the commentators by surprise.
Many thought it was
actually a mistake,
but it actually turned
the game. Fan Hui,
commentating on that game,
said that it was actually a
beautiful, beautiful move.
And then in game 4, the
same thing happened,
but this time Lee Sedol
made a move
that AlphaGo didn't expect,
because the Policy Network
said it had a 1 in 10,000
chance of being played.
And people were
talking about that move
as a touch of God, a beautiful,
beautiful move again.
So the fantastic
thing about these
matches to me is not so much
the result, but the fact
that the computer and the human
professional Go player, playing
together, actually reached
new levels of play
and could come up
with these beautiful games.
After this, we
actually held further
Go events, like the Future
of Go Summit,
precisely to explore this,
to explore how AlphaGo and human
players could play together,
to bring the art of
Go to the next level.
So both the Atari work
and AlphaGo
were done in the
open, like I mentioned,
so we published papers
for both of these
and many other things.
We're very lucky that
those two papers actually
got onto the cover of "Nature"
within 18 months of each other,
which was a first for
an artificial intelligence
research organization.
And let me tell you
about some of the things
that we've done since then.
So these are two
projects I think
I'm particularly excited about.
So one of them was
lip reading sentences,
which, once again, managed to
push the boundary from being
able to recognize
particular words
from a limited vocabulary to
getting good results on
unbounded natural language,
so being able to
recognize any speech.
And there's also WaveNet.
WaveNet, if you've
never heard of it,
I really encourage
you to go and see
the blog post on
the DeepMind website,
where you can
actually hear samples
that were generated by WaveNet
and are incredibly natural.
It really changed the way
that we do text to speech,
because up to now,
text to speech worked
by taking recordings
of a professional actor
saying particular sounds
and then stitching them
back together to form a phrase.
But WaveNet actually generates
the raw audio itself.
If you know how digital
audio is encoded,
that's 16,000 samples per second.
And what WaveNet,
the neural network, does
is generate each
of those samples
one step at a time,
so it actually
generates the audio wave.
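That sample-at-a-time generation is an autoregressive loop: each new audio sample is predicted from the samples already emitted. The toy "model" below, a sine tone plus a decaying echo, is just a stand-in for the actual WaveNet network:

```python
import math

SAMPLE_RATE = 16000  # samples per second, as mentioned above
CONTEXT = 4          # how many past samples the toy model conditions on

def toy_model(history):
    # Stand-in for the neural network: predict the next sample from
    # the previous CONTEXT samples.
    echo = sum(history[-CONTEXT:]) * 0.02
    tone = 0.5 * math.sin(2 * math.pi * 440 * len(history) / SAMPLE_RATE)
    return tone + echo

def generate(n_samples):
    # Autoregressive generation: one sample per step, each conditioned on
    # everything generated so far. This is what "generating the audio
    # wave" directly means, versus stitching recorded fragments together.
    samples = [0.0] * CONTEXT  # seed with silence
    for _ in range(n_samples):
        samples.append(toy_model(samples))
    return samples[CONTEXT:]

audio = generate(SAMPLE_RATE // 100)  # 10 milliseconds of toy audio
```

The real network is vastly more capable, but the generation loop has exactly this shape: 16,000 predictions for every second of audio.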
And the amazing thing
is that it managed
to cut the gap between the
best text-to-speech systems
available and a
human voice by 50%.
So we're halfway
to reaching the naturalness
of a human speaking,
generating natural speech.
And the great
thing about this is
that I think it really shows
how these kinds of algorithms,
how reinforcement learning, can
be adapted to new environments,
new domains, really easily,
because they're more generic,
because they're a step closer
to general intelligence.
So that is the work that our
research arm has been doing,
but let me tell you also
about how we're applying this.
So this is a photo of
our Google Data Center,
and in particular it's the
cooling area of the Google Data
Center.
Those very colorful
pipes are actually color-coded
for the different water
flows to different areas.
So as you can imagine,
our Google Data Center
is already a very, very
optimized data center.
It's some of the most efficient
data centers in the industry.
And after DeepMind joined
forces with Google,
it was natural to start working
with some of the Google teams.
So a team from DeepMind started
working with the Google Data
Center team to
create an agent that
would have as inputs the sensor
readings from the data center,
so things like the temperature,
the weather outside,
the pressure in the cooling
system, how much power
was being drawn by the servers.
And the output was the
configuration parameters
for the cooling system, things
like the water flow
and pressure.
And the result of
applying this agent to this
very optimized data center
was a 50% improvement in overall
efficiency across the entire data
center, which was a
fantastic improvement
in a system that was already
being heavily optimized.
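The shape of that control loop, sensor readings in and cooling setpoints out, can be sketched as below. The sensor names, setpoint ranges, and the simple proportional rule are all hypothetical; the real system used a trained neural network rather than a hand-written rule:

```python
def cooling_controller(sensors):
    """Map data-center sensor readings to cooling-system parameters."""
    target_temp_c = 24.0  # hypothetical target server inlet temperature
    error = sensors["server_inlet_temp_c"] - target_temp_c
    # Toy proportional rule: push more cooling when the servers run hot.
    water_flow = min(max(0.5 + 0.1 * error, 0.0), 1.0)     # fraction of max
    pump_pressure = min(max(1.0 + 0.2 * error, 0.5), 2.0)  # bar, toy range
    return {"water_flow": water_flow, "pump_pressure_bar": pump_pressure}

readings = {
    "server_inlet_temp_c": 27.0,  # inputs like those described: temperatures,
    "outside_temp_c": 15.0,       # the weather outside,
    "it_load_kw": 1200.0,         # power being drawn by the servers
}
config = cooling_controller(readings)
```

The learned agent replaces the hand-tuned rule with a policy trained on historical sensor data, but it plugs into the same read-sensors, write-setpoints interface.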
So we think this
shows how AI can
be applied to
real-world problems,
for example, to
reduce energy usage
and reduce carbon emissions.
The other applied project that
I'd like to tell you a little
about is DeepMind Health.
DeepMind Health is
a project I work on.
And it was launched
in February of 2016
as our first external
facing applied project,
and we're really
excited about Health
because we're excited
about the potential
for digital technology
combined with better data
to support clinicians
in transforming
the way they do patient care.
We've grown, and we're now a team
of 70 clinicians, engineers,
and designers.
But as we started to go
into the health space,
the first thing we did
is we spent time on wards
in hospitals, and
we realized how much
still needed to
be done to deliver
the dream of the
paperless hospital.
So things like pagers
or paper or pen
were still very much part
of the day-to-day life
for nurses and doctors.
And you can contrast this with
what your day-to-day life is like.
Because of the services
that all of us create,
you have the information that
you need, when you need it,
at any time, in your pocket.
So the first step
towards delivering
AI-enabled technology is to
make sure the doctors and nurses
have the latest digital tools.
So we created this
project called Streams.
And Streams is a
mobile app that doesn't
have artificial
intelligence built in.
And what we aim to do with
it is to support clinicians
in their existing
workflows by, for example,
sending alerts to
a specialist when
we detect that a
patient is deteriorating
and then making
sure that they have
access to the relevant
information when they need it.
So the idea here is
to help clinicians
get patients from their tests
to their treatment in a matter
of minutes instead of hours.
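A sketch of what such an alert pipeline might look like. The trigger rule here, a sharp rise in serum creatinine between blood tests as a sign of acute kidney injury, and all the field names are simplified illustrations, not the clinical algorithm Streams actually uses:

```python
def should_alert(previous_creatinine, current_creatinine):
    """Flag a sharp rise in creatinine between two blood tests (toy rule)."""
    if previous_creatinine <= 0:
        return False  # no valid baseline to compare against
    return current_creatinine / previous_creatinine >= 1.5  # toy threshold

def route_alert(patient_id, deteriorating):
    # On an alert, the specialist is notified and pointed at the relevant
    # results immediately, instead of waiting for a manual review.
    return {"patient": patient_id, "notify_specialist": deteriorating}

message = route_alert("patient-123", should_alert(80.0, 140.0))
```

The point is the workflow shape: test results stream in, a detector flags deterioration, and the alert carries the context the specialist needs, cutting the tests-to-treatment delay from hours to minutes.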
And at the same time,
in parallel to this,
we're also using
artificial intelligence
to help with cutting-edge
medical research.
And there's a couple
of projects that we're
working on here, one of
them with Moorfields Eye
Hospital, where we're
working on retinal scans that
can detect diabetes-related eye
disease or macular degeneration.
And the thing about
this area, why
it's so exciting to
be working on this,
is that even though,
if you have diabetes,
you're many times more likely
to suffer sight loss,
98% of that sight loss
is preventable if it's
detected early enough.
And we hope that we can
help with that detection.
We're also working with the
University College London
Hospital on radiotherapy
planning for head and neck
cancers.
So head and neck
cancer treatments
are very tricky because there's
a lot going on in that area,
and the treatment needs to be
planned very, very accurately.
That means that in order
to treat one patient,
a radiotherapist needs
to spend about four hours
planning the treatment.
And there's already
a huge shortage
of radiotherapists in the UK.
So we hope that
we can reduce this
from a very, very lengthy task
to a much shorter verification
task.
We have a lot more information
about this on our website,
deepmind.com/health.
And if you have any
questions about all this,
send us an email,
sayhi@deepmindhealth.com.
And that's all I had for today.
If you have any
questions about this
and you want to talk
about any of that,
find me after the talk,
send me a message.
And if you're interested
in what we're doing,
also check out our website,
deepmind.com/careers.
Thank you.
[APPLAUSE]
[MUSIC PLAYING]
