Hannah: I think I was about 7 years old
when I wrote my first line of code.
It was probably something
simple - printing my name to the screen
or a two-dimensional shape
that could be twisted and stretched
by my sequence of carefully
typed letters and digits.
It was an extraordinary feeling,
a first sense of where
the logic-defining power
of the computer
could take us.
But it wasn’t until much later
that I started to come across the term
“artificial intelligence,” or AI.
And wow! What a world
that would open up.
Hassabis: AI holds enormous
promise for the future,
and I think that these are
incredibly exciting times
to sort of be alive
and working in these fields.
We want to kind of understand and
master increasingly complex systems.
AI must be built
responsibly and safely
and used for the benefit
of everyone in society,
and we have to ensure
the benefits accrue to everyone.
And I think AI can be
one of the most exciting
and transformative technologies
we’ll ever invent.
Hannah: That is the voice
of Demis Hassabis, the CEO of DeepMind,
the London-based artificial
intelligence company.
For Demis, AI will allow us
to create computer systems
that can learn to solve
complex problems by themselves.
In his words, we could then use
that intelligence to solve everything else:
cancer, climate change,
language, energy -
in short, to advance
scientific discovery.
But just how far-fetched
are these goals?
Can researchers really
crack intelligence
and just how much of an impact
would that really have?
I’m Hannah Fry,
and this is DeepMind: The Podcast.
For the past year, I’ve been
at DeepMind HQ in London
for an inside look at
the fascinating world of AI research
and where it’s going.
We will be telling you
the fast-moving story
of the biggest challenges
in artificial intelligence, or AI.
So whether you just want to know more
about where the technology is headed
or want to be inspired
to start your own AI journey,
then you’ve come
to the right place.
We will focus on the projects
that scientists, researchers,
and engineers
are actually working on,
how they’re approaching
the science of AI,
and some of the tricky decisions
the whole field is wrestling
with at the moment.
And whilst we’re here, we’ve explored
the rooms full of computer screens
where scientists run
their endless experiments,
the meeting rooms where people write
intricate equations on whiteboards,
and the laboratories,
packed to the rafters with robots,
where banks of repetitive
robot arms grapple
with piles of plastic bricks.
And we’ve talked to
a huge number of people
to try to understand
what is driving this new frontier.
The voices that you’ll hear
in this podcast
are from the people who are
at the cutting edge of AI
and machine learning,
and quite a few of them are talking
about their work publicly
for the very first time.
But if we want to solve intelligence,
let’s start with
the fundamental question of AI -
what exactly do we mean
by intelligence?
If we’re trying to make
machines intelligent,
what are we actually aiming for?
Hamrick: This is sort of something
that’s debated a lot in the AI world
as like,
well, do we want to,
like, have our AI agents act exactly
the same way that people do?
Like, should they be exactly
human-like intelligent
or should they just be
intelligent in general?
Hannah: This is Jess Hamrick,
a research scientist at DeepMind.
Her specialism is imagination
and mental simulation.
Hamrick: There’s sort of, I guess, like,
one group of people who like to say
that you know we want to build something
that is just generally intelligent,
that’s really able to solve
a lot of different problems
in the world that humans are not
necessarily able to solve,
that has an intelligence
that’s higher than humans,
so this might be able to solve problems
like how do we cure all diseases?
Like, maybe, like--maybe
an artificial intelligence
might be able to help us
solve this problem.
That’s, you know, something
human society and human civilization
hasn’t yet been able
to accomplish.
But then there’s also
another group of people
who say that it’s really
important for us to build AI
that is similar to human intelligence
at least in some ways.
I would consider myself to be
sort of in the latter group.
Hannah: Why does it need
to be similar to humans?
Hamrick: The reason is because
as we build AI we, as humans,
need to be able to interact with AI
and collaborate with it,
be able to understand
the predictions um that it’s making,
or the recommendations
that it’s making,
and if we build AI in a way
that - maybe we are able to build AI
and it’s generally intelligent,
but it acts in a way that’s so alien
to humans that we just can’t
really understand what it’s doing -
I think that actually would be
a really bad scenario to be in,
because either it means
that people don’t trust it,
and then people are
very unwilling to, you know,
use the recommendations
of this AI -
maybe it says “do this one thing
and this will, like, cure this disease”
but people don’t understand why
it’s making that recommendation -
or maybe we miss out
on a lot of opportunities
to really do
a lot of good in the world.
Hannah: We need our AI to understand
the world in the same way that we do -
it needs to be able
to explain itself to us
so that we can be sure
that we can trust it.
Take, for instance, the story of an AI
that was trained to diagnose
skin cancer
by looking at photographs of skin
lesions taken by dermatologists.
The algorithm did a good job
of correctly labelling the images,
but the researchers soon discovered
that the AI wasn’t looking at the cancer
at all to make its decision -
it had simply learned that lesions
photographed next to a ruler
are more likely to be malignant -
not exactly trustworthy.
It’s crucially important that
artificial intelligence
is able to grasp
the subtleties of human thought.
We want it to do what we mean it to do
- not just what we say.
But that doesn’t necessarily
imply it needs to think
in exactly the same way
as people do -
there can be drawbacks to trying
to imitate human
or animal brains too closely.
Botvinick: We can get into discussions
about where the strategy can limit you.
Hannah: This is Matt Botvinick.
Matt is the director of neuroscience
research at DeepMind,
where he draws on his experience
in cognitive neuroscience
and experimental psychology.
Matt believes the human mind
is the inspiration.
But AI research has to take things
further in its own way.
Botvinick: You know the Wright Brothers,
when they solved
the problem of flight --
you know the people like to say,
“Oh, they solved the problem when they
stopped trying to copy birds’ wings” --
which you know in some
technical way might be true,
but they wouldn’t have gotten
to where they were, right,
if they hadn’t spent
an awful lot of time
and if other people hadn’t spent
an awful lot of time
looking at birds’ wings
and noticing the aerofoil pattern
and thinking about
the dynamics of the air
that flows around an object
with this shape.
So yeah, we do believe
that we can look to the human brain
and the human mind for inspiration
but we also talk about
when the moment comes where we need
to kind of step away from that
and just build something
that does what we want it to do.
Hannah: So what is the neuroscience
equivalent -
what are the birds’ wings of our brains?
The aspects of our own intelligence
that we can use for inspiration
as we build AI?
Well one area that seems to hold
a lot of promise is memory -
and in particular,
something we all do known as replay.
Botvinick: Replay is a phenomenon
that was discovered in a part
of the mammalian brain -
the medial temporal lobe,
including the hippocampus -
where you see neural activity
that suggests that past experiences
are being replayed.
Especially in navigation
- for example, a rat
will go through some environment
and a particular pattern
of activity will arise
as it goes through
the environment and then later,
if you have electrodes
in the hippocampus,
you can see that
the same pattern of activity,
the same sequences occurring,
suggesting that a memory is
being replayed of that experience.
And that idea now
has a firm place in AI.
Hannah: If you lose your car keys,
you can run your mind through
where you’ve been to work out
where you might have left them.
Well I first went into the kitchen,
I took my coat off in the hallway,
put my bag down on the side,
and [clicking]
oh yeah, they’re in my backpack.
That ability to replay your experiences
and learn from that memory
after the fact
is a key part of what researchers
want AI to be able to do.
Here’s more from Matt -
Botvinick: The way that that’s
implemented in DeepMind’s agents
is - it’s not exactly
what you find in the brain -
it wasn’t as if people were trying
to slavishly recreate
the biological mechanisms,
but the idea of replay,
which was inspired
by neuroscience, came in handy.
Hannah: In 2015, replay played a pivotal
role in a famous DeepMind breakthrough.
The team managed
to build an AI system
that could play arcade classics
to a superhuman level -
the old Atari games like Space Invaders,
Pong and Breakout.
The AI used something called
deep reinforcement learning,
but behind the scenes, it kept a memory
of the moves it made as it played,
and how those moves had affected
the final score.
By replaying those memories,
the AI could learn from its experiences:
it could work out what sequences
of moves worked well,
which were mistakes,
and find strategies
that otherwise
wouldn’t have been obvious.
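(For readers who want the idea in code: a minimal sketch of an experience-replay buffer in Python. It illustrates the general technique rather than DeepMind’s actual implementation, and the capacity and batch size are arbitrary.)

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores past transitions so an agent can learn from them later."""

    def __init__(self, capacity=100_000):
        # Oldest memories are discarded once the buffer is full.
        self.memory = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        # One "memory": what the agent saw, what it did,
        # the score change that followed, and what it saw next.
        self.memory.append((state, action, reward, next_state))

    def replay(self, batch_size=32):
        # Sample a random batch of past experiences; training on
        # shuffled memories rather than consecutive frames is what
        # keeps replay-based learning stable.
        return random.sample(self.memory, batch_size)
```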
But there’s more to our human memories
than just a giant database of facts -
of course you can remember
the name of the capital of France,
but you might also be able to remember
jumping on the bouncy castle
at your sixth birthday party.
Or the pranks you played
on your last day of school.
This is a phenomenon
called episodic memory
and it’s something that holds
a great deal of promise for AI.
Botvinick: We talk a lot about
something called episodic memory
which is simply the cognitive ability
to retrieve a memory
of something that happened to you.
Before we started recording,
we were joking about like
“What did you have for breakfast?”
Your ability to cast your mind
back to that moment
when you were eating breakfast
and you retrieve that information -
that’s a function that psychologists
and neuroscientists refer
to as episodic memory -
and we have this category
because psychologists worked hard
over decades to fractionate
memory
into particular domains or kinds -
but this is a pretty high level idea
- it’s not like replay.
It’s just hey,
there is such a thing as episodic memory
which is very important
for human intelligence.
Maybe our agents should have
episodic memory - what would that mean?
What would it mean for an artificial
agent to have episodic memory?
Hannah: This is an
intriguing possibility.
An AI that can transport itself
back in time
and recall entire events
and experiences rather than just facts.
When you stop and think about it,
this ability to link one memory
with another is
an amazing human skill.
And if researchers can get
a better understanding
of how our brains actually do this,
it could be replicated in AI systems
giving them a much greater capacity
for solving novel problems.
Let’s think about how that works
for a moment.
Imagine that every morning
you see the same man
in his 30s
walking a boisterous collie.
Then one day a white-haired lady
who looks like the man
comes down the street
with the same dog.
With those events stored
as episodes in your mind,
you might immediately make
a series of deductions.
The man and the woman might come
from the same household,
the lady may be the man’s mother,
or another close relative.
Perhaps she’s taking over his role
because he’s ill or busy.
We weave an intricate story
of these strangers,
pulling material
from our memories together,
prioritizing some pieces of information
over others to make it coherent.
It’s something that’s been the focus
of recent research
by the neuroscientists here -
a study in September 2018
demonstrated the critical role
of the hippocampus -
that shrimp-shaped seat of memory
in the middle of the brain -
in weaving together individual
memories to produce new insight.
Jess Hamrick is also looking at
another way that AIs can be made
to respond more flexibly
to new situations.
She takes her inspiration
from a different human ability -
mental simulation.
What you and I might call imagination.
Hamrick: Imagine that you’re
on a beach -
you’ll have, like,
this mental picture kind of spring
to mind - you know, mine at least
is maybe a sandy beach
or a bright blue ocean,
maybe some palm trees.
Hannah: I’m there - it’s lovely!
Hamrick: Yeah [laughs].
And so this is an example
of what we would call mental simulation.
It's like we’re mentally simulating
this picture of the beach.
And then you can do things
with that simulation -
so you can imagine adding
other people to your imagination,
you can imagine what would happen
if you like threw a ball,
if you’re playing volleyball
or something like that.
So these sort
of mental simulations
are really interactive
and really rich.
And I think that they underlie
a lot of our human ability
to understand the world
and make predictions about the world -
Hannah: I should pause for a moment
to explain
what Jess and Matt
mean by an agent here.
It’s a word that’s used
a lot at DeepMind -
remember that when people are talking
about artificial intelligence,
they’re really just talking
about computer code
with the freedom
to make its own decisions.
And an agent is just the noun
that they use
to describe the part
of that code that has agency.
Jess is hoping to build agents
that are flexible enough
to adapt to
all manner of environments.
It’s a very grand ambition,
but one with real potential.
To see why,
let’s go back to the arcade
and that game of Space Invaders
- mastered using deep reinforcement
learning to create an agent
called Deep Q-Network, or DQN.
Hamrick: DQN was really sort of
an amazing technological feat
because it was able to be trained
to be able to play many,
many different Atari games,
directly from perception, from pixels.
And this is something
that hadn’t been done before.
But the way that DQN works is
that it really just goes directly
from inputs to outputs -
so it takes in the image
of the video game,
and it outputs immediately
what actions should be taken
to maximize the score for that game -
so maybe it’s “move left,”
maybe it’s you know,
“push the trigger to shoot” -
all of these actions are being taken
just to maximize that score.
And the agent doesn’t know
why that action is good -
it only knows this action
will give me a higher score,
and so the agent isn’t able to really
do anything else besides that -
you can’t ask the agent to say
“hide behind one of the pillars
until that pillar is destroyed”
or “destroy all of the incoming
Space Invaders in one line
and none of the other Space Invaders.”
So these are all kinds
of like different tasks
that you could give a human,
and they may be a little bit weird,
but humans would understand
what it means to do this
and that’s because humans have
this ability for mental simulation
to imagine what will happen
if they take different actions -
and so by giving our agents
the ability to imagine things
and also plan according to
different tasks that they might be given,
they are able to act more flexibly
and deal
with these sorts of novel situations.
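(A rough sketch of what “directly from inputs to outputs” means in code. The toy Python below scores every action with a single pass through a stand-in “network” and picks the highest; the action names, screen size, and random weights are invented for illustration, and DQN’s real network is a trained convolutional one.)

```python
import numpy as np

ACTIONS = ["move_left", "move_right", "shoot"]  # toy action set

rng = np.random.default_rng(0)
# Stand-in for a trained network: one row of weights per action.
W = rng.normal(size=(len(ACTIONS), 84 * 84))

def choose_action(pixels):
    """Model-free control, DQN-style: screen in, action out.

    The agent scores each action in one forward pass and takes the
    best - it never simulates *why* the action is good, which is
    exactly the inflexibility Jess describes.
    """
    q_values = W @ pixels.ravel()  # one estimated value per action
    return ACTIONS[int(np.argmax(q_values))]

frame = rng.random((84, 84))  # a fake 84x84 game screen
print(choose_action(frame))
```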
Hannah: But humans aren’t
the only form of intelligence
we can draw inspiration from -
we can also learn from our cousins
in the animal kingdom.
Let’s bring in researcher Greg Wayne -
within neuroscience
Greg’s thing is memory
and cognitive architecture.
Wayne: One of the things that is
quite clear
is that animals have a remarkable
ability to deal with,
for example,
very long time scales -
experiences that can be linked
across periods of time -
and that is, ah, way beyond
our current sets of agents.
The great example, I think, is the
scrub jay - the western scrub jay -
they bury things.
They prepare for the winter
by scrounging up a lot of food
and putting it into -
depositing it into different places -
hiding it from each other,
and they love to steal each other’s
food too - they are scavengers,
and they can remember thousands of sites
where they’ve buried their food, so -
Hannah: All at once?
Wayne: All at once!
And they can even know
detailed facts about it -
they know how long ago
they buried things,
they know if they were being watched
while they were burying things -
they know what thing they buried there -
they have an incredible memory
for these events
that they have produced themselves.
Hannah: How can you tell that
they know what they buried?
Wayne: Because they have a preference -
you'll see that they’ll -
they like maggots more than peanuts -
they’ll go back to those maggots first.
Having a kind of large database
of things that you’ve done
and seen um that you can access
and that you can use to um
then guide your goal-directed
behavior later - you know:
“I’m hungry, um I’d love
to have some maggots right now
- where should I go find those?”
[laughs] You know - [laughs]
that’s the kind of thing
we would like to replicate.
Hannah: And there’s another big lesson
we can learn from animals -
if you want to teach a dog to sit,
you don’t write a list of instructions -
move this muscle, bend your leg
45 degrees, anything like that.
Instead you repeat the same task,
over and over again,
offering punishments
and rewards as you go.
Wayne: And if it’s good, you give it
a little bit of food, um -
that’s how we train dogs now.
I have a friend who trains dogs
to um do things on iPads
using ah reinforcement learning,
so we’ve already started on the path
in AI of merging reinforcement
learning very closely with how our AIs
make decisions and so on
and that’s how we train them.
Hannah: So you’re essentially training
an artificial intelligence -
an AI - in the same way
that you might train a dog -
rewarding them for good behavior,
ignoring bad behavior. Very nice!
Wayne: That’s right.
Hannah: But okay -
how do you treat an AI -
what does it mean to reward something
that isn’t interested in doggy biscuits?
Here’s Demis Hassabis.
Hassabis: Well with artificial systems -
all they really care about
is ones and zeros -
so you can construct artificial reward
mechanisms for almost anything -
we’ve now moved away
from programming the solution
into the system - so it now learns
for itself - to going up a meta level -
so now what we’re really programming
or designing is reward systems -
so it’s kind of interesting that that
is now becoming the difficult part:
how do you
design curricula,
how do you design breadcrumb trails
and rewards,
so that eventually these systems
learn the right things -
there’s also the idea
of unsupervised learning,
which is how do you learn things
in the absence of any reward -
and actually that’s the issue
with reward learning
- in the real world -
as humans, or even as children,
there aren’t very many rewards -
the rewards are quite sparse,
even as a dog, right?
The dog gets a doggy biscuit
every now and again
but it has to decide every moment
like what to do and actually
I think one of the answers to that
is what we call intrinsic motivation
which is internal drives
that - in animals -
have come through evolution -
um, but which we could also evolve or build in.
Those drives are very strong
and they guide the animal or the system
even in the absence
of external rewards
so of course that might be things like
joy or fear or even things like hunger -
these are all primal
kind of internal motivations
that drive your behavior even in
the absence of any external reward.
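(One common way to make intrinsic motivation concrete is a novelty bonus added to whatever external reward exists. The sketch below uses a simple count-based bonus; the bonus formula and the beta weight are illustrative assumptions, not a specific DeepMind system.)

```python
import math
from collections import Counter

visit_counts = Counter()

def total_reward(state, external_reward, beta=0.1):
    """External reward plus a count-based novelty bonus.

    The bonus shrinks as a state becomes familiar, so the drive to
    explore persists even when external rewards are sparse.
    """
    visit_counts[state] += 1
    intrinsic = 1.0 / math.sqrt(visit_counts[state])
    return external_reward + beta * intrinsic

print(total_reward("room_A", 0.0))  # novel state: a bonus despite no reward
print(total_reward("room_A", 0.0))  # same state again: a smaller bonus
```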
Hannah: You’re listening to DeepMind:
The Podcast - a window on AI research.
While rewards might be a key part
of how to encourage AI to learn,
one of the main aims of machine learning
is for AI to be able to teach itself -
to notice patterns
and shortcuts between tasks
and make itself
a more efficient learner.
In an ideal world, engineers would like
to reach a point
where AI can learn
in a similar way to humans -
picking up the essentials of a new task
in a matter of minutes.
Back to Matt Botvinick.
Botvinick: An example would be -
I went on holiday recently
to South America
and I wanted to brush up my Spanish
and I knew exactly how to do that -
I knew what resources were out there
for it to begin with,
but more importantly,
when I sat down to brush up my Spanish,
I had a whole repertoire of concepts
that like really guided me -
like I know what it means
to conjugate a verb, right?
I know that in certain languages
there are masculine and feminine forms,
so this background knowledge helped me
learn much more rapidly
than if I just sort of was dumped into
the middle of you know a new language
without understanding what it means
to learn a language. And we want systems -
we want artificial systems -
that come armed with these concepts.
It’s not just about language -
it could be video games -
you could sit down in front of a new
video game that you’ve never played,
but if you’ve played video games
in the past,
you kind of know how video games work,
and that helps you to learn rapidly.
Hannah: The AI we have today is what’s
known as narrow in its focus -
that might be diagnosing cancer
or playing video games -
but the ultimate goal is to create
something much more powerful:
something called artificial
general intelligence,
with precisely this ability of being
able to adapt to different situations -
to be able to use the high-level
concepts it’s learned in one environment
and apply them in another.
Botvinick: We don’t want just a system
that’s really good at one thing,
we want a system that’s really good
at lots of things,
but really what we mean
is we want a system
that can pick up new tasks
that it’s never performed before.
We want an intelligence
where you can say -
okay, you’ve never solved
this kind of problem before
but let me let me tell you
what I want you to think about now.
And you could introduce them
to, um, organic chemistry or something,
and they would be able to work
with that - humans can do this!
Hannah: But getting machines to do this
is really, really tricky.
And it’s not the only thing that we
humans can do that AI finds hard.
Greg spends a lot of his time
trying to understand
the detailed mental processes
behind apparently simple human tasks.
Wayne: You have breakfast, and you
drink your orange juice and you run out.
And then you think to yourself -
god when I leave work,
I’m going to
have to pick up some orange juice.
You go through your work day
and you don’t even think
about orange juice once,
and then it springs to mind, you know,
immediately as you’re leaving the office
that you need to go pick up
some orange juice.
When you’re going to buy
the orange juice,
it is actually of no value
to your present self -
the only self that will benefit
from buying the orange juice
is yourself at breakfast the next day.
So you actually have to do something
that is, ah, incredibly prospective
- forward thinking, thinking about
the context of your future self. Um -
Hannah: So this is something that I mean
people here really sit around
and sort of talk about and kind of work
out what is it about your brain
that reminds you to buy orange juice
at the right moment?
Wayne: Yes, because you can easily
construct virtual environments
with tasks for agents
that have properties like this -
like thinking minutes or hours ahead,
or remembering something from hours ago -
that our normal agents
completely stumble on.
They cannot do them. Why is that?
They seem easy! [laughs].
It seems easy to buy orange juice
[laughs].
Hannah: There’s a theme emerging here -
back in the 1980s Hans Moravec
and his colleagues pointed out
that when it comes to
artificial intelligence,
everything is a little bit upside down.
The things that humans find tough,
like maths and chess and data
crunching, require
very little computation.
The things that we humans manage
without even thinking turn out
to be monumentally
difficult for machines.
It’s a phenomenon that’s become
known as Moravec’s paradox.
Botvinick: Like other neuroscientists
and psychologists here,
I find myself thinking about stuff
that seems really simple -
stuff that I do and other people
do really without thinking about it
and it just doesn’t seem
that big a deal,
but it turns out -
some of those things turn out
to be very difficult
to engineer into artificial systems.
So picking things up,
putting things down,
planning a route through a building -
things that we can just do
without really much mental effort
sometimes prove to be
quite difficult to engineer.
An example of this just came up
as we walked into this room:
we all realized
that it was quite stuffy in here
and that we wanted
to try to cool it down
so we all huddled around the thermostat
and we tried to figure out
how to get it to do what we wanted,
and it seemed to be resistant,
and at some moment I thought -
wait a minute - maybe,
maybe the air conditioning’s
just broken.
And again, that seems like
a super simple thing -
like, you know, what’s the big deal
about that? But actually in AI research,
we have a name for this
and it’s latent state inference -
we’re trying to infer some aspect
of what’s going on which
is latent or hidden.
And it turns out in order to do
that seemingly simple thing,
you need a very rich model
of the world -
you need to understand air
conditioners and thermostats,
and what it means to be broken
and what’s the probability
that it’s broken and so forth.
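(The thermostat moment can be written as a single application of Bayes’ rule. All the probabilities below are invented purely for illustration.)

```python
# Latent state inference, as in Matt's thermostat story: is the air
# conditioner broken? "Latent" because we can't observe it directly.
p_broken = 0.05          # prior: air conditioners are rarely broken
p_hot_if_broken = 0.95   # a broken unit almost never cools the room
p_hot_if_working = 0.20  # a working one occasionally lags behind

# Evidence: we fiddled with the thermostat and the room is still hot.
p_hot = (p_hot_if_broken * p_broken
         + p_hot_if_working * (1 - p_broken))

posterior = p_hot_if_broken * p_broken / p_hot
print(f"P(broken | still hot) = {posterior:.2f}")  # 0.20: worth suspecting
```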
Hannah: Moravec’s paradox
is often talked about
as some kind of profound mystery.
It’s used as evidence that
while the jobs of analysts
and lawyers might be at risk
in an age of AI,
gardeners, receptionists and cooks
are secure in their careers
for decades to come.
But DeepMind’s founder Demis Hassabis
has quite a different take.
Hassabis: I think it’s quite obvious -
there’s actually
a simple explanation for it -
when Moravec was doing AI,
the dominant
paradigm was expert systems -
so handcrafting solutions
directly to AI problems - think of it
as building big databases of rules -
of course if you are going to do that,
that in itself is a very explicit task -
programming that out -
you know you have to know exactly
what you want to write
and what rules you want to incorporate
and what that means is the only tasks
you can do that with
are the ones that you explicitly
know how to do as humans yourself,
and those are things that are
logic-based, like maths and chess -
so weirdly, the things
that we do intuitively ourselves
and effortlessly, like walking and seeing
and, you know, all of these
sort of sensory-motor skills,
seem effortless to us,
and the reason is
that there’s actually
a huge amount of brain
processing going into them -
it’s just that it’s subconscious -
it’s areas of the brain
we don’t have conscious access to -
we probably knew less
about neuroscience at the time
so we didn’t realize quite
how much processing goes on
in the visual cortex for example
and so now we know
both of those things -
we know how the brain works better
and we built learning systems
like AlphaZero and AlphaGo
so it turns out actually vision
is not any more difficult
really than playing Go -
it’s similar if you approach it
in the same way.
Hannah: It’s almost impossible to
reverse engineer our unconscious skills
using the old methods of
hand-crafted programming -
you would have to have a total
and complete conscious understanding
of how something works before you
could ask a computer to replicate it.
But now that machines are just
beginning to mimic
our subconscious processes,
like vision and pattern recognition,
there’s no reason why Moravec’s paradox
necessarily needs to
be a barrier in the future.
Hannah: I have to be honest with you,
this single idea, more than any other
I’ve learned in making this series,
is the one that hit home and underlines
the power and potential of AI for me -
all that we’ve managed so far
and everything
that we’ve created with machines
are only the things that we consciously
know how to order them to perform.
We’re only just at the very beginning
of artificially
mimicking our subconscious
processes too.
And that means that there is
an extremely exciting journey ahead.
But this partnership
of studying neuroscience
and artificial intelligence
alongside one another
doesn’t just help
make our AI better.
Here’s Matt Botvinick and Jess Hamrick
again to explain.
Botvinick: We often talk here
about the virtuous cycle -
the opposite of a vicious cycle, right?
There’s a virtuous cycle
between AI and neuroscience
where neuroscience helps AI along,
and then AI returns the favor.
Hamrick: One of the reasons
why we can get this virtuous cycle
between neuroscience
and cognitive science and AI
is because fundamentally
we are all trying to study
the same thing
which is intelligence,
and so if we ask sort of
these more abstract questions
about what should an intelligent system
do in this situation,
we can ask that about humans -
what would a person do in this situation
and try to come up with an answer,
we could ask what should our AI agent
do in this situation,
and try to come up with an answer,
or if we have an answer already
in one of those fields,
we can take the solution
and apply it to one of the other fields,
and I think that’s sort of really
what enables this ability
to transfer
between the different fields.
Hannah: This isn’t just a theoretical
flow of ideas.
There are real examples of ideas
from artificial intelligence
finding their way back
into neuroscience.
Botvinick: So there’s a neurotransmitter
- a chemical
that conveys messages
in the brain called dopamine.
In the 1990s people were finding ways
of tracking the release of dopamine
in the brain and very clear patterns
were being identified
but nobody really understood
what they meant -
why does the brain release dopamine in
this situation and not that situation?
And as I understand the history
some papers hit the desk of some people
who were studying computational
reinforcement learning -
people like Peter Dayan
and Read Montague,
and they just saw immediately
that the patterns of activity
that were being reported
in these neuroscience papers -
the dopamine data
could be explained
by the math that’s involved
in reinforcement learning.
That has led to a real revolution
in the neuroscience of learning.
Hannah: If you give a monkey a treat,
it gets a little hit
of dopamine in its brain.
It’s the same in our brains, too.
A little burst of pleasure
whenever something good happens.
But in the 1990s researchers realized
that dopamine
wasn’t actually the response
to the reward -
it was reporting back
about the difference
between what the monkey
expected the reward to be
and what it actually received.
If you’re walking down the road
and you unexpectedly find a £20 note,
it’s much more exciting than
if you’re collecting a £20 note
that’s owed to you by a friend,
and if a monkey is expecting you
to give it a grape
and you hand it a piece of cucumber
it’s going to be a lot less happy
than if you just surprised it
with a bit of cucumber from nowhere.
The thing is - AI researchers were
already using something
that acted in a very similar way
in their algorithms -
they’d get their agents
to make a prediction
about what was going to happen next
and compare it
to what actually occurred.
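(That gap between expectation and outcome is the reward prediction error of reinforcement learning - the quantity Dayan and Montague matched to the dopamine recordings. A toy version, with made-up numbers:)

```python
def prediction_error(expected, received):
    """Reward prediction error: what dopamine appears to signal."""
    return received - expected

# The unexpected £20 note: a big positive surprise.
print(prediction_error(expected=0.0, received=20.0))   #  20.0

# Collecting £20 a friend owed you: no surprise at all.
print(prediction_error(expected=20.0, received=20.0))  #   0.0

# Expecting a grape but handed cucumber: a negative surprise -
# a dip in dopamine, even though cucumber is still food.
print(prediction_error(expected=1.0, received=0.4))    #  -0.6
```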
But remember - in all of this,
the idea is to just take inspiration
from the way that our human brains work.
Not to make a straightforward copy,
because our brains
aren’t exactly perfect.
So we’ve heard how we can take
inspiration from the human brain,
from the animal and even the bird brain
to create AI systems,
but this isn’t just
a working theory anymore.
Researchers aren’t just talking
about what they want to do,
they’re also talking about
what they’ve actually managed to do.
Let me tease you with Koray Kavukcuoglu,
Director of Research at DeepMind.
Kavukcuoglu: It’s a simple problem.
Of course you can write a program
to solve that.
But the idea is to try to do
deep reinforcement learning,
to try to come up with a system
that we think
can generalize to different problems
- to more problems -
and once we solved that,
it was a matter of weeks before we had
10 or 15 Atari games being solved.
Hannah: If you would like to find out
more about the link
between AI and the brain,
or explore the world of
AI research beyond DeepMind,
you’ll find plenty of useful links
in the show notes for each episode.
And if there are stories or resources
that you think other listeners
would find helpful, then let us know.
You can message us on Twitter or email
the team at podcasts@DeepMind.com.
You can also use that address
to send us your questions
or feedback on the series.
Let’s take a little breather -
see you shortly!
