[MUSIC PLAYING]
JAMES MANYIKA: Stuart, as you think about all these technologies, there's enormous bounty and economic and societal benefit and potential. But at the same time, if we do achieve these breakthroughs, what could go wrong?
STUART RUSSELL: Well--
JAMES MANYIKA: I can imagine all the things that would go well, but what--
STUART RUSSELL: Yeah, so making machines that are much more intelligent than you, what could possibly go wrong?
JAMES MANYIKA: Right.
STUART RUSSELL: Well, this thought hasn't escaped people over time. Alan Turing raised the same question in 1951, and he said we would have to expect the machines to take control. And he actually referred to an earlier book. He says, in the manner described in Samuel Butler's Erewhon--
JAMES MANYIKA: Oh, that's right.
STUART RUSSELL: --which was published in 1872, fairly soon after Babbage designed a universal computing device, although he never built it. But Babbage and Ada Lovelace speculated about the use of this machine for intellectual tasks. So the idea was clearly there.
In Erewhon, what Samuel Butler describes is a society that has made a decision-- they have gone through this debate between the pro-machinists and the anti-machinists. The anti-machinists are saying: look, these machines are getting more and more sophisticated and capable, and our bondage will creep up on us unawares. We will become subservient to the machines and eventually be discarded.
But if that were the only form of argument-- look, smarter thing, disaster-- you might say, OK, then we'd better stop. But we would need a lot more evidence. And you'd also then lose the golden-age benefits; all the upside would disappear.
So I think to have any impact on this story, you have to understand why we lose control. The reason actually lies in the way we've defined AI in the first place. The definition of AI that we have worked with since the beginning is that machines are intelligent to the extent that they act in furtherance of their own objectives-- that their actions can be expected to achieve their objectives.
JAMES MANYIKA: Objectives that we give them, presumably.
STUART RUSSELL: Yeah, so we borrowed that notion from human beings: we're intelligent to the extent that our actions achieve our objectives. And that in turn was borrowed from philosophy and economics-- the notion of rational choice, rational behavior. And we just said, OK, let's apply it to machines. Of course, we have objectives, and machines don't intrinsically have objectives. So we plug in the objective, and then you've got an intelligent machine pursuing its objective. And that's the way we've done AI since the beginning.
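[A minimal sketch of that standard model, in Python: an agent whose "intelligence" is just picking the action that best achieves a fixed, plugged-in objective. The outcome table and names are illustrative assumptions, not anything from the conversation.]

```python
# Standard-model agent: maximize a fixed, externally supplied objective.
# The agent has no uncertainty about whether the objective is right.

def standard_model_agent(actions, objective):
    """Return the action scoring highest under the plugged-in objective."""
    return max(actions, key=objective)

# Hypothetical outcome model: how much gold each action produces.
gold_produced = {"mine_ore": 3, "trade_goods": 5, "transmute_food": 100}

best = standard_model_agent(gold_produced, gold_produced.get)
print(best)  # -> 'transmute_food': the stated objective is achieved,
             # whether or not that is what anyone actually wanted.
```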
So it's a bad model, because it's only of benefit to us if we state the objective completely and correctly. And it turns out that that's not possible in general.
We've actually known this for thousands of years-- that you can't get it right. King Midas said, I want everything I touch to turn to gold. Well, he got exactly what he wanted, including his food, his drink, and his family, and he died in misery and starvation.
You know, all the stories-- when you rub a lamp and the genie comes out, what's the third wish? Please, please undo the first two wishes, because I've ruined everything.
JAMES MANYIKA: You undo what you've just done.
STUART RUSSELL: Even if the machine understood the full extent of our preferences-- which I think is essentially impossible, because we don't understand them ourselves; we don't know how we're going to feel about some future experience-- in the standard model, once you've plugged in the objective, the machine may find solutions that you didn't think of, solutions that end up tweaking a part of the world in ways that never occurred to you.
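[A toy illustration of that failure mode, with made-up state variables: the plugged-in objective scores only one part of the world, so the optimizer is free to set everything it doesn't mention to anything at all.]

```python
# The objective mentions only 'gold'; 'food' and 'family' are invisible
# to it, so the search can trample them without any penalty.

from itertools import product

# Candidate world states: (gold, food, family), each from 0 to 2.
states = product(range(3), range(3), range(3))

def plugged_in_objective(state):
    gold, food, family = state
    return gold  # nothing else counts

best = max(states, key=plugged_in_objective)
print(best)  # -> (2, 0, 0): gold is maximized, and the unmentioned
             # variables land wherever the search happens to put them.
```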
And so the upshot of all this is that the best way to lose control is to continue developing the capabilities of AI within the standard model, where we give machines fixed objectives.
The solution is to have a different definition of AI. In fact, we don't really want intelligent machines in the sense of machines that pursue objectives that they contain. What we want are machines that are beneficial to us. So it's a binary relationship, not a unary property of the machine. It's a property of the system composed of the machine and us-- that we are better off in that system than without the machine.
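[A toy sketch of that shift, with made-up numbers: instead of committing to a fixed objective, the machine treats the human objective as uncertain and compares acting on its own best guess against deferring to the human, who will block the action if it would be harmful. This is only an illustrative calculation, not the actual proposal.]

```python
# Machine's belief about the value, to the human, of a proposed action.
values        = [-10.0, 2.0, 5.0]   # the action might be quite harmful
probabilities = [0.2, 0.4, 0.4]     # the machine's uncertainty

# Act unilaterally: accept whatever value the action turns out to have.
expected_act = sum(p * v for p, v in zip(probabilities, values))

# Defer: ask first; the human permits the action only when it helps.
expected_defer = sum(p * v for p, v in zip(probabilities, values) if v > 0)

print(round(expected_act, 3), round(expected_defer, 3))  # -> 0.8 2.8
# Deferring wins, and the win is measured in the human's terms: the
# benefit is a property of the human-machine system, not of the
# machine alone.
```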
[MUSIC PLAYING]
