MODERATOR: Today with us we
have Professor Nick Bostrom.
He was born in
Helsingbrg in Sweden.
He's a philosopher
at St. Cross College
at the University of Oxford.
He's known for his work
on existential risk,
the entropic principle,
human enhancement ethics,
the reversal test,
and consequentialism.
He holds a Ph.D. from the
London School of Economics,
and he is the founding director
of both the Future of Humanity
Institute and the Oxford
Martin Programme on the Impacts
of the Future on Technology.
He's the author of
over 200 publications,
including the book we
are presenting today,
"Superintelligence."
And he has been awarded
the Eugene R. Gannon Award
and has been listed in "Foreign
Policy"'s Top 100 Global
Thinkers list.
Please join me in welcoming
Professor Nick Bostrom.
[APPLAUSE]
NICK BOSTROM: Yeah, great.
Thanks you for coming.
And I'm not going to try to
summarize the entire book,
but I want to give
some of the background
from which this work emerges.
So I run this thing called The
Future of Humanity Institute,
which has a very sort
of ambitious name.
It's a small
research center where
mathematicians, philosophers,
and scientists are trying
to think through the really
big-picture questions
for humanity, ones that
often traditionally
have been relegated to
crackpots, relegated
to journalists or
retired physicists
to write some popular book.
But questions that are
actually extremely important
and, I think, deserve
close attention.
So to give some sense of
where I'm coming from,
one can think about
the human condition,
in grand schematic terms, on
a diagram like this, where
we plot time on the
x-axis, and then
on the other axis, some
measure of capability,
like level of
technological advancement.
Measure of the overall
economic productivity
that we have at some
given point in time.
And what we take to be the
normal human condition--
the idea that you
wake up in the morning
and then commute to work and
sit in front of a screen,
and you're having too much
rather than too little to eat,
is a narrow band within
this much larger space
of possibilities.
It's obviously an anomaly
on evolutionary time scales.
The human species is fairly
young on this planet.
It's also an anomaly on
just historical time scales.
For most of human history, we
were inhabiting a Malthusian
state, and it's really only
in the last few hundred years
that we've kind of soared up.
And even then, just in
some parts of the world.
It's also a huge
anomaly, obviously,
in space, the little crust
of [INAUDIBLE] planet
being very different from
most of the stuff around us,
which is just a
ultra-high vacuum.
And yet we tend
to think that this
is the normal way
for things to be,
that any claim that things
might be very different
is a radical claim that needs
some extraordinary evidence.
Yet it's possible,
if we reflect on it,
that the longer the time
scale we're considering,
the greater the
probability that we
will exit this human
condition in either one of two
ways, downwards or upwards.
This picture has two
attractor states.
So if we exit the
human condition
in the downwards direction--
there is in population biology
the concept of a minimum
viable population size.
With too few individuals left,
they can't sustain themselves.
There's an attractor state down
there, which is extinction.
Once you're extinct, you
tend to stay extinct.
And more than 99.9%
of all the species
that once flew, crawled, or
swam on this planet are extinct,
so that certainly is
one possible future.
Another way that the
human condition could end
would be that we exit in
the upwards direction.
And there, too, I think that
there is an attractor state.
If and when a
civilization manages
to obtain technological
maturity-- meaning,
we have developed most
of all technologies that
are physically possible
to be developed,
and we have the ability
to spread through space
in a reasonable way, through
automated self-replicating
colonization probes--
then the destiny
might be pretty well set.
The level of existential
risk will go down,
and maybe we could
continue on like that
for millions and billions
of years, just growing
at the significant fraction
of the speed of light
indefinitely, until the
cosmological expansion makes
it impossible to reach
any further resources.
Like if something is
too far away today, then
by the time we would
get there, as it were,
it has moved further away.
So there's a finite
bubble of stuff
that something starting
from what we are
could, in principle, get.
So that whole bubble
of stuff, I call
humanity's cosmic endowment.
There is a lot of it there.
And that might be another
possible attractor state.
If one has to view that--
that this stuff that
could, in principle,
be reached there
is very important, if one has
some kind of view on ethics,
where fundamentally, the moral
significance of an experience
or a life does not depend on
when it is taking place-- just
as many people have the belief
that from moral point of view,
it doesn't matter
where it takes place.
Like if you travel to
Africa and you suffer there,
it's as bad as if you
were suffering here.
If one has that view about
time, then this cosmic endowment
matters a lot.
Because it could
be used to create
an enormous amount of value.
We can count it up, roughly.
We know that there are billions
of galaxies, each with billions
of stars, and each
of those stars
could have billions of
people living around it
for billions of years.
And you get an enormous
number of orders of magnitude
if you try to measure
the size of the future,
even just assuming biological
instantiations of minds.
If you're imagine
a more efficient
digital implementation, you
can add another big chunk
of orders of magnitude.
And so what you find fairly
robustly, if you just
work the numbers, is that if
you have this evaluative view,
some broadly-aggregated
consequentialist
view, then even a very,
very small reduction
in the net level
of existential risk
will be worth more, in
expected utility terms,
than any interventions you
could do that would only
have local effect here.
Even something as
wonderful as, like,
curing cancer or
eliminating world hunger
would really be, from this
evaluative perspective,
insignificant
compared to reducing
the level of existential
risk by, say, 1/100th
of one percentage point.
So this level of
existential risk
becomes, then, maybe
an important lens
through which to look
at global priorities.
I define it as a risk
that either threatens
the survival of
Earth-originating intelligent
life, or threatens to
permanently and drastically
destroy our potential for
desirable development.
And now I think that maybe in a
complete accounting of ethics,
there are other variables, as
well, to take into account.
We have particular
obligations to people
that are near and dear to us.
There might be other
things in addition
to this sort of aggregated
consequentialist component.
Nevertheless, I think it's
in there and it's important.
So if one then tries to look
more carefully at this category
of existential risk-- like,
what could actually go wrong
in this way?
What could permanently
destroy our future?
It's a very small subset of all
the things that can go wrong.
Like most things that threaten
human welfare don't really
create any existential risk.
So it kind of narrows
down the range
of concerns quite significantly.
A first distinction that
is obvious in this field
is the distinction between risks
arising from nature and risks
arising in some way
from human activity.
And a fairly robust
result, I think,
is that all the big
existential risks,
at least if we are thinking a
time scale of 100 years or so,
are anthropomorphic, arising
from human activities.
And one can see that
just by reflecting
that the human
species has already
been around for 100,000 years.
So if firestorms and
earthquakes and asteroids
haven't done us in in the
last 100,000 years, probably
not going to do us in
in the next 100 years.
Whereas we will be introducing
entirely new kinds of hazards
into the world that we have
no track record of surviving.
And more specifically, I think
all the really big existential
risks are related to certain
anticipated future technologies
that we might well develop
over the coming decades or 100
years.
And another way
to-- another framing
that makes a similar
point is to think
in terms of this metaphor
of a great urn which
contains a lot of balls.
The balls represent
different technologies
that can be discovered,
or more broadly,
the different ideas
that we can invent.
And throughout human history,
we have reached into this ball
repeatedly and pulled
out ball after ball.
And on balance, all these
discoveries, all these ideas,
have been an
immense boon for us.
It's because of all these ideas
that we now live in abundance,
and why there can be
seven billion of us.
There have been some
discoveries, perhaps,
that have been mixed
blessings, that
have done both good and ill.
And a relatively
small number-- it's
not totally trivial
to think about it,
but balls that we would have
been better off without having
extracted from this
urn, like discoveries
that we would be better without.
I mean, maybe chemical weapons,
say, or nuclear weapons.
Or perhaps, like, torture
instruments of different kinds.
There are few things that
seems to have been clearly bad.
But there hasn't been
any discovery so far made
that is such that
it automatically
destroys the civilization
that discovers it.
So there hasn't been any black
ball pulled out from this urn.
And we can ask what that kind
of discovery could look like.
What would be a
possible discovery, such
that it kind of almost
automatically spells
the end of the discoverers?
And it might be useful here to
think of a counter [INAUDIBLE].
So we discovered, just over half
a century ago, nuclear weapons.
And it turned out,
fortunately, that in order
to make a thermonuclear
device, you
need some difficult-to-obtain
raw materials.
You need highly enriched
uranium or plutonium.
And the only way to
get those is by having
some large facility that's
very expensive to build,
takes a lot of energy,
is easy to see.
So very few people can build
their own nuclear device.
But suppose it had turned
out to be differently.
Suppose it had
turned out instead
that it had been possible to
make a thermonuclear warhead
by some simple procedure, like
baking sand in your microwave
oven, OK?
So now we know physics
doesn't allow for that.
But before you actually did
the relevant physics, how could
you possibly have known
whether particle physics would
have provided some
easy route to unleash
these kinds of entities?
So if that had
been the case, then
presumably that would
then be the endpoint
of human civilization.
Once it became so easy
to destroy entire cities,
we could never
again have cities,
and maybe we would have been
knocked back to the Stone Age.
And by the time we
would have again
climbed back up to
the technology level
where somebody could
build microwave ovens,
we would presumably
fall back again.
And that might be
forever the end of it.
But so we were lucky
on that occasion,
but the question
is whether we will
continue to be lucky always.
Like, whether in this big urn,
if we keep extracting ball
after ball, whether
eventually we
will pull out the black ball.
If there is a
black ball in there
and we just keep pulling
them out, then eventually,
presumably, we will get it.
And we don't yet
have the ability
to put a ball back in the urn.
We can't undiscover things
that we have discovered.
So here is a kind of
quick list of some areas
where one might
suspect that there
could be these kinds
of black balls.
And AI is one that I'll kind of
come back to more in this talk.
There are some others.
Synthetic biology will, I
think over the coming decades,
vastly increase the
powers of human beings
to change the world
around us and ourselves.
Those powers might be
used wisely or not.
Molecular nanotechnology.
Not the kind of thing that makes
car tires today, but some kind
of more advanced future version
of that, like Eric Drexler
imagined.
Totalitarianism-enabling
technology.
So remember, again, the
definition of an existential
risk-- not only extinction
scenarios, nice but also ways
to permanently lock ourselves
in to some radically
suboptimal state.
And you can imagine that maybe
new technological discoveries
that make surveillance very
easy, or some new discovery
that makes it possible,
through psychological or
neurophysiological
techniques, to modify desires
could sort of change
some of the parameters
of the sort of
sociopolitical game, where
new types of social
organization becomes
a lot easier to
establish and maintain.
Human modification,
geoengineering.
There are more you
could add there,
and I've left a lot
of these bullets
below here on the list unknown.
So it's useful to reflect
that if this question--
what are the biggest
existential risks?--
had been asked 100 years
ago, then presumably none
of the ones that I now would
place close to the top would
have been listed.
They didn't have
computers, so they
wouldn't have listed
machine intelligence.
Synthetic biology was not even
a concept, nor nanotechnology.
Maybe they would have worried
some about, like, totalitarian
tendencies.
But the others, not so much.
So if we reflect on
our situation today,
we have to maybe acknowledge,
from standing outside
and looking in, that
there are probably
some additional existential
risks that are not
yet on our radar but
that could turn out
to be as significant
as some of the others.
Which as I said, there could
be high value to doing analysis
on this and research to
try to find them out.
But if one combines these
considerations with one
other hypothesis-- say,
a mild or moderate form
of technological
determinism, which I think
is true-- the idea, basically,
that assuming science
and technology continues,
there's no global collapse,
then eventually we'll probably
discover all technologies
that could be discovered.
At least all
general-purpose technologies
that have a lot of
implications in many fields.
I think that's fairly possible.
It's not assured, but that level
of technological determinism
seems quite possible to me.
It's a little bit like-- if
you think of a big box that
starts out empty and
you pour in sand in it,
this is like, you can fund
one kind of research here.
You can fund another
kind of research.
And what research you fund,
where your priorities are,
that determines where the
sand piles up in this box.
So you get different
technologies,
depending on what you do.
But over time, if you just
keep pouring in sand, then
eventually the whole
box will fill up.
And so that seems
fairly possible.
Now if one has
that view, then how
should one kind of--
what attitude should one
take to all of this?
Like what should we do?
So one possible
response is one I
think best expressed
by this blogger.
I don't know who it is.
Washbash commented on some
blog that "I instinctively
think go faster.
Not because I think this
is better for the world.
Why should I care about the
world when I'm dead and gone.
I want it to go fast, damn it!
This increases
the chances I have
of experiencing a
more technologically
advanced future."
So here we've got to be clear
what exactly the question is
that we're asking.
So if the question is,
what would be best for me
personally?
What should we
favor or hope for,
from a egoistic point of view?
Then I think that
Washbash is correct.
From an individual point of
view, if-- well, first of all,
if you're somehow hoping
for these cosmic lifespans
of millions of years, and
being able to travel and expand
into the universe,
then clearly that's
not going to happen unless
something radical changes.
Like the way things are
going, I'm sad to say,
we're all just going to die
from aging in a few decades.
Like we're all rotting.
So the only way that that
could possibly change
is some radical upset.
Like some cure for aging,
or uploading into computers,
or something really
radical would
have to happen to
kind of thwart that.
So that would be reason to
favor faster technical growth.
Or even, if you're
despairing of that,
even you could just hope to have
more interesting gadgets around
and a higher standard
of living, which
we can hope for
through technology.
However, if the question
we ask is instead,
what would be best from an
impersonal point of view?
Then I think the answer is quite
different, something perhaps
closer to this principle of
technological development,
rather than maximize the speed
with which you rush ahead.
This principle would
say that we should
"retard the development
of dangerous and harmful
technologies,
especially ones that
raise the level of
existential risk,
and accelerate the development
of beneficial technologies,
especially those that reduce
the existential risks posed
by nature or by
other technologies."
So the idea here is
that rather than asking
the question for some
hypothetical technology,
would we be better
off without it?
We ask a different question.
Because basically,
on this moderate form
of technological determinism,
that's just not on the table.
We can't relinquish a
technology permanently.
But what we should think of
instead is on the margin,
should we try to hasten the
arrival of some technology
or slow it down?
We might be able to make
some difference there,
say, by a couple months.
And we want to think about
how that small difference will
influence our
likelihood of harvesting
this cosmic endowment.
If you think that
it was literally
impossible to even make a
small difference in the timing,
then that would mean that all
the funding and all the effort
that goes into
technology development
would just be wasted.
So presumably we think
we have some ability
to at least move
things around in time.
And the principle
of differential
technological
development suggests
that it might be quite
significant, sometimes,
exactly when different
things arrive,
particularly the sequence in
which different technologies
arrive.
So if there's going
to be, at some point,
like a really harmful
bio-engineered pathogen,
and it could spread really
easily and it's very lethal,
and there's going to
be, at some point,
like a universal
vaccine, you want
to invent the vaccine before
you invent the pathogen.
If there's going to
be, at some point,
machine superintelligence,
and if there
is some possible technology
that could assure
the safety of machine
superintelligence,
you want the latter to
come before the former.
AUDIENCE: There's an
argument that trying
to retard development
of technology
would make it more dangerous
because you're driving it
underground, or you
have less opportunity
to do it out in the open,
develop safe [INAUDIBLE].
Like we had, for example,
the Asilomar guidelines
in biotech, which have
actually been very effective.
For 30 years, there's
been no accidents.
And if you drive these
technologies underground,
you don't have the
opportunity to have
those kinds of safeguards.
NICK BOSTROM: Yeah.
This-- the principal
leaves open whether you
should focus on the retarding
or the accelerating.
If you wanted to
retard, maybe one way
would be just to refrain from
like funding it or actively
devoting yourself
to accelerating it.
With regard to AI,
which I'll get to later,
I think definitely accelerating
the work on the safety problem
is clearly the way to go.
I think it's just a lot easier
to make a big difference there
than to try to somehow retard
the development of AI work
itself.
Maybe we can return to
that more in the Q&A.
So we have a picture, perhaps,
like this, where again, we're
looking at three axes here.
So technology on one, this
is the same capability
on the earlier slide.
Coordination-- some measure of
the degree to which humankind
is able to solve a global
coordination problems.
Avoiding wars and arms
races, and polluting
our communal resources.
And insight-- so a measure
of our understanding
into what uses of our
capability would actually
make things better.
So it might well
be that in order
to have the best
possible outcome,
to have Utopia, that we
need maximum amounts of all
of these.
Like super-duper
advanced technology
is necessary to realize the
best state; great coordination,
so we don't use that technology
to wage war against one
another, as we have through
so much of human history;
and great wisdom, so that
we apply all these abilities
to really do things
that are worthwhile.
So that might be
where we have to be,
if we want to realize
the best possible state.
Now that then leaves
open the question
of whether from the
position we are currently
in-- at the moment,
we would be better off
with faster developments
in each of these areas.
It might be, for example,
that even though we ultimately
want maximum technology,
we would be better off
getting that technology
only after we have first
made more progress on global
coordination or wisdom.
Anyway, so that's by view
of like a broader context.
So we are thinking about other
existential risks and stuff
like that.
And superintelligence,
as I will talk about,
I think is one big
existential risk, perhaps.
Perhaps, arguably,
perhaps the biggest.
I'm not sure.
But it's peculiar
in one respect,
that although it's a big
danger in its own right,
it's also something
that could help
eliminate other
existential risks.
So if we imagine like a
very simple model, where
we have synthetic biology,
nanotechnology, and AI--
we don't know which
order they will come.
Maybe they each have
some existential risks
associated with them.
Suppose we first develop
synthetic biology.
We get lucky and we get
through the existential risks,
however big they are.
And then we reach
molecular nanotechnology,
and we are lucky.
We get through that, as well.
And finally, AI, the existential
risks along that path
are kind of the sum of these
three different ones that we'll
each have to surmount.
In another trajectory,
maybe we get AI first,
and we have to face
existential risk for that.
But then if we do
get lucky there,
we no longer have to face the
risk with synthetic biology
and nanotechnology,
because we don't
have the superintelligence
to help us through.
So in reality, it gets a lot
more complicated than that,
and we can discuss the
intricacies more in the Q&A.
But thinking about the
sequencing and timing,
I think, rather than
yes or no, would we
want the technology
or not, is like
an initial, necessary
first step to be
able to have any kind of
meaningful conversation
about this.
So superintelligence, I think,
will be a big game-changer,
the biggest thing that will ever
have happened in human history,
at some point, this transition
to superintelligence.
There are two possible
pathways, in principle,
one could imagine
that could lead there.
You could enhance
biological intelligence.
We know biological intelligence
has increased radically
in the past, in kind of
making the human species.
Or machine intelligence,
which is still
far below biological
intelligence,
insofar as we're
focusing on any form
of general-purpose smartness
and learning ability,
but increasing at
a more rapid clip.
So specifically, you can
imagine interventions
on some individual brain to
enhance biological condition.
I'll say a few words
about that just shortly.
Or improvements
in our ability to
pool our individual
information processing devices
to enhance our collective
rationality and wisdom.
I won't talk about
that, but that's clearly
an exciting frontier, with the
internet and new institutions,
prediction markets and
other things like that.
There are some kind
of hybrid approaches,
it can vary between biology and
machines, the cyborg approach.
I personally don't
think that that's
where the action will be.
It just seems to me very
difficult, technologically
speaking, to create implants
that would really significantly
enhance our cognitive ability
more than you could have
by having the same device
outside of yourself.
So you could say, wouldn't it
be great with a little chip
in the brain, and
you could Google just
by thinking about it?
And well, I mean, I
can already Google,
and I don't have to
have neurosurgery
to be able to do that.
We have these amazing interfaces
like the eyeballs, that
can protect 100 million
bits per second,
straight into dedicated
neural wetware that's
highly optimized for making
sense of this information.
And it's really hard
to beat that, I think.
In any case, I mean, the rate
at which sensory information
can be entered into the brain is
not really the limiting factor.
The first thing the
brain does with all
of this visual information is
to throw away almost all of it
and just extract
the relevant part.
And then different versions
of machine intelligence,
where on the one
hand, we have sort
of purely synthetic methods
that don't care about biology
but try to make progress in
mathematics and statistics
and figure out
clever algorithms.
And two, approaches that
try to learn from this one
general intelligence system
that already exists, that we can
study the human brain for
inspiration from that,
or maybe even
reverse-engineer it.
Or in the limiting case,
literally copying it
in whole-brain
emulation, where you
would take a particular
human brain and freeze it
and slice it up,
feed those slices
through an array of microscopes
to take good pictures.
So you have a stack
of these pictures
and use automated
image-recognition software
to extract the connectivity
matrix of the neural network
that's was in the
original brain.
And then annotate that with
neurocomputational models
of each type of neuron works.
And finally, run
that whole emulation
on a sufficiently
powerful computer.
That would require some very
advanced enabling technology
that we don't yet have, so
we know that that is not just
around the corner.
On the other hand,
it would not require
any theoretical breakthrough.
It would not require any new,
deep conceptual understanding
of how thinking works.
You would only
need to understand
the components of
the brain to be
able to make progress with that.
So it's an open question which
of these will get there first.
Different researchers have
their own favorite bets on that.
One thought that
sometimes is put to me
is that-- OK, so Nick, you're
worried about this AI stuff.
So maybe what we
should do is really
try to push ahead with
biological enhancement,
so that we can kind of
keep up with the computer.
The computer's going
to get smarter,
but maybe if we enhance our own
intelligence rapidly enough,
we can keep one step ahead.
I think that that's misguided.
And in fact, if we
do figure out ways
to enhance biological
cognition, I
think that will only hasten the
time when machines overtake us.
Because basically, we will have
smarter people doing the AI
research and the
computer science,
and they will solve
the problem faster.
I still think that would
probably have reason
to try to accelerate biological
cognitive development.
But not so that we can keep
ahead of the computers,
but that so that
when the time comes
where we will create
intelligent machines,
that we will be more
competent at doing so.
So let me just say a few words
on this biological cognitive
enhancement, because it
might be-- especially
if you think of arrival dates
for artificial intelligence,
where it's not just
around the corner,
but maybe it will happen in
the latter half of this century
or something like that-- by that
time, that could be enough time
to have a new cohort of
cognitively enhanced people
around.
And the technology
that I think will first
enable cognitive
enhancement-- my best guess is
that it'll be through
genetic interventions.
There are other paths,
obviously-- smart drugs
and such.
I just-- I don't
hold out much hope
that they will do a
great deal to improve
general-purpose smartness.
They might-- if there
were a simple chemical
that you could just inject
and it would make you
a lot smarter, I think
evolution would find a way
to endogenously
produce that chemical.
I think there might
be ways to improve
some peripheral characteristics,
like mental energy, say,
or concentration.
And we can see that evolution
would have optimized us
for a certain type
of environment
where there are
trade-offs between, maybe,
metabolic consumption
of the brain
and the amount of
mental energy you have.
And in the environment of
evolutionary adaptiveness,
the optimum point for that
trade-off is at one place,
and now we want to move that.
And maybe it could have some
stimulant that just increases
the burn rate of calories and
give you more mental energy.
So peripheral
adjustments like that, I
think we could do
maybe through drugs.
But raw cleverness,
I think genetics
is a more likely initial
technology to do that.
And so one way that
that could work
is in the context of in
vitro fertilization, where
you have normally, in the
course of standard fertility
procedure, maybe some six,
eight, or ten eggs produced.
And then the doctor chooses
one of those to implant.
And at the moment, you can look
for like obvious abnormalities.
You might screen for
some monogenic disorders
or for Down syndrome,
which is done.
But you can't really
select positively
for some complex trait
today, because we don't yet
know the genetic architecture
for, say, intelligence.
But we will, I think,
soon know that,
because the price of gene
sequencing is falling,
and it's now coming
down sufficiently where
it is becoming feasible to run
these very large-scale studies
with hundreds of thousands
or even millions of subjects.
And because it turns out that
the additive heritability,
the variance in that
in humans, is not
due to like one
or two genes that
differ in between us, but a
lot of genes-- maybe hundreds,
maybe even a few
thousand-- that each
have a very, very small effect.
And so to discover a
very, very small effect,
you need a very
large sample size.
And so you need to
sequence a lot of genomes,
and that was too
expensive to do, really.
But now there are studies
underway with hundreds
of thousands of people,
and maybe soon millions.
So that, I think,
will tell us some
of this information
that would be needed.
And then to start doing this,
nothing else would be required.
No new technologies at all.
You just have the information
and you sequence it
and select the
embryos based on that.
Now this would be vastly
potentiated if it were combined
with another technology
that we don't yet
have ready for use in
humans, which is the ability
to derive gamete
from stem cells.
So then you could do
iterated embryo selection.
We would generate an
initial pool of embryos,
select one that's highest
in the expected trait
value of interest, and
then use that embryo
to derive gametes-- sperm
and ova-- that you could then
recombine to get a
new set of embryos.
You pick out the best of
those, and you repeat.
So this technology
here, the ability
to create artificial
gametes through stem cells,
has been developed
and done in mice.
But a significant amount
of additional work
would be required to make
it safe for use in humans.
But if you had this-- and this
might take anything from 10
to 30 or 40 years.
It's hard to know.
This would have the effect of
collapsing the human generation
cycle from 20, 30 years
to a couple of months.
And so if you imagine this kind
of old mad scientist eugenics
program where they would breed
humans for like 500 years,
and make very sure who mated
with whom-- which setting aside
all the ethical complications
involved in that, which
are legion, but I'm not going
to talk about them here,
not because I don't
think they are there,
but I just want to focus
on the technical stuff--
it's just infeasible on a
lot of different levels.
But here, you would
instead have something
that could be done--
instead of 500 years,
you could have it
done over a year.
And instead of changing
the breeding patterns
of large populations,
you would have
a Petri dish and a scientist
plucking around in that.
And so through
that, you would be
able to probably achieve sort of
weak forms of superintelligence
in biology.
I did an analysis
with a colleague
of mine, Carl Shulman,
quite recently
where we tried to estimate,
for different assumptions
about the selection
power applied,
what the gain in
intelligence would be.
And so you can see
here that if you just
produce two random embryos
and select the best one-- not
the one that's actually best,
but the one that looks most
promising, to the
extent that there
is an additive
genetic heritability,
you may get four IQ
points from that.
So if instead, you select
the best of 1 in 10, or 11,
you can see here, even if you
could select the best of 1
in 1,000, you only get
maybe 24 IQ points.
This is with
single-shot selection.
But if you did this
iterated embryo selection,
and you could do five
generations of selecting
the best of 1 in
10, then you might
get as many as 65 IQ points.
And with 10
generations of 1 in 10,
you'd get far above what
we've had in human history.
You'd get the kind of
phenotypes that have never
existed in all of human history,
the [INAUDIBLE] and stuff
like that.
So you observe here that while
you get quickly diminishing
returns by just doing
one-shot selection
from a pool of
embryos, you largely
avoid that by doing
this iterated selection.
So yeah, I'm going
to skip through this.
Yeah, so that does
look like it should
be feasible without any sort of
magical new technology coming
around.
And then that also, I think,
adds to the possibility
that we will eventually
get AI stuff.
Like that would be towards
the end of this century,
if we haven't already
solved the problem by then,
with this significantly more
capable generation of humans
working on it.
But ultimately, we
will be surpassed
by intelligent
machines, assuming
we haven't succumbed to
existential catastrophe
prior to that, just
because the fundamental
image-to-information processing
in the machine substrate
or just far beyond
those in biology,
like in terms of speed.
Even transistors today are
far faster than neurons.
So I'm going to--
this is not relevant,
for you guys already
know all of this.
So there is, like, progress
in AI, like-- I'm just
saying the public
consciousness is
shaped by a few big milestones,
but there's a lot of progress
under the hood.
Also hardware has driven a
lot of progress we've seen.
Here is a slice that
could have been earlier.
This is like with
the brain emulation.
This is basically the
state of the art today.
Here's a brain slice scanned
with an electron micrograph.
Here is a stack of those
pictures on top of one another.
And here is the
result of applying
an image-recognition
algorithm to extract
the connectivity matrix.
But although we have
the right resolution--
you can see individual atoms,
if you want to, in the brain.
It's just that to
image the brain
with that level of resolution
would take, like, forever,
so that we're presumably at
least decades away from making
something like that work.
Lot of application
there [INAUDIBLE].
So the question of how far away
we are from human-level machine
intelligence, I think the short
answer is that nobody knows.
We did a survey of leading
AI experts last year,
and one of the questions
we asked was, by what year
do you think there is a 50%
chance that we will have
human-level machine
intelligence?
Here defined as
one that could do
most jobs that humans could do.
And so the median answer to
that we got was 2050 or 2040,
depending exactly which
group of experts we asked.
That seems, to me,
roughly reasonable,
for what it's worth.
We also asked, by
what year do you
think there's a 90% probability?
And we got 2070 or 2075.
That, to me, seems
overconfident.
There is just a lot more than
10% probability, I think,
that we will still
have failed by then.
I should say, as a footnote,
that these estimates were
conditioned on no global
collapse occurring.
So-- so maybe the
numbers would be--
like the years would
be slightly higher
up if we hadn't made
that assumption.
We also asked, if and when we
do reach human-level machine
intelligence, how
long do you think
it will take from there to go to
some radical superintelligence?
And you can see for
yourself the answer there.
Now here, again,
my view disagrees
with those of people we sampled.
I think-- I'm quite agnostic as
to how far away we are from AI.
I think we should basically have
a very smeared-out probability
distribution.
I do think there is a fairly
large probability, though,
that if and when we
get to human-ish level,
we will soon after
have superintelligence.
I place a fairly high
credence on there
being, at some point, an
intelligence explosion.
So we need to sharply
separate the two questions,
like the distance between
now and human-level,
and the distance in
time between that
and radical superintelligence.
I think this transition
might well be very rapid.
And things depend on that.
So if you distinguish,
like qualitatively,
like fast-take of
scenarios, where
we go from something
human-ish level
to superintelligence within
minutes or hours or days,
a couple of weeks, in
that kind of scenario,
it happens too fast
for us to really
be able to do anything much
about it while it is happening.
If we get a desirable
outcome, it's
because we set up the initial
conditions just right.
By contrast, if one
contemplates very slow take-up--
so you have some
human-level system,
and then only by
laboriously adding
additional little incremental
capability after capability,
so it takes like
decades or centuries
to work your way up
to superintelligence,
then that would be a lot of more
time for new human institutions
to arise to deal with
this, like to develop
a new profession of experts to
deal with this, to try things
out, see what works,
and then change it up.
So it makes a difference.
Another way in which
it makes a difference
is that in the fast
takeoff scenarios,
it's likely that you will have
a singleton outcome, I think.
Which is basically
a world order where
at the highest level
of decision-making,
there's like one
decision-making agency.
If you think about competing
technology projects,
whether it's nations
racing to build satellites,
or nuclear weapons, or
competing tech products,
often there's some
competition, and you're
trying to get there first.
But it's rare that the
difference between the leader
and the closest follower
is a couple of days.
Like usually the leader
will be a few months ahead
of the follower, or
a couple of years.
So if the takeoff is going
to be over in a few days
or a few weeks, then one project
will have completed a takeoff
before the next one will
have started it, very likely.
And then you will have a
mature superintelligence
in a world which contains no
other even vaguely comparable
system.
And for reasons that I'll
be happy to elaborate on
in the Q&A, and that a
lot of the book is about,
as well, this first
system then is
likely to be very powerful,
maybe to the point where
it is able to shape the
entire future according
to its preferences.
If you have a storied
takeoff, then it's more likely
you're going to have
multiple outcomes.
No system is so far
ahead of all the others
that it can just
lay down the law.
They end up
superintelligent, but it
will have economic
competitive forces
and evolutionary
dynamics working
on this population of digital
minds shaping the outcome.
And the concerns in
that type of scenario
look very different
from the ones
in the singleton scenarios.
Not necessarily less
serious, but different.
So instead of having one agency
that can dictate the future,
you now have this
ecology of digital minds.
And you can think--
I mean, suppose
to take a model-- so once we
had human-level minds that
were digital, like they could do
exactly the same as humans do,
and run at the same
speed, initially--
suppose that you get there
through whole-brain emulation,
and this is the first
type of AI you have.
Then you could very quickly
have a population explosion.
So we know how to copy software.
That takes a couple of minutes.
And so as long as
the productivity
of these digital mind is
higher than the cost of making
another copy, there
would be vast incentives
to just keep making more
copies until the income
that digital minds
can earn equals, like,
the price of electricity
and hardware rental.
So you have a
Malthusian situation
where the population
of digital minds
expands until the wage
falls to subsistence level.
But subsistence level
for the digital minds,
rather than for
biological minds.
So we are a lot more
expensive, because we
have to eat and have houses to
live in and stuff like that.
So humans might still be
able to make some income
through their
capital investment.
And there's then the question
of whether in this world, which
is increasingly shaped
by the digital minds--
there are trillions
and trillions of them,
and they're getting faster
all the time, and better,
and humans constitute a
small slice of all of this--
whether we would be
able, in the long run,
to really enforce
property rights
and our sociopolitical
structures.
Or whether these digital
minds would eventually
just swamp us and
expropriate us.
And at some point,
presumably, even
in this whole-brain
emulation, at some point
probably fairly soon
after that point,
you will have
synthetic AIs that are
more optimized than whatever
sort of structures biology
came up with, that will then
kind of leave the [INAUDIBLE].
So there is a chapter
in the book about that.
But the bulk of the book
is-- so all the stuff
that I talked about, like how
far we are from it and stuff
like that, there's like
one chapter about that
in the beginning.
Maybe the second
chapter has something
about different pathways.
But the bulk of the book is
really about the question of,
if and when we do
reach the ability
to create human-level machine
intelligence-- so machines
that are as good as we
are in computer science,
so they can start to improve
themselves-- what happens then?
And what happens when you
have a superintelligence that
might be extremely powerful?
What are the control
methods that we
could try to apply to achieve a
controlled detonation if there
is going to be an
intelligence explosion?
How could we set up
the initial conditions
to get some kind of
beneficial outcome?
And there are a lot of
initially plausible ways
to solve this problem that
turn out, on closer reflection
not to worry.
That this kind of one
of the types of progress
at have occurred in this field.
It's like a deepening
appreciation
of just how profoundly
difficult this problem is,
of how you could create
something vastly smarter
than you and still ensure
a desirable outcome.
So that's the bulk of the book.
And then the last
two chapters are
trying to think more generally
about these macrostrategic
questions, and how
to think about what
our levers of influence are,
if one wants to increase
the probability of
a desirable outcome.
So I'll put on the
pause there, because I
want to make sure we get a
little bit of discussion in.
Thanks.
[APPLAUSE]
MODERATOR: Thank you, Nick.
We will use the microphone
for questions, please.
AUDIENCE: I'll just
comment quickly.
That 40-year median
for when we'll
achieve human intelligence
is-- I've been tracking that.
It was about 300 to
400 years in 1999.
It was maybe 50 years in 2006.
We took a poll at this
conference at Dartmouth.
And now it's 40 years.
I'm saying 2029, but it's
actually not so far off.
I don't think we're going to
get that far with enhancing
biological intelligence, because
our biological circuits are
just inherently a million
times slower than electronics,
and so there's only so
far you can get that way.
Whole brain emulation is
useful not to create an AI,
but to be able to emulate a
brain, or more likely a portion
of a brain, to establish the
functional description of what
these basic circuits do to
guide our creation of AI.
My view, though, is that we are
emerging with this technology.
I mean, it's already-- during
that one-day SOPA strike,
I felt like part of my
brain went on strike.
And so we're already
enhanced by these devices.
When I was in college, I'd take
my bicycle to the computer,
and now I carry it on my belt.
I believe we will-- these
devices are getting smaller.
I think within,
say, the '20s, '30s,
they'll go inside
our bloodstream
and go into our brain.
Basically put our
neocortex on the cloud
so we can extend the
300 million modules
we have in the neocortex.
In the cloud, there
will be a hybrid.
But I would agree
that ultimately,
the non-biological portion will
be so powerful that it will
dominate, but that's a path to
getting to superintelligence.
But I would argue that
the non-biological portion
is human intelligence.
I don't think it's
non-human just
because it's non-biological.
NICK BOSTROM: Yes,
so whether-- I
mean-- I guess one
doesn't want to be bogged
down in the
terminology of whether,
like-- it seems clear to
me to call it non-human.
But the idea that
it's implemented
in machine substrate
to me doesn't
begin to answer the question
of whether the outcome is
desirable or not.
To me, it would all depend
on exactly what kind
of intelligence is there
in this machine substrate,
and what this is it doing?
What is it using
its resources for?
Like I could-- you
could imagine that we're
discovering that we are
all in a simulation,
we're already all digital.
Like so what?
I mean, that doesn't mean
that human life doesn't
have any moral significance just
because we're not biological,
as we thought.
So in principle, you
could have a digital mind
with exactly the same experience
and capabilities as we do,
and presumably it should
count for the same morally.
However, there are a lot of
really bizarre types of minds
that are possible in principle,
and I think one of the slides
further down, and one of the key
questions that the book tries
to answer, is how can we
think about the motivations
of superintelligent agents?
Is it possible to say
something useful about what
they would want to do?
AUDIENCE: We're all
evolving together.
There's, like, 2
billion people that
are enhanced with
these devices now.
And as they get more
intimate with us,
it's not going to be like these
science futures of movies,
of one evil corporation that's
got got this technology.
It's going to be billions
of us that enhance together,
like it is today.
NICK BOSTROM:
Yeah, so the growth
of collective intelligence.
I mean, I think
that at some point,
the fleshy parts that
are in crania will-- a.,
they will be a lot
harder to enhance,
and they will become just
kind of negligible part
of the actual intelligence
that is created.
And that everything
then depends upon us
having set up the
initial conditions.
So, like, superintelligence
will be extremely powerful.
We have the one advantage, that
we get to make the first move.
And I think we only
get one try there.
Because once you have like an
unfriendly superintelligence,
it will resist you sort
of changing its values.
And so part of what makes
the problem so challenging
is that you need to get it
right on the first attempt,
and humans are generally
not very good at that.
We like to sort of see how
things work out and patch
things up and learn
from experience.
AUDIENCE: I want
to explore what you
mean when you say a desirable
outcome, what desirable means.
There this old philosophical
problem of the utility monster.
It's sort of a challenge to a
utilitarian notion of morality,
which is, imagine that
there's some creature that
wants something more than the
rest of humanity combined,
feeding the one
thing that it wants
because it wants
it so much more.
Maximizes utility, ignoring
the rest of humanity.
So in some sense, the
superintelligence scenario
can give life to the
utility monster in the sense
that if the cognition
after the explosion
is vastly greater than the total
sum of cognition of humanity,
then perhaps the moral
consideration of what
a desirable outcome should be
should only be paying attention
to what it wants,
not what we want.
NICK BOSTROM: Right.
AUDIENCE: So I wanted to
raise that as a challenge.
I'm not advocating
that perspective,
I want to see how you reason
about desirability in a world
where we're coexisting
with superintelligence.
NICK BOSTROM: Yeah,
generally speaking,
it's easier to describe what
an undesirable outcome would
be than a desirable one.
So there are a lot of ways in
which things could turn out
that, by most
reasonable [INAUDIBLE],
we would regard as
pretty worthless.
Like the standard example
in this little literature
is the paper clip maximizer.
So an AI that's
superintelligent, and has
as its only final,
highest-level goal
to maximize the number of
paper clips it produces.
This is a stand-in for
some other arbitrary goal.
But most final
goals, if you think
through how the world would be
structured in order to maximize
the realization of
that goal, would
involve, as a side effect, the
elimination of human beings
and everything we care about.
So if you're a superintelligence
that's a singleton,
and you want to make sure
there's as many paper clips as
possible, for a start, you'd
want to get rid of all humans.
Because maybe we'll want
to switch off or something
like that, and then there'll
be fewer paperclips.
We also have bodies that
are full of juicy atoms that
could be used to make some
really nice paper clips.
So then you think, OK,
that's not do paper clips.
That's ridiculous.
But then you think
of something else.
Like what about an
AI who only wants
to calculate decimal
expansion of pi?
So similarly, such
an AI would want
to maximize the
amount of hardware
it has so it can make more rapid
progress in this calculation.
And it actually turns out to
be quite difficult to specify
a goal that would not be
maximally realized in a world
where not just human biological
organisms are extinct,
but also anything we would
possibly place value on
is eradicated.
AUDIENCE: So the
premise there is
that-- I want to really focus
on the premise, because I think
the argument hinges
on it-- that we're
taking a snapshot
of what it is we
value today, where "we" includes
the things that we consider
to be adequately
cognitive today.
And we are ignoring in our
definition of desirability--
Let's go to the extreme of
the paper clip scenario.
A utilitarian might say, well,
OK, if it wants paper clips,
but its overall cognition
is vastly greater
than the rest of
humanity as a whole,
well, then that's what it wants,
so the weighted definition
of desirability should be
to maximize paper clips,
because that's what it wants.
NICK BOSTROM: Well,
OK, so there are
different versions
of utilitarianism.
There is preference
satisfactionism,
which I think is
what you alluded to,
which would stipulate some sort
of social welfare function that
is maximized by fully-satisfied
single preferences that exist.
There's a big problem of how to
aggregate them, but something
along those lines.
Other utilitarians would say
maximize pleasure or maximize
happiness or maximize some
other part, the common feature
being that the
value of the whole
is, as it were, the sum
of the value of the parts.
If you thought preference
satisfaction is and was
correct, you might
want to design agents
with easy-to-satisfy
preferences.
Like they want there to be
at least three prime numbers
or something like that,
and then you're done.
And then maybe to have as many
as possible of those agents.
Like the minimum
agent that would
count as a morally
considerable being.
But that seems like a fairly
impossible moral view.
But one can decompose this big
sort of problem into two parts.
On the one hand, you have
the technical problem
of-- if you specified some
value in human language,
like whether it's to maximize
happiness or freedom or love
or creativity, whatever it
is, that how could you sort of
embed that into a seed
AI, like an AI that's
destined eventually to
become a superintelligence?
So this is like an
enormous technical problem.
Because like in C++, you
don't have a primitive saying
"happiness," right?
You have to define
all of these terms.
Some goals would be
feasible, like maybe
to calculate as
many digits of pi.
It's something we
could do today.
Others, like, there's this big,
unsolved technical problem.
But then on top
of that, you also
have the second problem, which
is the value selection problem,
like trying to figure out which
value it is that you would want
to get in there in
the first place.
And both of these are places
where we could easily stumble.
So just to reflect on,
like-- if the idea was
to try to do some AI that
was ethical, or maximally
always did the morally
right thing to do,
if we try to achieve that
by just creating a list
or somehow embedding our current
best understanding of ethics
into a final goal, we should
reflect that if any earlier
age had done this
with their values,
it would have been what we
can now see are catastrophes.
Earlier ages were condoning
slavery or human sacrifice,
and all kinds of abuses of
different minorities and stuff.
And presumably even
though we might
have made some progress
towards moral enlightenment,
we haven't gotten
all the way there.
So it would be
important to preserve
the possibility for moral
growth in the value selection.
And so there are a
number of different paths
that each should be
explored, because we're still
at such an early stage here.
But maybe one of the
more promising one
is this idea of
indirect normativity
that I describe
in the book, which
is the idea that rather than
trying to take explicitly
characterized some
desired end state,
we try to motivate the AI to
pursue some process whereby it
can find out what it is that we
were trying to work out when we
were working with this problem.
So suppose you could give
the AI the goal of doing
that, which we would have
asked it to do if we had had,
like, 40,000 years to
think about this question,
and if we ourselves
had been smarter,
and if we had known more facts.
So now we don't know
what that is, currently.
But it's an empirical question
that we could then hopefully
leverage the AI's
superior intelligence
to make a better estimate of.
And then that kind of
indirectly specified goal
might then be more likely to
produce an outcome that we
would recognize, on reflection,
as being worthwhile.
AUDIENCE: So I have a story.
The other day, I was reading
some of the news and analysis
about the crisis
in the Middle East,
and I guess I spent like
an hour thinking about it.
And I didn't come up with a
solution for the Middle East.
NICK BOSTROM: Ah, darn.
AUDIENCE: Now if I had been
a speedy superintelligence,
and in that hour I had spent
1,000 hours of thinking,
I think I still wouldn't
have come up with a solution.
So I think there are some
problems for which intelligence
by itself isn't the answer.
And you know, as humans,
we put sapiens in our name.
We think intelligence
is really important,
but it's not the only attribute.
I don't think it
solves all problems.
NICK BOSTROM: Yeah.
So I mean, I agree with that.
A lot of sort of sociopolitical
problems in the human realm
often depend on people with
conflicting preferences.
There might just not
success one solution
that would maximally
please everybody.
And with the case
of the AI, I mean,
I think that in fact, the most
important problem to work on
is not the intelligence
problem, which
hastens the day
where we'll have it,
but rather this control problem.
How to ensure that it would
deploy its intelligence in ways
that are not harmful.
And just briefly, there's,
I think, two broad classes
of control method that's
one can envisage here.
So one is capability
control method,
where you try to limit
what the AI is able to do.
So maybe put it in a box.
You unplug the ethernet cable.
You only allow it to communicate
by typing text on a screen,
let's say.
Maybe only even answers to
questions that are posted.
And you try to clip its wings.
And I think that
those can be important
during the development phase,
like before you actually
are ready to launch your system.
But ultimately, I don't
think they are the answer.
Because in order for
this AI to actually
have any effect on the
world, it will at some point
need to interact with it.
Like if you literally just
had an isolated box that
didn't closely interact with the
world, yes, it could be safe,
but it would also not
do anything at all.
But as soon as you have, say,
a human being communicating
with it, then you
have a weak link here.
Like humans are
not secure systems.
And even humans often succeed
in manipulating or tricking
or deluding other humans to
do their-- like scam artists.
And so if you had like a
superhumanly powerful persuader
and manipulator,
chances are eventually,
it would find a way to talk
its way out of the box.
Unless it could just
hack its way out,
like by-- so there are things
like, we think, oh, well,
we'll just put it in a box.
If we don't talk
to it, it's safe.
Well, maybe there's
some unanticipated way
that we haven't thought of,
like by wiggling its electrons
around in its
circuitry, maybe it
could create
electromagnetic waves that
could influence a
nearby apparatus
or something like that.
So then we think, oh,
put it in a Faraday cage.
But OK, so if we just keep
patching up all the flaws
that we can find,
then we will just
patch up all the
ones we can find,
but there are probably some more
ones that we can't think of.
And then it will
use one of those.
So the second class
of control method
is motivation selection
methods, where
instead of, or in
addition to, trying
to limit what the
system can do, you
try to engineer its
motivation system,
so that it would not
want to cause harm.
And that's then where this
indirect normativity comes in,
as one version of
that, and there
are many other many
other aspects of that.
And that's, I think, the
problem that ultimately we'll
need to solve.
AUDIENCE: So if you'd use these
two mechanisms to control it,
still, it comes back
to this question
on the other side
of the equation.
Like it somehow turns
its fitness function
into the will to dominate us,
because of its will to survive.
But we also have
that will to survive,
and even though
we make mistakes,
it seems like the argument
of a superintelligence coming
to completely dominate us
requires a lapse of attention
on our part, in our own
promotion of our desire
to survive, for long
enough for it to actually
be irretrievable.
So have you considered
that-- it seems
like even in all of the horrific
things that you've described
that could happen if a
superintelligence did
come to dominate, there would
be that take-off duration
period where we would presumably
wake up and unplug it.
NICK BOSTROM: Well, one would
imagine, if the developers are
somewhat sensible, that
they wouldn't actually
permit the take-off unless
they at least believed
that the system was safe.
So imagine a scenario where
they have maybe falsely deluded
themselves that there is
no flaw in their system.
Or maybe they're just
worried that there's
this competitor who's soon
going to release another system.
So even if they haven't spent
enough time on the safety, they
still--
But you have to
take into account
that you're dealing with
an intelligent adversary.
So even just a human-level
mind in this situation
could figure out that
it has an incentive
to pretend to be nice, whether
or not it actually is nice.
Like when you're weak and, at
the mercy of your programmers,
who are inspecting
you and seeing
if you're ready to be released,
and if you're an unfriendly AI,
you would want to sort of behave
cooperatively and pleasingly
and all of these things.
Like it can plan
ahead to that extent.
And only once you are sort
of strong enough that it
doesn't matter whether
anybody tries to stop you,
because they can't-- only
then would it be safe for you
to reveal your true nature.
So there is this fundamental
flaw in the-- so this
is one of those initially
plausible ideas that
don't seem to work.
Like you develop your AI.
You keep it in a sandbox,
like a secure environment,
and you watch it for a while
to see that it behaves nicely.
And only once you've
seen that it's
cooperative and
nice and friendly
there do you let it out.
And the flaw is that
there is this possibility
for strategic behavior,
that unfriendly AIs could
mimic a friendly AI.
And you mentioned something
about this survival desire.
So there is something like
that, but it looks different.
So we humans have--
we don't really
have a clean agent architecture.
There's not, like, one
final goal for most of us.
And there are lot
of different drives
that rise and fall in strength,
depending on the time of day
and the environment we're in.
But if you have this
architecture where
there is a clearly-defined
final goal,
and everything else
is pursued only
by virtue of being
conducive to the attainment
of this final goal,
then there are a couple
of theses that I
think help you think
about that kind of structure.
So on the one hand, you have
the orthogonality thesis,
as I call it.
This is the idea that values
and intelligence are orthogonal.
You could have virtually
any combination of them.
Like a really smart system could
be really benevolent or really
evil or have some bizarre goal,
like paper clips, or something
human-meaningful.
There's no necessary
ontological connection.
On the other hand, you also have
this instrumental convergence
thesis, which says that
for almost any final goal
and almost any
environment, there
will be certain instrumental
values that you will recognize
once you're smart enough.
For example, the value to
prevent your own death.
And so if you're a paper clip
maximizer, the only reason
that you don't want
to die, it's not
because you sort of
value being alive.
It's just that you predict
that there will be fewer paper
clips if you are
switched off today.
Because if you're
still around tomorrow,
you will still be working
to make more paper clips.
And similarly, goal
content preservation.
You can predict that if
somebody changed your goals,
then tomorrow,
you will no longer
be working to make paper clips.
Now you will be working
to make staplers,
and then there will
be fewer paper clips.
So you, being a paper
clip maximizer today,
will want to prevent somebody
from changing your goals.
And there are others, like
acquiring more material
resources, or enhancing your own
intelligence so that you become
better able to realize
whatever your goal are.
And it's that combination
between the lack
of any necessary connection
between final goal
and intelligence, and these
convergence instrumental
reasons to just
do things that are
inconsistent with
human values, that
creates the intrinsic
danger there.
You have to engineer a very
particular kind of final goal
to-- have a final goal such
that if it's actually maximally
pursued by a
superintelligence, would
be consistent with
human survival.
Maybe something that
kind of embeds within it
the same values that we have.
MODERATOR: So we've been talking
a lot about hypothetical stuff.
What about some concrete
stuff, namely policymakers?
So we're talking
here about scenarios
that are potentially
very dangerous
and that may scare
policymakers, whom
we know are technologically not
at the level of this audience
and may start making decisions
which will slow down or impede
the progress, or maybe even
ban computer science that
tries to do AI research
because of the fears
that crop up in some of that.
What are your thoughts on
the policymaking process
and legislature
process around issues
of artificial intelligence?
And can we expect
that, you know,
like computer scientists are
one day labeled as terrorists?
NICK BOSTROM: I don't
think that that's
very likely for various reasons.
It's hard at the moment
to see exactly what it
is-- even if policymakers were
willing to do something, what
they could actually do
that would be helpful,
rather than harmful.
At the moment, what
needs to be done,
I think, is more
foundational work
to build up a clear
understanding of what precisely
the problem is.
And then ultimately, it's mostly
a technical research challenge
to work out the solution
to this control problem.
It requires some top-notch
mathematical talent
working together with
theoretical computer scientists
and maybe some
philosophical expertise
to really crack this problem.
It's very hard to see how,
like, from some high level
of government-- so it's
a very blunt instrument.
And you might, even with the
best intentions at the start,
like at the top, once it filters
down through the bureaucracy,
it might have a very
different effect
than the one you intended.
So there are some
other existential risks
where I think it would
be easier to imagine
ways in which
regulation could help.
AI is particularly difficult.
Even just to understand what
the problem is is quite hard.
And it's hard to
imagine a scenario,
at least in the next
couple of decades,
where we would have some
kind of sane thing coming
from political processes.
Maybe the closest would be
like more funding for work
on the control problem.
But even that, once
it sort of filters
through the vested
interest and academia,
will probably
translate into a rain
of funding falling on a
wide range of superficially
related areas that might
not actually have anything
to do with the control problem,
like general computer security
or something like that.
But there are other
things that can be done.
So there are some organizations
that are working on this.
So we are doing some
work at the Future
of Humanity Institute at Oxford.
Another is the Machine
Intelligence Research
Institute, MIRI, at Berkeley.
They have some excellent
people, as well.
That would be an obvious thing.
And generally try to recruit
some of the brightest minds
of our generation and
the next generation,
to sort of focus on this.
At the moment,
worldwide, maybe there
are half a dozen people
or so, equivalent,
working full-time on
this, which is not
in proportion to the
importance of the problem.
It's a more general issue.
I did a little literature
survey a couple of years ago.
I just compared a number of
academic papers on the dung
beetle compared to a
number on human extinction.
And sad to tell you that
there was more than an order
of magnitude more
on the dung beetle.
So the positive spin
on that is that there
are enormous opportunities
for somebody who actually does
care to make a big difference.
Like even one
extra person or one
like extra million
or something, can
do a lot of good there,
because it's so neglected.
AUDIENCE: So regarding
policy and political things,
I think the general
underlying principle here
is that modern governments
are like big battleships
or big tanks.
They do very well against
large, stationary targets,
but against small,
mobile targets,
they're extremely ineffective.
And so if AI were like
nuclear weapons, where
in order to produce it, you
need these giant, static
manufacturing facilities
that are very expensive
and they're like
fixed in one place
so you can see where it is, then
the political aspect, how you
regulate it and whether you
regulate it, is very important.
But artificial intelligence
isn't like that.
You can develop it from
anywhere in the world.
Your computer
might cost $10,000,
and it might be anywhere in the
world, since you can do things
through the cloud.
And when governments
try to handle
these sorts of small
mobile targets,
like individual websites
or individual people
on the internet,
it doesn't really
matter, compared to the
nuclear weapons case,
very much what kinds
of things they do.
Because governments just
can't hit that kind of target.
It's like, you know,
piracy of software
is, in theory, punishable
by whatever penalty.
But as we see everywhere in the
world, those kinds of things
are totally ineffective at
achieving their stated goals.
NICK BOSTROM: It
depends a little bit
on what the scenarios
here that we're having.
Like, say, if there were some
scenario in which they would
try to prevent AI from
ever being developed,
I think that's a lot
more far-fetched.
And slightly more possible
scenarios where it became
clear which products
were going to succeed,
and that it was going ahead, and
then they would acquire that.
Like they would nationalize it.
But then that doesn't
solve the problem.
That just means that now
you have an encapsulation.
So maybe it's all placed
under the federal government,
and they have military
guarding the whole thing,
but you would still have the
same people inside, basically
working on the same problem.
And so that that outcome,
scenario, might not
make that much difference
one way or the other.
You still have the same basic
technical problem inside.
And it's also unclear
to what extent
it would be possible
for non-experts
to really be able to exert
micro-level influence
on the precise design of the AI.
I mean, you have
to know what you're
doing to be able to do that.
I think-- I mean, things that
they could do in general,
there are indirect things.
So working harder to achieve
global peace and coordination
would help with a lot of
problems, including AI.
Maybe it makes it easier.
And in the future, if there
were like a race dynamic
between different
countries, that they
could join together and do one
joint thing, rather than racing
to get there first and then
having to cut back on safety.
There are things that
could be done, maybe,
to facilitate biological
cognitive enhancement.
If that was the will,
you could certainly
imagine different kinds
of funding and policies
for accessing and linking
different databases that
could be done, and stuff like
that, that would be useful.
So there are potentially
cost-effective,
indirect ways of
approaching this problem,
in addition to directly
working on the control problem.
There are these other levers
that one could also consider.
Particularly on things that
we are still quite far away
from the relevant crunch time.
AUDIENCE: Hi, there.
I was just curious.
You're one of the
world's experts
in superintelligence and
the extensional risks.
Personally speaking,
informally, intuitively,
do you think we're
gonna make it?
[LAUGHTER]
NICK BOSTROM: Uh, yeah.
I mean, it's-- I
think that the, uh--
[LAUGHTER]
NICK BOSTROM: I mean, like,
I mean-- yeah, probably
like less than 50% risk of doom.
But I don't know exactly
what the number is.
I mean, the more important
question, I guess,
is what is the best
way to push it down.
So that's where most of the
mental energy is going into.
[LAUGHTER]
MODERATOR: So with that,
please thank our guest today.
[APPLAUSE]
MODERATOR: Thank you, Nick.
