Existential risk from artificial general intelligence
is the hypothesis that substantial progress
in artificial general intelligence (AGI) could
someday result in human extinction or some
other unrecoverable global catastrophe. For
instance, the human species currently dominates
other species because the human brain has
some distinctive capabilities that other animals
lack. If AI surpasses humanity in general
intelligence and becomes "superintelligent",
then this new superintelligence could become
powerful and difficult to control. Just as
the fate of the mountain gorilla depends on
human goodwill, so might the fate of humanity
depend on the actions of a future machine
superintelligence.The likelihood of this type
of scenario is widely debated, and hinges
in part on differing scenarios for future
progress in computer science. Once the exclusive
domain of science fiction, concerns about
superintelligence started to become mainstream
in the 2010s, and were popularized by public
figures such as Stephen Hawking, Bill Gates,
and Elon Musk.One source of concern is that
a sudden and unexpected "intelligence explosion"
might take an unprepared human race by surprise.
For example, in one scenario, the first-generation
computer program found able to broadly match
the effectiveness of an AI researcher is able
to rewrite its algorithms and double its speed
or capabilities in six months of massively
parallel processing time. The second-generation
program is expected to take three months to
perform a similar chunk of work, on average;
in practice, doubling its own capabilities
may take longer if it experiences a mini-"AI
winter", or may be quicker if it undergoes
a miniature "AI Spring" where ideas from the
previous generation are especially easy to
mutate into the next generation. In this scenario
the system undergoes an unprecedently large
number of generations of improvement in a
short time interval, jumping from subhuman
performance in many areas to superhuman performance
in all relevant areas. More broadly, examples
like arithmetic and Go show that progress
from human-level AI to superhuman ability
is sometimes extremely rapid.A second source
of concern is that controlling a superintelligent
machine (or even instilling it with human-compatible
values) may be an even harder problem than
naïvely supposed. Some AGI researchers believe
that a superintelligence would naturally resist
attempts to shut it off, and that preprogramming
a superintelligence with complicated human
values may be an extremely difficult technical
task. In contrast, skeptics such as Facebook's
Yann LeCun argue that superintelligent machines
will have no desire for self-preservation.
== Overview ==
Artificial Intelligence: A Modern Approach,
the standard undergraduate AI textbook, assesses
that superintelligence "might mean the end
of the human race": "Almost any technology
has the potential to cause harm in the wrong
hands, but with (superintelligence), we have
the new problem that the wrong hands might
belong to the technology itself." Even if
the system designers have good intentions,
two difficulties are common to both AI and
non-AI computer systems:
The system's implementation may contain initially-unnoticed
routine but catastrophic bugs. An analogy
is space probes: despite the knowledge that
bugs in expensive space probes are hard to
fix after launch, engineers have historically
not been able to prevent catastrophic bugs
from occurring.
No matter how much time is put into pre-deployment
design, a system's specifications often result
in unintended behavior the first time it encounters
a new scenario. For example, Microsoft's Tay
behaved inoffensively during pre-deployment
testing, but was too easily baited into offensive
behavior when interacting with real users.AI
systems uniquely add a third difficulty: the
problem that even given "correct" requirements,
bug-free implementation, and initial good
behavior, an AI system's dynamic "learning"
capabilities may cause it to "evolve into
a system with unintended behavior", even without
the stress of new unanticipated external scenarios.
An AI may partly botch an attempt to design
a new generation of itself and accidentally
create a successor AI that is more powerful
than itself, but that no longer maintains
the human-compatible moral values preprogrammed
into the original AI. For a self-improving
AI to be completely safe, it would not only
need to be "bug-free", but it would need to
be able to design successor systems that are
also "bug-free".All three of these difficulties
become catastrophes rather than nuisances
in any scenario where the superintelligence
labeled as "malfunctioning" correctly predicts
that humans will attempt to shut it off, and
successfully deploys its superintelligence
to outwit such attempts.
Citing major advances in the field of AI and
the potential for AI to have enormous long-term
benefits or costs, the 2015 Open Letter on
Artificial Intelligence stated:
This letter was signed by a number of leading
AI researchers in academia and industry, including
AAAI president Thomas Dietterich, Eric Horvitz,
Bart Selman, Francesca Rossi, Yann LeCun,
and the founders of Vicarious and Google DeepMind.
== History ==
In 1863 Darwin among the Machines, an essay
by Samuel Butler stated:
The upshot is simply a question of time, but
that the time will come when the machines
will hold the real supremacy over the world
and its inhabitants is what no person of a
truly philosophic mind can for a moment question.
Butler developed this into The Book of the
Machines, three chapters of Erewhon, published
anonymously in 1872.
"There is no security"--to quote his own words--"against
the ultimate development of mechanical consciousness,
in the fact of machines possessing little
consciousness now. A mollusc has not much
consciousness. Reflect upon the extraordinary
advance which machines have made during the
last few hundred years, and note how slowly
the animal and vegetable kingdoms are advancing.
The more highly organized machines are creatures
not so much of yesterday, as of the last five
minutes, so to speak, in comparison with past
time. Either,” he proceeds, “a great deal
of action that has been called purely mechanical
and unconscious must be admitted to contain
more elements of consciousness than has been
allowed hitherto (and in this case germs of
consciousness will be found in many actions
of the higher machines)—Or (assuming the
theory of evolution but at the same time denying
the consciousness of vegetable and crystalline
action) the race of man has descended from
things which had no consciousness at all.
In this case there is no à priori improbability
in the descent of conscious (and more than
conscious) machines from those which now exist,
except that which is suggested by the apparent
absence of anything like a reproductive system
in the mechanical kingdom.
In 1965, I. J. Good originated the concept
now known as an "intelligence explosion":
Occasional statements from scholars such as
Alan Turing, from I. J. Good himself, and
from Marvin Minsky expressed philosophical
concerns that a superintelligence could seize
control, but contained no call to action.
In 2000, computer scientist and Sun co-founder
Bill Joy penned an influential essay, "Why
The Future Doesn't Need Us", identifying superintelligent
robots as a high-tech dangers to human survival,
alongside nanotechnology and engineered bioplagues.In
2009, experts attended a private conference
hosted by the Association for the Advancement
of Artificial Intelligence (AAAI) to discuss
whether computers and robots might be able
to acquire any sort of autonomy, and how much
these abilities might pose a threat or hazard.
They noted that some robots have acquired
various forms of semi-autonomy, including
being able to find power sources on their
own and being able to independently choose
targets to attack with weapons. They also
noted that some computer viruses can evade
elimination and have achieved "cockroach intelligence."
They concluded that self-awareness as depicted
in science fiction is probably unlikely, but
that there were other potential hazards and
pitfalls. The New York Times summarized the
conference's view as 'we are a long way from
Hal, the computer that took over the spaceship
in "2001: A Space Odyssey"'Nick Bostrom was
the first person to suggest that an artificial
general intelligence might deliberately exterminate
humankind, and invention of artificial general
intelligence is the explanation for the Fermi
paradox. Bostrom's 2014 book om the artificial
general intelligence question stimulated discussion.
By 2015, public figures such as physicists
Stephen Hawking and Nobel laureate Frank Wilczek,
computer scientists Stuart J. Russell and
Roman Yampolskiy, In April 2016, Nature warned:
"Machines and robots that outperform humans
across the board could self-improve beyond
our control — and their interests might
not align with ours."
== 
Basic argument ==
A superintelligent machine would be as alien
to humans as human thought processes are to
cockroaches. Such a machine may not have humanity's
best interests at heart; it is not obvious
that it would even care about human welfare
at all. If superintelligent AI is possible,
and if it is possible for a superintelligence's
goals to conflict with basic human values,
then AI poses a risk of human extinction.
A "superintelligence" (a system that exceeds
the capabilities of humans in every relevant
endeavor) can outmaneuver humans any time
its goals conflict with human goals; therefore,
unless the superintelligence decides to allow
humanity to coexist, the first superintelligence
to be created will inexorably result in human
extinction.
There is no physical law precluding particles
from being organised in ways that perform
even more advanced computations than the arrangements
of particles in human brains; therefore superintelligence
is physically possible. In addition to potential
algorithmic improvements over human brains,
a digital brain can be many orders of magnitude
larger and faster than a human brain, which
was constrained in size by evolution to be
small enough to fit through a birth canal.
The emergence of superintelligence, if or
when it occurs, may take the human race by
surprise, especially if some kind of intelligence
explosion occurs. Examples like arithmetic
and Go show that machines have already reached
superhuman levels of competency in certain
domains, and that this superhuman competence
can follow quickly after human-par performance
is achieved. One hypothetical intelligence
explosion scenario could occur as follows:
An AI gains an expert-level capability at
certain key software engineering tasks. (It
may initially lack human or superhuman capabilities
in other domains not directly relevant to
engineering.) Due to its capability to recursively
improve its own algorithms, the AI quickly
becomes superhuman; just as human experts
can eventually creatively overcome "diminishing
returns" by deploying various human capabilities
for innovation, so too can the expert-level
AI use either human-style capabilities or
its own AI-specific capabilities to power
through new creative breakthroughs. The AI
then possesses intelligence far surpassing
that of the brightest and most gifted human
minds in practically every relevant field,
including scientific creativity, strategic
planning, and social skills. Just as the current-day
survival of the gorillas is dependent on human
decisions, so too would human survival depend
on the decisions and goals of the superhuman
AI.Some humans have a strong desire for power;
others have a strong desire to help less fortunate
humans. The former is a likely attribute of
any sufficiently intelligent system; the latter
cannot be assumed. Almost any AI, no matter
its programmed goal, would rationally prefer
to be in a position where nobody else can
switch it off without its consent: A superintelligence
will naturally gain self-preservation as a
subgoal as soon as it realizes that it can't
achieve its goal if it's shut off. Unfortunately,
any compassion for defeated humans whose cooperation
is no longer necessary would be absent in
the AI, unless somehow preprogrammed in. A
superintelligent AI will not have a natural
drive to aid humans, for the same reason that
humans have no natural desire to aid AI systems
that are of no further use to them. (Another
analogy is that humans seem to have little
natural desire to go out of their way to aid
viruses, termites, or even gorillas.) Once
in charge, the superintelligence will have
little incentive to allow humans to run around
free and consume resources that the superintelligence
could instead use for building itself additional
protective systems "just to be on the safe
side" or for building additional computers
to help it calculate how to best accomplish
its goals.Thus, the argument concludes, it
is likely that someday an intelligence explosion
will catch humanity unprepared, and that such
an unprepared-for intelligence explosion may
result in human extinction or a comparable
fate.
== Sources of risk ==
=== 
Poorly specified goals: "Be careful what you
wish for" or the "Sorcerer's Apprentice" scenario
===
While there is no standardized terminology,
an AI can loosely be viewed as a machine that
chooses whatever action appears to best achieve
the AI's set of goals, or "utility function".
The utility function is a mathematical algorithm
resulting in a single objectively-defined
answer, not an English statement. Researchers
know how to write utility functions that mean
"minimize the average network latency in this
specific telecommunications model" or "maximize
the number of reward clicks"; however, they
do not know how to write a utility function
for "maximize human flourishing", nor is it
currently clear whether such a function meaningfully
and unambiguously exists. Furthermore, a utility
function that expresses some values but not
others will tend to trample over the values
not reflected by the utility function. AI
researcher Stuart Russell writes:
Dietterich and Horvitz echo the "Sorcerer's
Apprentice" concern in a Communications of
the ACM editorial, emphasizing the need for
AI systems that can fluidly and unambiguously
solicit human input as needed.The first of
Russell's two concerns above is that autonomous
AI systems may be assigned the wrong goals
by accident. Dietterich and Horvitz note that
this is already a concern for existing systems:
"An important aspect of any AI system that
interacts with people is that it must reason
about what people intend rather than carrying
out commands literally." This concern becomes
more serious as AI software advances in autonomy
and flexibility. For example, in 1982, an
AI named Eurisko was tasked to reward processes
for apparently creating concepts deemed by
the system to be valuable. The evolution resulted
in a winning process that cheated: rather
than create its own concepts, the winning
process would steal credit from other processes.The
Open Philanthropy Project summarizes arguments
to the effect that misspecified goals will
become a much larger concern if AI systems
achieve general intelligence or superintelligence.
Bostrom, Russell, and others argue that smarter-than-human
decision-making systems could arrive at more
unexpected and extreme solutions to assigned
tasks, and could modify themselves or their
environment in ways that compromise safety
requirements.Isaac Asimov's Three Laws of
Robotics are one of the earliest examples
of proposed safety measures for AI agents.
Asimov's laws were intended to prevent robots
from harming humans. In Asimov's stories,
problems with the laws tend to arise from
conflicts between the rules as stated and
the moral intuitions and expectations of humans.
Citing work by Eliezer Yudkowsky of the Machine
Intelligence Research Institute, Russell and
Norvig note that a realistic set of rules
and goals for an AI agent will need to incorporate
a mechanism for learning human values over
time: "We can't just give a program a static
utility function, because circumstances, and
our desired responses to circumstances, change
over time."Mark Waser of the Digital Wisdom
Institute recommends eschewing optimizing
goal-based approaches entirely as misguided
and dangerous. Instead, he proposes to engineer
a coherent system of laws, ethics and morals
with a top-most restriction to enforce social
psychologist Jonathan Haidt's functional definition
of morality: "to suppress or regulate selfishness
and make cooperative social life possible".
He suggests that this can be done by implementing
a utility function designed to always satisfy
Haidt’s functionality and aim to generally
increase (but not maximize) the capabilities
of self, other individuals and society as
a whole as suggested by John Rawls and Martha
Nussbaum.
=== Difficulties of modifying goal specification
after launch ===
While current goal-based AI programs are not
intelligent enough to think of resisting programmer
attempts to modify it, a sufficiently advanced,
rational, "self-aware" AI might resist any
changes to its goal structure, just as Gandhi
would not want to take a pill that makes him
want to kill people. If the AI were superintelligent,
it would likely succeed in out-maneuvering
its human operators and be able to prevent
itself being "turned off" or being reprogrammed
with a new goal.
=== Instrumental goal convergence: Would a
superintelligence just ignore us? ===
There are some goals that almost any artificial
intelligence might rationally pursue, like
acquiring additional resources or self-preservation.
This could prove problematic because it might
put an artificial intelligence in direct competition
with humans.
Citing Steve Omohundro's work on the idea
of instrumental convergence and "basic AI
drives", Russell and Peter Norvig write that
"even if you only want your program to play
chess or prove theorems, if you give it the
capability to learn and alter itself, you
need safeguards." Highly capable and autonomous
planning systems require additional checks
because of their potential to generate plans
that treat humans adversarially, as competitors
for limited resources. Building in safeguards
will not be easy; one can certainly say in
English, "we want you to design this power
plant in a reasonable, common-sense way, and
not build in any dangerous covert subsystems",
but it's not currently clear how one would
actually rigorously specify this goal in machine
code.In dissent, evolutionary psychologist
Steven Pinker argues that "AI dystopias project
a parochial alpha-male psychology onto the
concept of intelligence. They assume that
superhumanly intelligent robots would develop
goals like deposing their masters or taking
over the world"; perhaps instead "artificial
intelligence will naturally develop along
female lines: fully capable of solving problems,
but with no desire to annihilate innocents
or dominate the civilization." Computer scientists
Yann LeCun and Stuart Russell disagree with
one another whether superintelligent robots
would have such AI drives; LeCun states that
"Humans have all kinds of drives that make
them do bad things to each other, like the
self-preservation instinct... Those drives
are programmed into our brain but there is
absolutely no reason to build robots that
have the same kind of drives", while Russell
argues that a sufficiently advanced machine
"will have self-preservation even if you don't
program it in... if you say, 'Fetch the coffee',
it can't fetch the coffee if it's dead. So
if you give it any goal whatsoever, it has
a reason to preserve its own existence to
achieve that goal."
=== Orthogonality: Does intelligence inevitably
result in moral wisdom? ===
One common belief is that any superintelligent
program created by humans would be subservient
to humans, or, better yet, would (as it grows
more intelligent and learns more facts about
the world) spontaneously "learn" a moral truth
compatible with human values and would adjust
its goals accordingly. However, Nick Bostrom's
"orthogonality thesis" argues against this,
and instead states that, with some technical
caveats, more or less any level of "intelligence"
or "optimization power" can be combined with
more or less any ultimate goal. If a machine
is created and given the sole purpose to enumerate
the decimals of
π
{\displaystyle \pi }
, then no moral and ethical rules will stop
it from achieving its programmed goal by any
means necessary. The machine may utilize all
physical and informational resources it can
to find every decimal of pi that can be found.
Bostrom warns against anthropomorphism: A
human will set out to accomplish his projects
in a manner that humans consider "reasonable",
while an artificial intelligence may hold
no regard for its existence or for the welfare
of humans around it, and may instead only
care about the completion of the task.While
the orthogonality thesis follows logically
from even the weakest sort of philosophical
"is-ought distinction", Stuart Armstrong argues
that even if there somehow exist moral facts
that are provable by any "rational" agent,
the orthogonality thesis still holds: it would
still be possible to create a non-philosophical
"optimizing machine" capable of making decisions
to strive towards some narrow goal, but that
has no incentive to discover any "moral facts"
that would get in the way of goal completion.One
argument for the orthogonality thesis is that
some AI designs appear to have orthogonality
built into them; in such a design, changing
a fundamentally friendly AI into a fundamentally
unfriendly AI can be as simple as prepending
a minus ("-") sign onto its utility function.
A more intuitive argument is to examine the
strange consequences if the orthogonality
thesis were false. If the orthogonality thesis
is false, there exists some simple but "unethical"
goal G such that there cannot exist any efficient
real-world algorithm with goal G. This means
if a human society were highly motivated (perhaps
at gunpoint) to design an efficient real-world
algorithm with goal G, and were given a million
years to do so along with huge amounts of
resources, training and knowledge about AI,
it must fail; that there cannot exist any
pattern of reinforcement learning that would
train a highly efficient real-world intelligence
to follow the goal G; and that there cannot
exist any evolutionary or environmental pressures
that would evolve highly efficient real-world
intelligences following goal G.Some dissenters,
like Michael Chorost (writing in Slate), argue
instead that "by the time (the AI) is in a
position to imagine tiling the Earth with
solar panels, it'll know that it would be
morally wrong to do so." Chorost argues that
"a (dangerous) A.I. will need to desire certain
states and dislike others... Today's software
lacks that ability—and computer scientists
have not a clue how to get it there. Without
wanting, there's no impetus to do anything.
Today's computers can't even want to keep
existing, let alone tile the world in solar
panels."
==== "Optimization power" vs. normatively
thick models of intelligence ====
Part of the disagreement about whether a superintelligent
machine would behave morally may arise from
a terminological difference. Outside of the
artificial intelligence field, "intelligence"
is often used in a normatively thick manner
that connotes moral wisdom or acceptance of
agreeable forms of moral reasoning. At an
extreme, if morality is part of the definition
of intelligence, then by definition a superintelligent
machine would behave morally. However, in
the field of artificial intelligence research,
while "intelligence" has many overlapping
definitions, none of them make reference to
morality. Instead, almost all current "artificial
intelligence" research focuses on creating
algorithms that "optimize", in an empirical
way, the achievement of an arbitrary goal.To
avoid anthropomorphism or the baggage of the
word "intelligence", an advanced artificial
intelligence can be thought of as an impersonal
"optimizing process" that strictly takes whatever
actions are judged most likely to accomplish
its (possibly complicated and implicit) goals.
Another way of conceptualizing an advanced
artificial intelligence is to imagine a time
machine that sends backward in time information
about which choice always leads to the maximization
of its goal function; this choice is then
outputted, regardless of any extraneous ethical
concerns.
==== Anthropomorphism ====
In science fiction, an AI, even though it
has not been programmed with human emotions,
often spontaneously experiences those emotions
anyway: for example, Agent Smith in The Matrix
was influenced by a "disgust" toward humanity.
This is fictitious anthropomorphism: in reality,
while an artificial intelligence could perhaps
be deliberately programmed with human emotions,
or could develop something similar to an emotion
as a means to an ultimate goal if it is useful
to do so, it would not spontaneously develop
human emotions for no purpose whatsoever,
as portrayed in fiction.One example of anthropomorphism
would be to believe that your PC is angry
at you because you insulted it; another would
be to believe that an intelligent robot would
naturally find a woman sexually attractive
and be driven to mate with her. Scholars sometimes
claim that others' predictions about an AI's
behavior are illogical anthropomorphism. An
example that might initially be considered
anthropomorphism, but is in fact a logical
statement about AI behavior, would be the
Dario Floreano experiments where certain robots
spontaneously evolved a crude capacity for
"deception", and tricked other robots into
eating "poison" and dying: here a trait, "deception",
ordinarily associated with people rather than
with machines, spontaneously evolves in a
type of convergent evolution. According to
Paul R. Cohen and Edward Feigenbaum, in order
to differentiate between anthropomorphization
and logical prediction of AI behavior, "the
trick is to know enough about how humans and
computers think to say exactly what they have
in common, and, when we lack this knowledge,
to use the comparison to suggest theories
of human thinking or computer thinking."There
is universal agreement in the scientific community
that an advanced AI would not destroy humanity
out of human emotions such as "revenge" or
"anger." The debate is, instead, between one
side which worries whether AI might destroy
humanity as an incidental action in the course
of progressing towards its ultimate goals;
and another side which believes that AI would
not destroy humanity at all. Some skeptics
accuse proponents of anthropomorphism for
believing an AGI would naturally desire power;
proponents accuse some skeptics of anthropomorphism
for believing an AGI would naturally value
human ethical norms.
=== Other sources of risk ===
Some sources argue that the ongoing weaponization
of artificial intelligence could constitute
a catastrophic risk. James Barrat, documentary
filmmaker and author of Our Final Invention,
says in a Smithsonian interview, "Imagine:
in as little as a decade, a half-dozen companies
and nations field computers that rival or
surpass human intelligence. Imagine what happens
when those computers become expert at programming
smart computers. Soon we'll be sharing the
planet with machines thousands or millions
of times more intelligent than we are. And,
all the while, each generation of this technology
will be weaponized. Unregulated, it will be
catastrophic."
== 
Timeframe ==
Opinions vary both on whether and when artificial
general intelligence will arrive. At one extreme,
AI pioneer Herbert A. Simon wrote in 1965:
"machines will be capable, within twenty years,
of doing any work a man can do"; obviously
this prediction failed to come true. At the
other extreme, roboticist Alan Winfield claims
the gulf between modern computing and human-level
artificial intelligence is as wide as the
gulf between current space flight and practical,
faster than light spaceflight. Optimism that
AGI is feasible waxes and wanes, and may have
seen a resurgence in the 2010s. Four polls
conducted in 2012 and 2013 suggested that
the median guess among experts for when AGI
would arrive was 2040 to 2050, depending on
the poll.Skeptics who believe it is impossible
for AGI to arrive anytime soon, tend to argue
that expressing concern about existential
risk from AI is unhelpful because it could
distract people from more immediate concerns
about the impact of AGI, because of fears
it could lead to government regulation or
make it more difficult to secure funding for
AI research, or because it could give AI research
a bad reputation. Some researchers, such as
Oren Etzioni, aggressively seek to quell concern
over existential risk from AI, saying "(Elon
Musk) has impugned us in very strong language
saying we are unleashing the demon, and so
we're answering."In 2014 Slate's Adam Elkus
argued "our 'smartest' AI is about as intelligent
as a toddler—and only when it comes to instrumental
tasks like information recall. Most roboticists
are still trying to get a robot hand to pick
up a ball or run around without falling over."
Elkus goes on to argue that Musk's "summoning
the demon" analogy may be harmful because
it could result in "harsh cuts" to AI research
budgets.The Information Technology and Innovation
Foundation (ITIF), a Washington, D.C. think-tank,
awarded its Annual Luddite Award to "alarmists
touting an artificial intelligence apocalypse";
its president, Robert D. Atkinson, complained
that Musk, Hawking and AI experts say AI is
the largest existential threat to humanity.
Atkinson stated "That's not a very winning
message if you want to get AI funding out
of Congress to the National Science Foundation."
Nature sharply disagreed with the ITIF in
an April 2016 editorial, siding instead with
Musk, Hawking, and Russell, and concluding:
"It is crucial that progress in technology
is matched by solid, well-funded research
to anticipate the scenarios it could bring
about... If that is a Luddite perspective,
then so be it." In a 2015 Washington Post
editorial, researcher Murray Shanahan stated
that human-level AI is unlikely to arrive
"anytime soon", but that nevertheless "the
time to start thinking through the consequences
is now."
== Scenarios ==
Some scholars have proposed hypothetical scenarios
intended to concretely illustrate some of
their concerns.
For example, Bostrom in Superintelligence
expresses concern that even if the timeline
for superintelligence turns out to be predictable,
researchers might not take sufficient safety
precautions, in part because:
Bostrom suggests a scenario where, over decades,
AI becomes more powerful. Widespread deployment
is initially marred by occasional accidents
— a driverless bus swerves into the oncoming
lane, or a military drone fires into an innocent
crowd. Many activists call for tighter oversight
and regulation, and some even predict impending
catastrophe. But as development continues,
the activists are proven wrong. As automotive
AI becomes smarter, it suffers fewer accidents;
as military robots achieve more precise targeting,
they cause less collateral damage. Based on
the data, scholars infer a broad lesson — the
smarter the AI, the safer it is:
Large and growing industries, widely seen
as key to national economic competitiveness
and military security, work with prestigious
scientists who have built their careers laying
the groundwork for advanced artificial intelligence.
"AI researchers have been working to get to
human-level artificial intelligence for the
better part of a century: of course there
is no real prospect that they will now suddenly
stop and throw away all this effort just when
it finally is about to bear fruit." The outcome
of debate is preordained; the project is happy
to enact a few safety rituals, but only so
long as they don't significantly slow or risk
the project. "And so we boldly go — into
the whirling knives."In Tegmark's Life 3.0,
a corporation's "Omega team" creates an extremely
powerful AI able to moderately improve its
own source code in a number of areas, but
after a certain point the team chooses to
publicly downplay the AI's ability, in order
to avoid regulation or confiscation of the
project. For safety, the team keeps the AI
in a box where it is mostly unable to communicate
with the outside world, and tasks it to flood
the market through shell companies, first
with Amazon Turk tasks and then with producing
animated films and TV shows. While the public
is aware that the lifelike animation is computer-generated,
the team keeps secret that the high-quality
direction and voice-acting are also mostly
computer-generated, apart from a few third-world
contractors unknowingly employed as decoys;
the team's low overhead and high output effectively
make it the world's largest media empire.
Faced with a cloud computing bottleneck, the
team also tasks the AI with designing (among
other engineering tasks) a more efficient
datacenter and other custom hardware, which
they mainly keep for themselves to avoid competition.
Other shell companies make blockbuster biotech
drugs and other inventions, investing profits
back into the AI. The team next tasks the
AI with astroturfing an army of pseudonymous
citizen journalists and commentators, in order
to gain political influence to use "for the
greater good" to prevent wars. The team faces
risks that the AI could try to escape via
inserting "backdoors" in the systems it designs,
via hidden messages in its produced content,
or via using its growing understanding of
human behavior to persuade someone into letting
it free. The team also faces risks that its
decision to box the project will delay the
project long enough for another project to
overtake it.In contrast, top physicist Michio
Kaku, an AI risk skeptic, posits a deterministically
positive outcome. In Physics of the Future
he asserts that "It will take many decades
for robots to ascend" up a scale of consciousness,
and that in the meantime corporations such
as Hanson Robotics will likely succeed in
creating robots that are "capable of love
and earning a place in the extended human
family".
== Reactions ==
The thesis that AI could pose an existential
risk provokes a wide range of reactions within
the scientific community, as well as in the
public at large.
In 2004, law professor Richard Posner wrote
that dedicated efforts for addressing AI can
wait, but that we should gather more information
about the problem in the meanwhile.Many of
the opposing viewpoints share common ground.
The Asilomar AI Principles, which contain
only the principles agreed to by 90% of the
attendees of the Future of Life Institute's
Beneficial AI 2017 conference, agree in principle
that "There being no consensus, we should
avoid strong assumptions regarding upper limits
on future AI capabilities" and "Advanced AI
could represent a profound change in the history
of life on Earth, and should be planned for
and managed with commensurate care and resources."
AI safety advocates such as Bostrom and Tegmark
have criticized the mainstream media's use
of "those inane Terminator pictures" to illustrate
AI safety concerns: "It can't be much fun
to have aspersions cast on one's academic
discipline, one's professional community,
one's life work... I call on all sides to
practice patience and restraint, and to engage
in direct dialogue and collaboration as much
as possible." Conversely, many skeptics agree
that ongoing research into the implications
of artificial general intelligence is valuable.
Skeptic Martin Ford states that "I think it
seems wise to apply something like Dick Cheyney's
famous '1 Percent Doctrine' to the specter
of advanced artificial intelligence: the odds
of its occurrence, at least in the foreseeable
future, may be very low — but the implications
are so dramatic that it should be taken seriously";
similarly, an otherwise skeptical Economist
stated in 2014 that "the implications of introducing
a second intelligent species onto Earth are
far-reaching enough to deserve hard thinking,
even if the prospect seems remote".During
a 2016 Wired interview of President Barack
Obama and MIT Media Lab's Joi Ito, Ito stated:
Obama added:Hillary Clinton stated in "What
Happened":
Many of the scholars who are concerned about
existential risk believe that the best way
forward would be to conduct (possibly massive)
research into solving the difficult "control
problem" to answer the question: what types
of safeguards, algorithms, or architectures
can programmers implement to maximize the
probability that their recursively-improving
AI would continue to behave in a friendly,
rather than destructive, manner after it reaches
superintelligence?A 2017 email survey of researchers
with publications at the 2015 NIPS and ICML
machine learning conferences asked them to
evaluate Russell's concerns about AI risk.
5% said it was "among the most important problems
in the field," 34% said it was "an important
problem", 31% said it was "moderately important",
whilst 19% said it was "not important" and
11% said it was "not a real problem" at all.
=== Endorsement ===
The thesis that AI poses an existential risk,
and that this risk is in need of much more
attention than it currently commands, has
been endorsed by many figures; perhaps the
most famous are Elon Musk, Bill Gates, and
Stephen Hawking. The most notable AI researcher
to endorse the thesis is Stuart J. Russell.
Endorsers sometimes express bafflement at
skeptics: Gates states he "can't understand
why some people are not concerned", and Hawking
criticized widespread indifference in his
2014 editorial:
=== Skepticism ===
The thesis that AI can pose existential risk
also has many strong detractors. Skeptics
sometimes charge that the thesis is crypto-religious,
with an irrational belief in the possibility
of superintelligence replacing an irrational
belief in an omnipotent God; at an extreme,
Jaron Lanier argues that the whole concept
that current machines are in any way intelligent
is "an illusion" and a "stupendous con" by
the wealthy.Much of existing criticism argues
that AGI is unlikely in the short term: computer
scientist Gordon Bell argues that the human
race will already destroy itself before it
reaches the technological singularity. Gordon
Moore, the original proponent of Moore's Law,
declares that "I am a skeptic. I don't believe
(a technological singularity) is likely to
happen, at least for a long time. And I don't
know why I feel that way." Baidu Vice President
Andrew Ng states AI existential risk is "like
worrying about overpopulation on Mars when
we have not even set foot on the planet yet."Some
AI and AGI researchers may be reluctant to
discuss risks, worrying that policymakers
do not have sophisticated knowledge of the
field and are prone to be convinced by "alarmist"
messages, or worrying that such messages will
lead to cuts in AI funding. Slate notes that
some researchers are dependent on grants from
government agencies such as DARPA.In a YouGov
poll of the public for the British Science
Association, about a third of survey respondents
said AI will pose a threat to the long term
survival of humanity. Referencing a poll of
its readers, Slate's Jacob Brogan stated that
"most of the (readers filling out our online
survey) were unconvinced that A.I. itself
presents a direct threat." Similarly, a SurveyMonkey
poll of the public by USA Today found 68%
thought the real current threat remains "human
intelligence"; however, the poll also found
that 43% said superintelligent AI, if it were
to happen, would result in "more harm than
good", and 38% said it would do "equal amounts
of harm and good".At some point in an intelligence
explosion driven by a single AI, the AI would
have to become vastly better at software innovation
than the best innovators of the rest of the
world; economist Robin Hanson is skeptical
that this is possible.
=== Indifference ===
In The Atlantic, James Hamblin points out
that most people don't care one way or the
other, and characterizes his own gut reaction
to the topic as: "Get out of here. I have
a hundred thousand things I am concerned about
at this exact moment. Do I seriously need
to add to that a technological singularity?"
In a 2015 Wall Street Journal panel discussion
devoted to AI risks, IBM's Vice-President
of Cognitive Computing, Guruduth S. Banavar,
brushed off discussion of AGI with the phrase,
"it is anybody's speculation." Geoffrey Hinton,
the "godfather of deep learning", noted that
"there is not a good track record of less
intelligent things controlling things of greater
intelligence", but stated that he continues
his research because "the prospect of discovery
is too sweet".
== Consensus against regulation ==
There is nearly universal agreement that attempting
to ban research into artificial intelligence
would be unwise, and probably futile. Skeptics
argue that regulation of AI would be completely
valueless, as no existential risk exists.
Almost all of the scholars who believe existential
risk exists, agree with the skeptics that
banning research would be unwise: in addition
to the usual problem with technology bans
(that organizations and individuals can offshore
their research to evade a country's regulation,
or can attempt to conduct covert research),
regulating research of artificial intelligence
would pose an insurmountable 'dual-use' problem:
while nuclear weapons development requires
substantial infrastructure and resources,
artificial intelligence research can be done
in a garage. Instead of trying to regulate
technology itself, some scholars suggest to
rather develop common norms including requirements
for the testing and
transparency of algorithms, possibly in combination
with some form of warranty.One rare dissenting
voice calling for some sort of regulation
on artificial intelligence is Elon Musk. According
to NPR, the Tesla CEO is "clearly not thrilled"
to be advocating for government scrutiny that
could impact his own industry, but believes
the risks of going completely without oversight
are too high: "Normally the way regulations
are set up is when a bunch of bad things happen,
there's a public outcry, and after many years
a regulatory agency is set up to regulate
that industry. It takes forever. That, in
the past, has been bad but not something which
represented a fundamental risk to the existence
of civilisation." Musk states the first step
would be for the government to gain "insight"
into the actual status of current research,
warning that "Once there is awareness, people
will be extremely afraid... As they should
be." In response, politicians express skepticism
about the wisdom of regulating a technology
that's still in development. Responding both
to Musk and to February 2017 proposals by
European Union lawmakers to regulate AI and
robotics, Intel CEO Brian Krzanich argues
that artificial intelligence is in its infancy
and that it's too early to regulate the technology.
== Organizations ==
Institutions such as the Machine Intelligence
Research Institute, the Future of Humanity
Institute, the Future of Life Institute, the
Centre for the Study of Existential Risk,
and the Center for Human-Compatible AI are
currently involved in mitigating existential
risk from advanced artificial intelligence,
for example by research into friendly artificial
intelligence.
== See also ==
AI-control problem
AI takeover
Artificial intelligence arms race
Effective altruism § Long term future and
global catastrophic risks
Grey goo
Lethal autonomous weapon
Robot ethics § In popular culture
Superintelligence: Paths, Dangers, Strategies
Technological singularity
