- So, good evening, everybody.
Welcome to Stern.
So, my name is Natalia Levina.
I'm the director of
the Fubon Center for Technology,
Business and Innovation.
If you haven't heard
about the Fubon Center yet
or registered without
noticing the center's profile,
our center launched in April
of 2018.
So, we are, strictly speaking,
right about one year old.
The mission of our center is to
support
interdisciplinary work related
to
the topics of technology and
innovation.
I am lucky to have three amazing
co-directors of the center,
two of whom are here.
Kathleen DeRose, she's leading
the Fintech part of our center.
And, in a minute, I'll
introduce my colleague,
Professor Foster Provost.
And we are also lucky to have
Professor Melissa Schilling,
whom I'm sure some of you
had as your instructor,
as I see many familiar
faces of my former students.
And, with that said, we are very
fortunate
to launch our new series on
artificial intelligence in
business.
Which is led by Professor Foster
Provost.
And this is the first
of our series of talks.
Today's talk, as you know,
focuses on machine learning
and artificial intelligence.
On April 15, in case you
missed it,
put it in your bookmarks,
we have the next talk.
Which is focusing on the
topic of algorithmic fairness.
And, in September,
in case you forget about
us over the summer,
we will have another talk
focused on the issues of
artificial intelligence
and the changing nature of work.
You can find wonderful videos
from our prior events on our
website.
So, if you missed our Fintech
conference,
which was held in October,
you can find all the talks
and the notes for the talks there.
And, finally, in case you really
want to
capture and share the
great insights from today
with your colleagues,
we will have full
details of today's event
posted in a few days.
And anybody who's registered,
on site or ahead of time,
will receive a link to the
video.
So, don't worry if you miss a
few phrases.
You'll have the video.
Without further ado, I'd like to
introduce
my colleague and co-director
of the Fubon Center,
Professor Foster Provost.
Professor Provost is
professor of data science
and information systems.
He's Andre Meyer Faculty Fellow
at NYU Stern School of Business.
He is one of the co-directors
of the Fubon Center.
And he's also the former
director
of the Center for Data Science
that some of you might have
heard about.
Which was one of the earliest
interdisciplinary initiatives
around data science at NYU.
He previously served
as editor-in-chief
of the journal
Machine Learning
and was elected as a founding
member
of the International
Machine Learning Society.
Foster's research has won
numerous awards.
Most recently, he won the
2017 European Research
Paper of the Year Award.
2016, he won the best paper
in one of our top journals,
Information Systems Research.
Perhaps worth mentioning is
a particularly important best
paper award:
he received the best paper
award of the ACM SIGKDD Conference
across three decades.
So, in academia, we
think that's real quality
when you win the best
paper over 30 years award.
Professor Provost has extensive
experience
not only in academia,
but also in business.
He worked for five years in
industry
after receiving his PhD
in computer science.
In particular, as part of that
experience,
he won the president's award
at NYNEX Science and Technology.
That's now Verizon.
In case people are young and
don't know what NYNEX is.
His book, Data Science for
Business,
is a perennial best seller
and is used in many,
many data science courses
across the world.
And in business schools and
beyond.
Foster has designed and.
(computer chime)
Sorry.
Foster also.
Oh, yes, I am almost done.
There are a lot of good
things I can say about you.
(laughter)
So, I just want to mention
that Professor Provost
has designed data science
architecture
for a number of startups.
You can find them on his
website.
But most notable, and a recent
award,
is that CFO Tech Outlook in 2019
named one of the companies
he's involved with now,
Detectica, one of
the biggest innovators
of the year in AI.
So, without further ado.
And sorry for a lengthy
introduction.
I'd like to welcome Professor
Provost.
(applause)
- Thanks a lot, Natalia.
And thanks, everybody.
So, Natalia mentioned this:
this is our inaugural event in
our series
on demystifying artificial
intelligence and machine
learning.
I don't know about you.
But I seem to be encountering
them more and more
when I read the newspaper
or when I read Business Week
or when I read just about
anything.
And I've been studying and
building AI systems for 30
years.
And I'm mystified by what
the heck this stuff is.
I read articles.
I don't know what the heck is
going on.
You know, it seems like
either there's some dark magic
that's sort of inside there that
is going to automate away all of
our jobs.
As apparently some of my
colleagues in business schools
are telling their students.
And we should either be
terrified by this
or anxiously awaiting all the
leisure
that this is going to give us.
So, some people think there's
gonna be
cars driving themselves around
the streets of Manhattan.
And there's actual discussion,
not just science fiction,
about when is it that machines
are gonna become smarter than we
are?
By the way, aren't they already?
So, we thought we would
try to do our part here
at the Fubon Center to demystify
this at least a little bit
piece by piece.
And, to that end, I have
got the great pleasure
of welcoming my friend and
colleague, Pedro Domingos.
Pedro is professor in computer
science
at the University of Washington.
And he's head of machine
learning research
here at D.E. Shaw.
And he's the author of the
book The Master Algorithm:
How the Quest for the
Ultimate Learning Machine
Will Remake Our World.
Which, if you haven't
read it yet, read it.
It's great.
It was.
I guess it came out in 2015.
And, in 2016, Bill Gates
recommended it
as one of the two books
that you have to read
about artificial intelligence.
And I think, last year,
China's president, Xi Jinping,
was caught with it on his
bookshelf.
I don't actually know how you
get caught,
as president of China,
get caught with something
on your bookshelf.
But I really like that part of
the story.
So, Pedro and I got to know each
other
as young AI guys back in the
mid-90s,
and we've continued to sort of
work on
similar topics over and over
again.
Pedro has won the top innovation
award in data science.
He's a fellow of the Association
for
the Advancement of
Artificial Intelligence.
He's received a Fulbright
Scholarship, a Sloan Fellowship,
the National Science
Foundation's Career Award,
and lots of best paper awards.
As for his papers,
I tell my students:
if it's written by Pedro
Domingos, just read it.
So, one more thing.
Natalia mentioned this.
But I just wanted to give
another plug
before we actually see the
part of the show you came for,
which is Pedro's talk.
Yes, in two weeks, we have
the second installment
of our series.
Because tonight's was
postponed due to the snow day,
they're close to each other.
But we're gonna have Solon
Barocas,
who is faculty at Cornell
and a researcher at
Microsoft Research's
New York City lab,
and one of the world's experts
at the confluence of ethics,
law, and machine learning.
And he's gonna come and
talk to us about the fact that,
as you'll learn a little from
Pedro,
a machine learning algorithm
is a computer program
that writes other computer
programs.
And, increasingly, the
computer programs that
are doing things in
business and in government
are actually written by
other computer programs.
So we have to ask ourselves:
what are the ethical implications,
and is there even a
strong ethical foundation?
And this is what Solon is an
expert in.
And so, he's going to come and
talk to us
about algorithmic fairness and
business.
And so, that'll be two
weeks from this evening.
Okay, so, without further ado,
let me turn the mic over to
Pedro.
And thanks so much for
spending some time with us.
(applause)
- Alright, thank you,
everybody, for being here.
Thanks, Foster, for bringing me
here.
It's a particular pleasure
given how long we've known each
other
and how different things were
when we started out doing AI.
So, as Foster mentioned,
there's a lot of
excitement about AI today.
But also a lot of confusion.
Some people, like Sundar
Pichai of Google,
think it's the greatest
invention in history since fire.
Others think it's a bunch of
hype.
There's a lot of discussion.
Some of it quite heated.
There's also a lot of people
who have to make decisions.
Business decisions, policy
decisions,
personal and financial decisions
that have to do with AI.
And, at the same time,
people's understanding of AI
is really incomplete and
often at odds with reality.
So, what I would like to
do, in the next 45 minutes,
is maybe try to give a
little bit of perspective
on what AI really is and
what it can and can't do.
'Cause I think one of
the biggest points of confusion
is what can we do with AI today
and what can we not do yet?
And then I think, with that
understanding,
then all our debates and
decisions
can be much more informed.
So, what is AI?
AI is getting computers to do
things
that previously could
only be done by humans.
Well, that's a very ambitious
and also very broad goal.
It includes things like, for
example,
reasoning, problem solving,
planning,
acquiring common-sense
knowledge and using it,
understanding speech and
language,
understanding what we see in the
world,
being able to navigate through
the world
and manipulate things,
and, very importantly, learning.
And, associated with each
of these capabilities,
there's a subfield of AI.
In particular, machine
learning,
which Foster and I have both
worked in,
is the subfield
of AI that deals with
getting computers to learn.
Now, there have been two major
eras of AI.
There was the AI Summer circa
the '80s.
And that was the first coming of
AI.
But it didn't really succeed.
And the reason we are
where we are today is that
the field changed in how
it approaches problems.
The approach that kind of
prevailed in the '70s and '80s
was what are now called
knowledge-based systems.
And the idea here is that
we manually encode into the
computer
all the knowledge that it
needs to solve problems.
If we wanted to do medical
diagnosis,
we interview a bunch of doctors
and write down the rules by
which they diagnose people.
This unfortunately failed
because of what is called
the knowledge acquisition
bottleneck.
There's just too much knowledge
and you're never done acquiring
more and more knowledge,
more and more rules.
And it's never enough.
It's always very brittle.
This caused us to change our
emphasis
to doing something else.
And this started really
taking off in the '90s.
And then, today, it's
what really underlies
all the big successes of AI.
And that is to do machine learning.
And the idea of machine learning
is that
we're not going to program
the computer how to do things.
We're gonna have the
computer learn from data.
By observing people.
By trying to figure out what's
going on.
And this tends to work much
better.
And, in particular, it scales
much better.
Because, as we get more data,
in some sense, we get
more knowledge for free.
And, as the amount of data
available
has been exploding in recent
years,
computers have started
getting smarter and smarter
by taking advantage of it
using machine learning.
So, this may seem like black
magic, like Foster was saying.
How do computers do that?
How exactly does a
computer learn from data?
Well, there's five main ways.
One of them is to just work
a little bit like scientists
using the scientific method.
Fill in the gaps in
your existing knowledge.
Make hypotheses.
Test them.
Refine them.
This is one.
Probably the oldest one.
Another one, which is actually
the dominant one today,
is to emulate the brain.
Your brain is the
greatest learning machine
the world has ever seen.
So, as usual, when engineers
are behind the competition,
what they do is they reverse
engineer it.
So, let's reverse
engineer the competition.
And this is what is called deep
learning.
It's also known as neural
networks and connectionism.
Another one is to simulate
evolution.
Evolution produces amazing
things like you and me.
So, why don't we do it on the
computer?
The fourth one is predicated
on the observation
that all knowledge that
is learned from data
is necessarily uncertain.
We're never sure.
And this is one of the big
barriers.
So, what we should do is
quantify the uncertainty
and then systematically reduce
it.
The more we reduce it, the more
we know.
And, finally, there's this
approach
that's very common in human
beings
in how they solve problems,
which is to notice similarities
between new situations and old
ones
and then, by analogy, try to
infer
what to do in the new
situation from the old one.
And, associated with
each of these approaches,
there's a whole school of
thought.
There's a whole paradigm
in machine learning.
And what I'm gonna do here is
give you
just the highlights of what each
one does
and some of the examples of the
things
that have been done with it.
So, the first one is
associated with the symbolists.
It has its origins in
logic and philosophy.
One of the interesting things
about machine learning,
or at least for my taste,
is that every paradigm has its
roots
in a different field of science.
So, under the guise of
studying machine learning,
you can study all sorts of
things and never be bored.
Each of these schools also
has its own master algorithm.
An algorithm that you can use,
in principle, to learn
any knowledge from data.
The master algorithm of the
symbolists
is what is called inverse
deduction.
And we'll see in just a second
what it is.
For each of these algorithms,
there's a theorem that says,
if you give it enough data,
it will learn anything.
Of course, whether you can learn
with a realistic amount of
computing is a different matter.
But at least those theorems are
there.
Then there are the
connectionists.
These are the people I mentioned
who are obviously
influenced by neuroscience.
Their master algorithm is
backpropagation.
You all have it in your pocket
right now,
powering the speech
recognition and other things
in your cell phone.
The evolutionaries are the
people who are
influenced by, obviously,
evolutionary biology.
Their master algorithm is
something called genetic
programming.
Or, more generally, genetic
algorithms.
Algorithms inspired by genetics.
The Bayesians come from
statistics.
They're the people who
worry about uncertainty.
And their approach is
probabilistic inference
based on Bayes' theorem,
which is where the name comes
from.
And then, finally, the
analogizers.
The people who see learning and
reasoning
as doing analogies and playing
them out.
They actually have a
diverse set of influences.
Probably the most important
one is from psychology,
because, again, there's a
lot of experimental evidence
that humans do this.
And the most important
algorithm in this class
is what I call kernel machines,
also known as support vector machines.
So, let's start with the
symbolists.
Here are three of the most
prominent
symbolists in the world.
Tom Mitchell at Carnegie Mellon,
Steve Muggleton in the UK,
and Ross Quinlan in Australia.
And the basic idea
behind symbolic learning
is actually a really brilliant
insight.
This is the following.
What is induction?
Machine learning is induction.
It's inducing general
rules from specific facts.
And this is a very hard thing to
do.
What we do, however, know very
well
how to do is the opposite.
It's deduction.
It's going from generals
to specific facts.
In the history of mathematics,
people have, over and over
again,
succeeded by taking something
they didn't understand
and viewing it as the inverse
operation
of something they did
understand.
Like, for example, we can
understand subtraction
as the opposite of addition.
Or integration as the
inverse of differentiation.
So, the idea here is let
us do the same thing,
but with induction and deduction
as the inverse operations.
So, for example, addition
lets you answer questions
like, if I add two and two, what
do I get?
The answer is four.
That's not the deepest
thing you'll hear today.
Subtraction gives us the
answer to the inverse question.
Which is what do I need to add
to two
in order to get four?
And the answer, of course, is
two.
Similarly, deduction lets
you answer questions like,
if I know that Socrates is human
and that humans are mortal,
what can I infer from that?
Well, I can infer that Socrates
is mortal.
Now, that's deduction.
That's easy.
Induction is the opposite.
It's saying, if I know
that Socrates is human,
what else do I need to know
in order to infer that he is
mortal?
And the answer, of course, is,
if I know that humans are
mortal,
then I can infer that Socrates
is mortal.
And now I've just added a new
rule to my knowledge base.
And, in the future, I can
chain that rule with others
to answer questions and make
inferences
in situations that are
potentially very different
from the situation in
which I learned it.
And this ability to learn
composable knowledge
is something that only
the symbolists have.
It's one of the most
important things they have.
Now, of course, I've shown
this in natural language.
And computers don't
understand natural language.
They do this in a formal
language.
Like, for example, first order
logic.
But the basic idea is the same.
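The inverse-deduction idea can be sketched in a few lines of Python. This is a toy illustration of my own, not an actual symbolic learning system: deduction chains rules forward from facts, and induction proposes the missing rule that would make a desired deduction go through.

```python
# Toy sketch of induction as inverse deduction.
# Facts and conclusions are (predicate, subject) pairs;
# rules are (premise_predicate, conclusion_predicate) pairs.

def deduce(facts, rules):
    """Forward chaining: apply rules until no new facts appear."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for pred, subj in list(facts):
                if pred == premise and (conclusion, subj) not in facts:
                    facts.add((conclusion, subj))
                    changed = True
    return facts

def induce(fact, goal):
    """Inverse deduction: propose the rule that turns `fact` into `goal`."""
    (pred, subj), (goal_pred, goal_subj) = fact, goal
    if subj == goal_subj:
        # e.g. from "Socrates is human" and "Socrates is mortal",
        # propose the general rule "humans are mortal".
        return (pred, goal_pred)
    return None

rule = induce(("human", "Socrates"), ("mortal", "Socrates"))
# The learned rule now chains to situations we never saw:
facts = deduce([("human", "Plato")], [rule])
```

The point of the sketch is the composability mentioned above: the induced rule is an ordinary rule, so it can be chained with others on entirely new subjects.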
And, as I mentioned, this is a
little bit
like scientists at work.
Formulating hypotheses
to explain what they see,
testing them, refining them.
And, in fact, one of the most
exciting
and eye opening, I would say,
applications of symbolic
learning to date
is precisely to automate
science.
So, for example, the
biologist in this picture
is not the guy.
Give me a second here.
The biologist in this
picture is not this guy.
That's a machine learning
researcher
by the name of Ross King.
The biologist in this picture
is the machine in the
background.
Ross is at the University of
Manchester
and he has built a series of
machines.
The first one was called Adam
and the current one is called
Eve.
They are complete scientists in
a box.
You know, very imaginative names.
It starts out with some basic
knowledge
of molecular biology,
DNA, proteins, and so on.
And then it's given, for
example,
a model organism to study.
Like, say, yeast.
And then it starts formulating
hypotheses
using the inverse deduction
process that I just described.
And it actually designs and
carries out, all by itself,
the experiments to test these
hypotheses.
So, what you have there are
things like
microarrays and gene
sequencers and whatnot.
And then, based on that,
it refines the hypothesis
and keeps going.
And, recently, Eve discovered
a new malaria drug
that is now being tested.
And, once you have one
robot scientist like this,
there's nothing keeping
you from making millions.
Imagine you're a biologist.
Instead of having 10 post-docs,
you can have 10 million.
And make progress a million
times faster.
And they don't get grumpy.
They don't need to sleep.
They have no rights.
(laughter)
It's amazing.
Now, the connectionists
look at all this and say,
"Well, yeah, but most learning
is not scientists in a lab coat
or mathematicians doing
deductions with pencil and
paper.
Most learning is done
by people in real life."
And, of course, the learning
engine that
powers all of this is your
brain.
So, if only we could
understand how the brain works,
then everything else would
follow.
And these are the
connectionists.
So called because all the
knowledge
is in the connections
between your neurons.
And here are the three leading
connectionists in the world.
In fact, just last week,
they won the Turing Award.
The Nobel Prize of computer
science.
Highly deserved.
The leader of that whole
school is Geoff Hinton.
And he's actually been
doing this since 1970.
Since he was a grad student.
Against the wishes of his
advisor.
And he really believes that
there's a single algorithm
by which the brain learns and
he's going to discover it.
And he's been at it for 40
years.
In fact, he tells the story of
coming home from work one day
saying, "Yes, I did it.
I figured out how the brain
works."
And his daughter looked
up at him and said,
"Oh, dad, not again."
(laughter)
So, he's persistent.
And his quest has had its ups
and downs.
But it's really starting to pay
off.
In particular, he's one of
the inventors of backprop,
which is, you know, this
algorithm that powers
just about everything these
days.
And two other very important
ones
are Yann LeCun and Yoshua
Bengio.
Yann, of course, is right
here at NYU.
So, how does connectionism work?
Well, let's follow this process
of reverse engineering the
competition.
Do we understand the
competition?
Roughly.
We know roughly how a neuron
works.
So, we can try to build a
computer model of how it works.
And then assemble a network.
Because the brain is a network
of neurons.
And try to make it learn
in the same way, roughly,
that the brain works.
So, how does a neuron work?
What do we need to extract from
it?
Well, a neuron is a really,
really weird kind of cell.
It's a cell that really looks
like a microscopic tree.
There's a trunk called the axon.
This is the axon.
And then it branches out.
The branches are called the
dendrites.
And then the neuron discharges.
Sends these electrical
discharges down the axon
and out through the dendrites.
And then the outgoing
dendrites of one neuron
make contact with the roots
of other neurons that,
confusingly, are also called
dendrites.
And the place where they
meet is called a synapse.
And these synapses can
be more or less efficient
at passing the current.
And, in particular, when two
neurons fire at the same time,
the synapse becomes more
efficient at firing.
Meaning it becomes easier
for the upstream neuron
to make the downstream neuron
fire.
And, to the best of our
knowledge,
everything you know,
everything you ever learned,
is contained in the
strength of your synapses.
In the connections between your
neurons.
And, hence, the name
connectionism.
So, here's a very simple model
of this.
A neuron, what it does is
it takes a bunch of inputs.
They could be other neurons.
They could be the pixels in
the camera, in the retina.
It multiplies each one by a
weight,
which is how efficient the
synapse is,
and adds them up.
Then, if that sum exceeds a
threshold,
it fires.
Otherwise, it doesn't fire.
So, for example, if I put
a picture of a cat here,
it should fire, because it is a
cat.
The output should be one.
If it's not firing, then the
neuron is doing something wrong
and I need to fix it.
And Frank Rosenblatt in the '50s
figured out how to do this for
one neuron.
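That one-neuron learning procedure can be sketched in Python. This is a toy illustration with a made-up task (logical OR), not Rosenblatt's original code: the neuron fires if the weighted sum of its inputs exceeds a threshold, and the perceptron rule nudges the weights whenever the output is wrong.

```python
# Toy sketch of a threshold neuron plus the perceptron update rule.

def fire(inputs, weights, threshold=0.5):
    """Fire (1) if the weighted sum of inputs exceeds the threshold."""
    return 1 if sum(x * w for x, w in zip(inputs, weights)) > threshold else 0

def train(examples, n_inputs, lr=0.1, epochs=20):
    """Perceptron rule: when the output is wrong, shift each weight
    toward the right answer, in proportion to its input."""
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - fire(inputs, weights)
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
    return weights

# Made-up toy task: output 1 if either input is on (OR).
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
w = train(data, n_inputs=2)
# After training, the neuron gets all four examples right.
```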
What people couldn't
figure out for many years
was how to do this when you
have a whole network of neurons.
Your brain is a network of
10 to 100 billion neurons.
And each one of them has a
thousand to 10,000 connections.
So, this is just on a different
scale.
Now, the solution to that
problem
was precisely the
backpropagation algorithm
that Geoff Hinton and others
came up with.
And backprop, at heart,
is a very simple idea.
Backprop, all it does is say:
"The problem I'm trying to
solve is,
if my output is wrong, who do I
blame?"
This is often called
the credit assignment
problem in machine learning.
And backpropagation is the
solution to it.
Maybe it should be called
the blame assignment problem.
It's like who do we blame
when something goes wrong?
And it is like I'm going
to take each neuron
and each weight.
I'm gonna say, "If I tweak
this weight a little bit,
if I increase it,
does the error at my
output go up or down?"
And, if increasing the weight
makes the output error go up,
then I'll decrease the weight.
And what I'm gonna do in
order to make this efficient
is start at the output
and figure this thing out layer
by layer
based on what I've already
found for the downstream layers.
Which is why this is called
error backpropagation
or backprop for short.
Because it's propagating the
errors back
and learning based on that.
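Here's a toy Python sketch of that "who do I blame?" procedure, my own illustration on a single linear neuron rather than a deep network: for each weight, tweak it, check whether the error goes up or down, and move it in the direction that reduces the error. Real backprop computes these slopes analytically, layer by layer from the output backwards, rather than by tweaking.

```python
# Toy sketch of blame assignment by tweaking weights, on a made-up
# one-neuron linear model and made-up data generated by y = 2*x1 + 3*x2.

def error(weights, data):
    """Total squared error of the linear model on the data."""
    return sum((sum(w * x for w, x in zip(weights, xs)) - y) ** 2
               for xs, y in data)

def train(weights, data, lr=0.05, eps=1e-4, steps=200):
    for _ in range(steps):
        for i in range(len(weights)):
            tweaked = list(weights)
            tweaked[i] += eps
            # Does increasing this weight raise or lower the error?
            slope = (error(tweaked, data) - error(weights, data)) / eps
            # Move the weight in the direction that reduces the error.
            weights[i] -= lr * slope
    return weights

data = [([1, 0], 2), ([0, 1], 3), ([1, 1], 5)]
w = train([0.0, 0.0], data)
# w converges to roughly [2, 3], recovering the rule behind the data.
```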
Now, as I mentioned, backprop is
used
for all sorts of things these
days.
But probably the most famous
application
is still the Google Cat Network.
This was actually on the front
page
of the New York Times a few
years ago.
And I never thought one day I'd
see
a machine learning algorithm
on the front page of the New
York Times.
But that's the first time it
happened.
The Google Cat Network was
trained
by looking at YouTube videos.
And, in fact, it was the
journalist John Markoff
that called it the Google Cat
Network.
It can recognize cats and
dogs and people and whatnot.
But it can recognize cats
better than anything else.
Which is why he called it
the Google Cat Network.
And the reason for that
is actually very simple:
there's more video of cats
than of anything else on YouTube,
because people just love to
upload videos of their cats.
So, the network just sat there.
Watched hours and hours of
cat videos and other videos.
Maybe a better name would
have been the Couch Potato Network.
But then, at the end of the day,
it did this amazing thing.
Which is it could actually
recognize
things like cats and dogs.
Which may sound like a
very easy problem to us,
but is extremely hard
for a computer to solve.
And now the results of
this are everywhere.
Now, the evolutionaries,
they look at this and go,
"Well, sure, that's nice that
you can tweak the weights on a
brain
and make it learn that way.
But where did the brain itself
come from?"
In deep learning, we actually
have to design by hand
the architecture of the network.
In fact, we spend a
lot of time doing that.
But, in the case of our brain,
the brain was designed by
evolution.
And we roughly know how
evolution works.
So, why don't we simulate
that on a computer?
Except that instead of
evolving plants and animals,
we're going to evolve programs
and electronic circuits
and so on.
So, John Holland was the guy who
really pushed this idea in the
beginning,
for many years.
John Koza and Hod Lipson are
two more recent people in this
area.
So, how does this work?
Well, you know, John called
it genetic algorithms,
because the algorithm's
inspired by genetics.
And it is like this.
At any point in time, you have
a population of individuals.
In nature,
each one of them is defined by a
genome,
which is a sequence of DNA base
pairs.
For computers, they can
just be bit strings,
'cause it doesn't matter
how many values each position can take.
And then each of those defines a
program.
And then what we do is
execute that program on the
task.
And the ones that do better
will receive a higher fitness
score.
Again, by analogy with biology.
And then the ones with
the highest fitness scores
get to reproduce.
You literally take two of these
programs and you mate them.
You create a new genome that
is a crossover of the father
genome and the mother genome.
And that's a child.
And then you add a few random
mutations.
You make a bunch of children like
this
and you have a new generation.
And the thing that's amazing is,
if you do this starting with
completely random strings,
after a few thousand
generations,
this actually is doing
things that, in many cases,
human engineers can't.
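The loop just described (population, fitness scores, crossover, mutation) can be sketched in Python on a made-up toy task: evolving a bit string of all ones, as a stand-in for evolving a real program or circuit.

```python
# Toy sketch of a genetic algorithm: bit-string genomes, fitness
# scoring, crossover between the fittest, and random mutation.
import random

random.seed(0)  # fixed seed so the run is reproducible
GENOME_LEN, POP_SIZE = 20, 30

def fitness(genome):
    """Made-up task: the more 1s, the fitter."""
    return sum(genome)

def crossover(mother, father):
    """Child genome: a prefix of one parent spliced to a suffix of the other."""
    cut = random.randrange(1, GENOME_LEN)
    return mother[:cut] + father[cut:]

def mutate(genome, rate=0.02):
    """Flip each bit with a small probability."""
    return [1 - g if random.random() < rate else g for g in genome]

# Start from completely random strings.
population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]
for generation in range(60):
    # The fittest half get to reproduce.
    parents = sorted(population, key=fitness, reverse=True)[:POP_SIZE // 2]
    population = [mutate(crossover(random.choice(parents),
                                   random.choice(parents)))
                  for _ in range(POP_SIZE)]

best = max(population, key=fitness)
# After a few dozen generations, the best genome is close to all 1s.
```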
The evolutionaries have actually
gotten a bunch of patents
from the US Patent Office
for things like radios and
amplifiers
invented by genetic algorithms.
Nobody understands exactly how
they work,
because they don't work
according to our principles.
But they actually do their job
better.
And, these days.
Let me skip over this.
These days, the evolutionaries
are actually having a lot of fun
creating not just
circuits or programs anymore,
but actual physical hardware
robots.
So, this little spider
is from Hod Lipson's lab.
And it runs around and runs
away from you and so on.
It's exciting and maybe
also a little scary.
So, if Terminator ever comes to
pass,
maybe this is how it begins.
Of course, these spiders are
nowhere near
ready to take over the world.
But they've come a long way from
the random soup of parts
that they started out as.
In fact, the way to do this is
they start
by doing it in a simulation.
Once the robots have evolved to
a point
where we can actually make them,
then they get 3D printed.
And then, in each generation,
the robots that do best at
whatever task
or measure we're trying to
optimize
get to program the 3D printer
to produce the next generation.
So, this is the state of the
art in evolutionary learning.
Now, the connectionists
and the evolutionaries,
for all the quarrels
they have between them,
have something very important in
common.
Which is they do learning
that is inspired by biology.
Whether it's the brain or
evolution.
Most machine learning
researchers
actually do not think
this is such a great idea.
Because biology is very random.
Who's to say it produces
the optimal result?
What we should do is figure out,
from first principles, what
is the optimal thing to do
and then implement that on the
computer.
And the poster children of this
approach
to machine learning are the
Bayesians.
And, of course, the fundamental
principle
from which all learning derives
for them is Bayes' theorem.
For a Bayesian,
if your algorithm is not
compatible with Bayes' theorem,
it must be wrong.
So, the Bayesians are the most
die hard
of all the machine learning
schools.
And they say so themselves.
Part of this is because,
for 200 years in statistics,
they were a persecuted minority.
So, they had to get very
religious about it to survive.
And they did.
Which is a good thing,
because, these days,
on the back of computer power
and so on,
it's really on the ascendant.
Not just in machine learning.
But in statistics, as well.
So, three famous Bayesians:
David Heckerman, Michael Jordan,
and Judea Pearl, who actually won
the Turing Award a few years ago
for inventing something
called Bayesian networks,
which is a very powerful
representation
for Bayesian learning.
So, what is Bayesian learning?
Well, as I said, the
cornerstone of Bayesian learning
is Bayes' theorem.
And Bayesians love it so much
that,
a few years ago, there was
a Bayesian learning startup
in London, I think, that
actually created
a big neon sign of Bayes'
theorem
and hung it outside their
offices
for the whole city to see.
So, what is Bayes' theorem,
for those of you who don't know
it?
How does machine learning
based on Bayes' theorem work?
Well, the idea is the following.
I have a set of hypotheses
that I could use to explain the
world.
And I don't know which one is
true.
What I have is some probability
that each hypothesis is true.
And I start out with what is
called
the prior probability of the
hypothesis,
which is how much I believe in
it
before I even see any data.
So, this is prior, because
it's prior to seeing evidence.
And then I start seeing data.
And the hypotheses that are
compatible with the data
become more probable
and the ones that aren't
become less probable.
So, the probability of the
data given the hypothesis
is called its likelihood.
And, in fact, this is also what
frequentist statisticians use.
And so, hopefully,
what happens after I've
seen a bunch of data
is that it starts to become clear
which are the good hypotheses
and which are the not-so-good ones.
And the product of the
likelihood
and the prior probability
is called the posterior
probability,
which is how much I
believe in a hypothesis
after seeing the evidence.
And then I have to divide by
this thing
called the marginal to make sure
the probability adds up to one.
But that's not that
important for our purposes.
So, this is a very sound way
to figure out how much I
believe each hypothesis
according to evidence.
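Numerically, the update looks like this. The numbers are made up for illustration: two hypotheses about a coin, updated after observing one head. Multiply each prior by its likelihood, then divide by the marginal so the posteriors sum to one.

```python
# Toy Bayes' theorem update: posterior ∝ prior × likelihood.

priors      = {"fair coin": 0.9, "two-headed coin": 0.1}
likelihoods = {"fair coin": 0.5, "two-headed coin": 1.0}  # P(heads | hypothesis)

# Prior times likelihood for each hypothesis...
unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
# ...divided by the marginal so the probabilities add up to one.
marginal = sum(unnormalized.values())
posteriors = {h: p / marginal for h, p in unnormalized.items()}

print(posteriors)
# fair coin: 0.45/0.55 ≈ 0.818; two-headed coin: 0.10/0.55 ≈ 0.182
```

Seeing a head made the two-headed hypothesis almost twice as probable as it was before, but the fair coin still dominates because its prior was high.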
With powerful hypothesis
classes, like Bayesian networks,
this is extremely
computationally difficult.
And a lot of the smarts of the
Bayesians
have been in coming up with ways
to make that computation viable.
Of course, Moore's Law
has also helped a lot.
And Bayesian learning has
been used for a lot of things.
In particular, your first
self-driving car
will probably have a Bayesian
network
helping the car figure out
what the world is that it's in
and where it is on that map and
so on.
But one application of Bayesian
learning
that we are all familiar
with and grateful for
is spam filtering.
This was actually David
Heckerman's idea.
And the idea is very simple.
I just have two hypotheses.
I'm looking at an email.
One hypothesis is that this
email is spam.
And the other hypothesis
is that it's so-called ham,
or a good email.
And now the evidence is
the contents of the email.
So, for example, if the email
contains
the word free in all capitals,
that makes it more likely to be
spam.
If that word is followed by the
word Viagra,
that makes it way more likely to
be spam.
This is a real example.
On the other hand, if it
contains
your best friend's name
on the signature line,
that makes it more likely to be
ham.
And this works surprisingly
well.
These days, there are spam
filters based on
all sorts of machine learning
ideas.
But this was the first one
and is still one of the most
widely used and most effective.
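A tiny naive-Bayes-style filter captures the idea: two hypotheses, spam and ham, with the words of the email as evidence. The word probabilities below are invented for illustration, not drawn from any real corpus or from Heckerman's actual system.

```python
# A toy naive-Bayes-style spam filter. The word probabilities are
# invented for illustration; unknown words are simply ignored.
import math

# P(word appears | class) for a handful of hypothetical words.
word_probs = {
    "spam": {"FREE": 0.30, "viagra": 0.20, "alice": 0.001},
    "ham":  {"FREE": 0.01, "viagra": 0.001, "alice": 0.10},
}
prior = {"spam": 0.5, "ham": 0.5}

def classify(words):
    """Score each hypothesis by prior times the likelihoods of the
    observed words, working in log space to avoid underflow."""
    scores = {}
    for label in ("spam", "ham"):
        score = math.log(prior[label])
        for w in words:
            if w in word_probs[label]:
                score += math.log(word_probs[label][w])
        scores[label] = score
    return max(scores, key=scores.get)

print(classify(["FREE", "viagra"]))  # words typical of spam
print(classify(["alice"]))           # a friend's name suggests ham
```

Words like "FREE" and "viagra" push the verdict toward spam; a friend's name on the signature line pushes it toward ham, just as in the example above.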
So, finally, the analogizers
have
a very different approach to all
of this.
In fact, in my experience,
this is the approach
that most non-machine-learning
experts find the most intuitive.
Perhaps it's because it's what
we do.
And the idea is, when we
have a problem to solve,
what we do is we find in our
memories a similar problem
in our past experience.
And then we transfer the
solution
from the case that we remember
to the case that we need to
solve now.
And the simplest algorithm of
this type
is something called the
nearest neighbor algorithm.
We'll see in just a second what
that is.
And one of the main people
responsible for establishing
nearest neighbor
as an important algorithm was
Peter Hart.
Vladimir Vapnik created kernel
machines,
which are the most sophisticated,
or at least the most sophisticated
of the widely used,
algorithms in this class.
Another famous analogizer
is Douglas Hofstadter.
Whom you may know as the
author of Gödel, Escher, Bach.
And, in fact, he coined
the term analogizer.
And his most recent book is 500
pages
arguing that everything in
cognition,
learning, reasoning, you name
it,
is all just analogy in action.
So, he really does believe that
analogy is the master algorithm.
So, how does nearest neighbor
work?
Let me introduce that to you
by way of a simple puzzle.
So, here's the puzzle.
A simple exercise.
I give you the map of two
countries.
One is called Posistan and
one is called Negaland.
The plus signs are the
main cities in Posistan
and the minus signs are the
main cities in Negaland.
And that's all I give you.
And now what I ask you is
where is the border between
Posistan and Negaland?
Now, of course, we don't know
exactly,
because the cities don't
determine the border.
But, if you just look at
those plus and minus signs,
you can probably roughly
guess where the border lies.
And nearest neighbor is
exactly the heuristic
for solving this problem.
And the heuristic technique
nearest neighbor follows
is that a point will be in
Posistan
if it's closer to a plus sign,
to a positive city, than
to any negative one.
So, for example, that line right
there
is the set of points that are
equidistant from that plus
and that minus.
Very simple but incredibly
effective.
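Here is a minimal sketch of that heuristic in Python. The city coordinates are invented for illustration; a point is assigned to whichever country has the closest city.

```python
# A minimal nearest-neighbor classifier for the Posistan/Negaland puzzle.
# The city coordinates below are invented for illustration.
import math

# (x, y, label): plus signs are Posistan cities, minus signs Negaland.
cities = [
    (1.0, 1.0, "+"), (2.0, 1.5, "+"), (1.5, 3.0, "+"),
    (5.0, 1.0, "-"), (6.0, 2.0, "-"), (5.5, 3.5, "-"),
]

def nearest_neighbor(x, y):
    """A point is in Posistan if its closest city is a plus sign,
    in Negaland if its closest city is a minus sign."""
    closest = min(cities, key=lambda c: math.hypot(c[0] - x, c[1] - y))
    return closest[2]

print(nearest_neighbor(2.0, 2.0))  # closer to the plus cities
print(nearest_neighbor(5.5, 2.0))  # closer to the minus cities
```

The implied border is exactly the set of points equidistant between the nearest plus and minus cities, which is what the decision rule traces out.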
In fact, what Peter Hart
did back in the '60s
was prove that, despite the
simplicity of this algorithm,
you could learn anything
with it given enough data.
So, you could actually say that
this was,
in some sense, the first real
machine learning algorithm.
The first algorithm that could
learn without limit as
you give it more data.
Prior to that, there were
only statistical algorithms,
like various kinds of linear
classifiers
that could only learn so much.
So, nearest neighbor is really
the first
full-blown machine learning
algorithm.
And it's very successful even
for things
like medical diagnosis, for
example.
You need to diagnose a new
patient.
You know nothing about medicine.
If you have a database of
patient records,
you can just find the patient
with the most similar symptoms
and assume that the diagnosis is
the same.
As dumb as this is,
you give it a database
of a few hundred cases
and it will probably beat human
doctors
at diagnosis of that problem.
Even after all the years in med
school.
Makes you wanna cry if you're a
doctor.
But, hey, I hope none of you
are doctors or a patient.
I mean, for patients, it's good.
Because now medicine can be much
cheaper
and continuously available
on your smartphone and so on.
You know, nearest neighbor
has some shortcomings
that are solved by kernel
machines.
But let me skip over that.
It's a very widely used type of
learning.
And has been for decades.
Probably the most famous
and economically important
application
of this type of learning
is recommender systems.
And the idea behind recommender
systems is the following.
Suppose I'm Netflix and I want
to recommend a movie to you.
In the beginning, people
were trying to do it
using the characteristics of the
movie.
What genre?
Do you like action or do you
like drama?
The actors.
The directors.
But taste is a subtle thing and
that doesn't work very well.
The important insight
was when people said,
"Now, what I'm gonna do
is I'm gonna find people
who have similar tastes to
yours.
I'm gonna find your nearest
neighbors in taste space.
And then, if they like the
movie that you haven't seen,
then I'll recommend that to you,
because, since your tastes are
similar,
you'll probably like it, as
well."
How do I decide how similar
people are?
Well, that's what you give
all those star ratings for.
If I tend to give five stars to
a movie that you give five
stars to and vice versa,
then we have similar tastes.
And your nearest neighbors in
taste space
could be in China or New
Zealand.
It doesn't matter.
If they tend to like or
dislike the same things,
then I can make predictions
based on that.
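The nearest-neighbor-in-taste-space idea can be sketched in a few lines. The users, movies, and ratings below are invented for illustration, and the similarity measure here (negative squared rating difference) is just one simple choice; real systems like Netflix's use far more sophisticated variants.

```python
# A toy sketch of user-based collaborative filtering.
# The users and star ratings below are invented for illustration.

ratings = {
    "you":   {"Movie A": 5, "Movie B": 1, "Movie C": 4},
    "alice": {"Movie A": 5, "Movie B": 2, "Movie C": 5, "Movie D": 5},
    "bob":   {"Movie A": 1, "Movie B": 5, "Movie C": 2, "Movie D": 1},
}

def similarity(u, v):
    """Closeness in taste space: negative sum of squared rating
    differences over movies both users rated (higher = more similar)."""
    common = set(ratings[u]) & set(ratings[v])
    return -sum((ratings[u][m] - ratings[v][m]) ** 2 for m in common)

def recommend(user):
    """Find the nearest neighbor in taste space and recommend their
    highest-rated movie that `user` hasn't seen yet."""
    others = [u for u in ratings if u != user]
    neighbor = max(others, key=lambda u: similarity(user, u))
    unseen = set(ratings[neighbor]) - set(ratings[user])
    return max(unseen, key=lambda m: ratings[neighbor][m])

print(recommend("you"))  # borrowed from the most similar user
```

Here "alice" rates movies much like "you" do, so her favorite unseen movie becomes the recommendation; it wouldn't matter if she were in China or New Zealand.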
Recommender systems are
reportedly about a third
of Amazon's entire business.
Meaning a third of the things
they sell
come from the recommendations.
And, for Netflix, it's 3/4.
So, 3/4 of Netflix's business
comes from the recommender
system.
And, of course, every E-commerce
website worth its salt.
Not to mention the likes
of Spotify and Pandora.
Et cetera, et cetera.
They all use this.
Okay, to summarize,
we met the five main
tribes of machine learning.
We've seen that each one of them
has a problem that it solves
better than all the others.
For the symbolists, it's
knowledge composition.
For the connectionists,
it's credit assignment.
For the evolutionaries,
it's discovering structure.
For Bayesians, it's handling
uncertainty.
And, for the analogizers,
it's using similarity.
And we've also seen that each
one of them
has a master algorithm.
An algorithm that, in principle,
is capable of learning
any knowledge from data.
And the more optimistic members
of each of these tribes,
probably most prominently
the connectionists,
they think that they have it
all.
You're gonna learn
everything using backprop.
But most people in the
field do not believe that.
And the reason is simple.
Is that these five problems are
all real.
And each of these algorithms
only solves one of them.
We're not gonna have a
true master algorithm
until we have an algorithm that
solves all of those five
problems at the same time.
So, the question becomes can we
unify
all these algorithms into a
single true master algorithm
that really does solve
all those five problems?
And, at first, this seems
like a really hard problem.
In fact, some people used
to say it was impossible.
Because, superficially, these
algorithms look so different.
How could you possibly unify
them?
Well, in reality, once you
notice that
they are all made of the
same three components,
then it becomes much easier.
Because, if you unify those
components, then you're done.
So, what are those components?
Well, the first one is
representation.
Representation is the choice of
language
in which you represent
what you've learned.
So, in humans, it could
be English or Chinese
or another natural language.
For programming, it's things
like Java and Python and
whatnot.
In AI, we tend to use
more abstract languages,
like first order logic
or graphical models,
which include Bayesian
networks as a special case.
And, if we can somehow
combine those two into one,
first order logic and graphical
models,
then we actually have
representation
that covers all of the things
that we've talked about.
And, in fact, a number
of ways of doing that
have been developed.
The most successful ones are
what I call
probabilistic logics,
which, as the name implies,
have an element of probability
and an element of logic.
In particular, Markov logic
networks,
which are probably the
most widely used,
are just formulas in first order
logic
with weights attached.
If you really believe in a
formula,
then you give it a high weight.
If not so much, then you
give it a lower weight.
And then a state of the
world is more probable
if more formulas, and formulas
with higher weight,
are true in it.
So, this is the basic idea.
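That basic idea can be sketched directly: the probability of a world is proportional to the exponentiated sum of the weights of the formulas it satisfies. The sketch below is a propositional toy with invented formulas and weights; real Markov logic networks use first-order formulas and their groundings, and avoid brute-force enumeration.

```python
# A minimal sketch of the Markov-logic idea: a world is more probable
# the more (and the more heavily weighted) formulas hold in it.
# The formulas and weights below are invented for illustration.
import math
from itertools import product

# Each entry is (weight, formula), where a formula is a predicate
# over a world (a dict of boolean facts). Higher weight = stronger belief.
weighted_formulas = [
    (2.0, lambda w: not w["smokes"] or w["cancer"]),   # smoking causes cancer
    (1.0, lambda w: not w["friends"] or w["smokes"]),  # friends of smokers smoke
]

def unnormalized(world):
    """exp of the total weight of the formulas true in this world."""
    return math.exp(sum(wt for wt, f in weighted_formulas if f(world)))

# Enumerate all worlds over the three facts to get the normalizer Z.
facts = ["smokes", "cancer", "friends"]
worlds = [dict(zip(facts, vals)) for vals in product([False, True], repeat=3)]
Z = sum(unnormalized(w) for w in worlds)

def probability(world):
    return unnormalized(world) / Z

# A world satisfying both formulas beats one that violates the first.
w_good = {"smokes": True, "cancer": True, "friends": True}
w_bad = {"smokes": True, "cancer": False, "friends": True}
print(probability(w_good) > probability(w_bad))
```

Note that violating a formula doesn't make a world impossible, only less probable, which is what lets logic and probability coexist.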
Now, the next question is,
if I have a representation,
now how do I decide what is a
good
versus a bad hypothesis
in that representation?
I need some kind of objective
function.
Some kind of scoring function
to decide what's good and what's
not.
And people here have
used all sorts of things.
But most of them are special
cases
of the posterior probability
that the Bayesians use.
So, in some sense, we already
have it.
This is the easiest part of the
problem.
And we largely already
have a solution to it.
More generally, however,
the evaluation function
should actually not be
a part of the algorithm.
It should actually be something
that
is provided by the user.
If you're a company.
For example, this could be some
measure
of return on investment
or the click through rate
if you're in the ad business.
If you're a user, maybe
this should be some measure
of your happiness.
Like how much you liked the
movie or the song or whatever.
And then what the algorithm does
is it takes that measure
and finds the hypothesis
that optimizes it.
Which brings us to the last
problem, which is optimization.
It's finding, in that space,
the hypothesis that maximizes
the evaluation function.
And, here, there is a
natural combination of ideas
from the evolutionaries and
from the connectionists.
Remember, a formula in first
order logic
is just a tree of subformulas
combined by AND, OR, NOT,
implies, and so on.
So, we can discover it
using genetic programming.
We have a genome that has
different possible formulas
and we evolve it that way.
But then, of course, our
formulas also have weights.
But, to learn those weights,
we can use backpropagation.
Backpropagation through
the chains of reasoning
by which we solve problems and
answer questions in the past.
And so, at this point,
we have something that
looks pretty close to being a
unification
of all the five master
algorithms.
Now, does that mean that we're
done?
Well, some people, again,
the more optimistic ones,
believe that we are close to
being done.
I actually don't think we're
anywhere close to being done.
And I think the problem,
or one main problem,
is that there are very important
ideas in machine learning
that haven't been discovered
by any of these schools yet.
And, in a way, it's harder
for us, as specialists,
to figure that out,
because we're already
thinking along the tracks
of a particular paradigm.
So, I actually have more hope
that
the answer will come from
people outside the field.
So, if you have any
ideas, please let me know.
So I can publish them.
(laughter)
So, to conclude, looking forward
to
when we have such a universal
learner,
what will it make possible
that is not possible today?
For example, we would all
like to have a home robot
that cooks dinner, does
the dishes, makes the beds,
maybe even looks after the
children.
Why don't we have that yet?
Well, first of all,
you can't do it without machine
learning.
I think there's universal
agreement about that.
But, second of all, a home
robot,
in the process of a normal day,
encounters every one
of those five problems.
And, therefore, it needs an
algorithm, a learning system,
that can handle all five.
And, until we have that,
we won't have home robots.
Here's another one.
Wouldn't it be nice, if,
when you go on the web,
instead of typing in some search
keywords
and seeing some pages
that maybe are relevant,
you could actually just ask
questions and get answers?
It would.
And, in fact, Google,
Microsoft, Facebook, et cetera,
they're all trying to solve this
problem.
Turn the web into a big
knowledge base
that you can then just ask
questions to.
But, of course, in order to do
that,
first of all, you need a
very powerful representation
at the level of first order
logic.
Otherwise, it's not gonna be
very good.
But then the web is full
of contradictory knowledge
and noise and ambiguities and
whatnot.
So, you need probability
for that, as well.
So, again, until we've
unified those five algorithms,
they're not gonna be able
to solve this problem.
Here's another one.
Perhaps the most important one.
Curing cancer.
I've already mentioned that
machine learning algorithms
are very good at medical
diagnosis.
In fact, typically better
than the human doctors
at any one medical diagnosis
problem
if they have the right data to
train on.
But we certainly have not cured
cancer
using machine learning.
Why is that?
The reason is that cancer
is not a single disease.
And, therefore, there will
never be a single drug.
Or very unlikely that it
will ever be a single drug
that cures cancer.
Every patient's tumor is
different.
And so, everyone needs a
different cure.
And the solution, researchers
increasingly believe,
is the machine learning system
that
takes in the patient's genome,
the cancer's mutations,
the patient's medical history,
and then predicts, for that
patient,
what is the drug that is
going to cure their cancer.
But, again, in order to do that,
you need to understand how
cells work at a very fine level.
And there's an enormous amount
of research going on in this
using a lot of data from things
like
microarrays and sequences and
whatnot.
And a lot of very advanced
machine learning to do this.
But we're not there yet.
And, again, we won't be able to
do this
until we solve all of those five
problems.
Because they're all very
much present here, as well.
And, finally, going back
to recommender systems,
recommender systems are part
of everybody's life these days.
You're constantly using
recommender systems
even when you might not realize
it.
You have, for example,
Facebook is choosing
what posts to show you.
Twitter is basically a big
machine learning algorithm
figuring out what tweets
to show to whom and so on.
Amazon, Netflix, et cetera, et
cetera.
But each of these recommender
systems
is not that great yet.
And one big reason for that is
that
it only knows a sliver of you.
So, Netflix knows your tastes in
movies.
But that's it.
And, you know, Spotify
knows your taste in music.
But that's it.
What I would really like
to have, as a consumer,
is a recommender system that
learns from all the data
that I ever generate
and then has a 360 degree
picture of me.
It knows me really, really well.
It knows me better than my best
friend.
And then, based on that,
it can recommend things
for me at every step of life.
Small things and big things.
From what to eat today,
what restaurant to pick,
to where to go for college,
where to move.
A third of all marriages in
America today
start on the Internet.
And the matchmakers are
machine learning algorithms.
We used to have village
matchmakers.
But now the village matchmaker's
a machine learning algorithm.
So, there are children alive
today
who would not have been born
if not for machine learning.
But, if you ask the parents
how did you get paired,
it was the algorithm.
This is the world that we live
in today.
And, you know, the truth is,
for all that the CEOs
of those companies say,
their matching algorithms
are not very good.
Largely because they don't know
you.
You fill in a form,
and they match people based on
that.
That's not gonna work.
If you're gonna entrust
such an important
decision to an algorithm,
it better know, for example,
your taste in music.
Maybe people with similar taste
in music are more compatible.
Where you've traveled to.
Your entire life is potentially
relevant.
But then, in order to do that,
and supposing that you've
pooled all that data
and you've dealt with all the
privacy
and the security problems,
which is a whole nest of
worms in its own right,
then you still need an algorithm
that
can actually combine all of that
data
into an integrated picture of
you.
And all the big tech companies
are trying to do this.
You see them in the virtual
assistants.
Like Siri's trying to do this.
Cortana's trying to do this.
Alexa, Google Now.
But we don't quite have the
learning
that is able to do that yet.
But, if we do manage to
create the master algorithm,
then we will.
And I think we will all have
happier
and more productive lives for
that.
Thank you.
(applause)
- Thank you so much, Pedro.
We thought we'd spend
a little bit of time.
We were gonna put, actually,
a picture of fire up there
for our fireside chat, but we
didn't.
We can put up the picture of the
series.
So, in the spirit of
demystifying.
By the way, thanks.
I thought the talk was great.
I don't know about you guys.
But who cares?
(laughter)
I'm just kidding.
So, when you're dealing
with people out in the world
who really need to know more
than they do
about machine learning and AI,
what do you think is the key.
If you could fix one thing
that people didn't understand,
what would that thing be?
About AI and machine learning
or whatever you want.
- Great question.
And, in fact, I think there
is a very clear candidate
for the single biggest and
most harmful misconception
that people have about AI.
And it's very simple.
People confuse artificial
intelligence
with human intelligence.
We are always projecting
onto AIs human qualities
like free will and
consciousness and emotions
and the will to power
and a whole range of things
that AIs just don't have.
This is natural.
Because human beings, as I
was saying, reason by analogy.
And, when we're faced with a
new phenomenon,
we always try to reduce it
to the ones we already know.
And human or animal intelligence
is the only one that
we know on the planet.
As a result,
we tend to treat, for better and
worse,
AIs as if they're kind of like
humans.
Which leads us into a lot of
errors about,
first of all, what they can and
can't do,
and about what the dangers
and opportunities are.
And, in a way, if there's
only one thing that
I hope people will come
away from a talk like this
or from reading The Master
Algorithm
is a much better picture
of what AI really is.
It's not a human being in
disguise.
There's no homunculus
inside that computer system.
Intelligence is just the
ability to solve problems
and learn to solve problems
using the kind of techniques
that we discovered.
And, in some sense, there's
no black magic there.
Those of us in the field
know there's no black magic.
Even though it looks like
black magic from the outside.
- So, I had joked at the very
outset
aren't they already smarter than
we are?
And so, there certainly are
ways in which the AI systems
are far smarter.
Dimensions along which they're
far smarter than we are.
And dimensions along which
they're essentially stupid.
- Yeah, in fact, there's
a dimension along which
the machines are the farthest
ahead of us.
It's in doing arithmetic.
We don't even give them credit
for that.
But a machine can beat
a human at arithmetic
by a factor of a billion to one.
Or a trillion.
And, of course, this is an
uneven race.
But, remember, computers started
out as a job description.
They were people whose job was
to compute.
I mean, it was considered
something that
required a certain level of
education and intelligence
to just do the math correctly.
And now we don't even pay
attention to that anymore.
And so, when people say,
"Oh, when will machines
be smarter than people?"
Well, in some things,
they're already way smarter.
In some things, they are now
becoming smarter than people.
And, in some things,
they are very, very far
from being as smart as people.
- Yeah, so, we can start to
think about.
I'm gonna ask some questions
about jobs.
We can start to think about
what types of problems
you might expect the
computers to do really well.
I mean, some of the earliest
applications
of machine learning and
AI to business problems
were things like credit scoring.
Where it's very important, of
course,
to estimate the probability of
default.
And people are absolutely
miserable at
estimating the probability of
anything.
But, once you have lots and lots
of data,
it's just arithmetic.
And so, those would be areas
where
it wasn't just that you
could replace people
because the machine
could automate them away.
Decades and decades ago,
the machine would just blow
people away
at doing things like estimating
what's the likelihood
of somebody defaulting.
- Here's something that
people in the field of AI
in the beginning got very wrong.
And, in fact, a lot of
people today still get wrong,
because it's a very
natural mistake to make.
But now we really know better
than that.
And this was the notion,
kind of very intuitive maybe,
that the first jobs to be
automated
will be the blue collar ones.
Because anybody can do those
jobs.
Whereas, you know, doctor
or lawyer or tax accountant,
those take a college degree.
Actually, it's more like the
opposite.
It's much easier to
automate the job of a doctor
or of a financial advisor
than it is to automate the
job of a construction worker.
We are very far from having a
robot
that can walk around a
construction site
not tripping over itself.
Let alone actually doing
something useful.
The reason this is
counterintuitive is that
evolution spent 500
million years evolving us
to be competent in the physical
world.
And we are really good at it.
And we take it completely for
granted.
Whereas the things you
have to go to college for,
those are the things that
evolution did not spend
any time evolving you for.
And, therefore, we're
really beginners at that.
And so, computers can blow past
us in that
very, very quickly.
So, the real division is not so
much
blue collar versus white collar
as whether it is a routine job
or a job that requires
a lot of flexibility.
If your job is something
fairly narrowly defined
and where there's a lot of data
for that,
machines can probably learn to
do it well.
As a rule of thumb, the more
of your brain your job uses,
the safer it is.
Because, if it requires a lot
of common sense knowledge,
a lot of integration of
information from different
sources,
a lot of combining abstract
thinking
with manipulating things
in the real world,
this is what is well beyond
the capability of computers.
And will probably continue to
be for the foreseeable future.
- And, in blue collar work,
it was really the work
that had been processized.
I mean, assembly line work and
so on
where somebody had laid it
out and made it systematic.
And so, you didn't have
to move around a lot.
And that's where the robots sort
of.
- Yeah, the more.
- Displaced the human.
- The more flexible your
environment is,
the harder for robots.
In fact, industrial robotics
is a classic example of this.
Industrial robots work as they
do
because the factory is a
highly controlled environment.
Whereas the home is a highly
uncontrolled environment.
Which is why it's very hard for
home bots.
But maybe an even better example
of that is transportation.
We already have robots that fly
planes.
In fact, human pilots,
forgive me if you're a pilot,
are mostly superfluous
these days.
Because an airplane in the air
is actually a very simple
environment.
Driving a truck on the freeway
is actually much more
complicated.
But a freeway is still not
a very variable environment.
And it's where the frontier is.
These days, we will
have self-driving trucks
that drive maybe on the freeway,
not in the city, coming
online fairly soon.
But driving a car in the city,
like in Manhattan with
everybody honking at you
and pedestrians running in
front of you and whatnot,
that's a whole different degree
of chaos and variability.
And we're not there yet.
- So, let's talk about that.
'Cause that was another
thing I started out with.
- The Turing test of
self-driving cars
is driving in Manhattan.
(laughter)
- Right, so, when will we see
self-driving cars en masse?
Not little tests.
En masse on the streets of
Manhattan?
In our lifetime?
- They're already everywhere.
It's just that the drivers are
robots disguised as humans.
(laughter)
Well, it's April 1st, so.
(laughter)
Well, this is a very good
question.
So, let me give you three
answers to this.
The Waymo answer.
Waymo is the company that
is the leader in this.
They are very confident.
They say any day now.
Surprisingly confident, in my
opinion.
But, hey.
At the other end of the
spectrum,
people say not in our lifetimes.
Now, the reason they
say not in our lifetimes
is everything that we've
been talking about.
The reason Waymo is confident,
and they're not pulling
it out of thin air,
is the successes they've already
achieved
in a lot of these smaller tests
in retirement communities
and so on and so forth.
Exactly how long it's gonna
take.
There's a lot of wild cards
there.
You know, my personal
opinion is the following.
Driving a car in the
city is what is called
an AI complete problem.
Meaning, to solve that problem,
you need to solve every aspect
of AI.
It's not just the vision and the
control.
It's the tactical driving.
The thing that really
defeats self-driving cars
is interacting with humans.
It's like humans are
unpredictable.
And they will play games
of chicken with you.
And people riding in
self-driving cars,
at first, they're really
doubtful and scared.
But, 10 minutes later, they're
just bored.
Because a self-driving
car is like a grandma.
It drives very slowly, obeys all
the laws.
Nobody obeys all the traffic
laws.
But self-driving cars have to.
So, this is one aspect,
and an argument that it's
gonna take a long time.
But an argument that it's gonna
happen sooner than you think
is that cities are not
gonna stay unchanged
while self-driving cars evolve
to handle them as they are now.
Cities will evolve to meet
self-driving cars in the middle.
Just like they evolved to
meet horse-drawn carriages
and trains and everything else.
We're gonna set up the
environment
in ways that make it easier
for the self-driving cars.
There's gonna be guides on the
streets.
Maybe cops will be talking
to the cars over RFID,
not by voice.
But there's gonna be
all these ways in which
we make things more adapted
to self-driving cars.
So, it's an interesting
question.
I think it's not gonna
happen very, very soon.
But I think it will probably
happen in our lifetimes.
- With what?
I'm sorry.
- [Man] The rationalization of
the streets here in New York.
It's already easier to navigate.
- Easier to navigate.
So, those are one of the things.
I mean, let's go back to
the other humans issue.
I think that, so.
- By the way, sorry, let
me interject something.
The whole problem is those pesky
humans.
- Yeah, it's the pesky humans.
- Back in the '90s, there was
this test
that was done in California
where they took a section
of the 5 Freeway.
That is the freeway that
runs all of California.
And they gave it over
to self-driving cars.
And those self-driving cars
actually have almost no AI.
They followed magnetic
guides that were installed.
And they kept a constant
distance
from the car in front of
them.
And the test worked.
It all went fine.
Now, the problem is, once
you mix those with people,
all hell breaks loose.
- Yeah, so, I mean, when
the horseless carriage
replaced the horse carriage
and you were a pedestrian,
run for your life.
Nobody knew what they were
doing, and they'd run you over.
The self-driving cars are gonna
have to be
much safer than a human behind
the wheel.
And forget about the
fact that they could be.
They're gonna have to be.
Because, as soon as they plow
over some idiot pedestrian,
then everybody starts rethinking
whether they should be
on the streets or not.
But, of course, people are
people.
And, as you mentioned,
why don't we all out there just
walk out
in front of the cars?
This isn't Seattle.
That's where Pedro's been
living.
Where people stand on the corner
and wait for the light
to change to say walk.
- They'll actually wait
for you to cross the street
before you even get to
the pedestrian crossing.
Which makes no sense.
But, for self-driving cars, this
is good.
And there's also the guy,
like the alien in The
Hitchhiker's Guide to the
Galaxy,
who lands and walks into
the middle of the road
and waves to a car saying,
"Take me to your leader."
'Cause they think the
intelligent creatures
on Earth are cars.
And then, of course, the
poor alien gets run over.
- So, right, so, the reasons
that we don't
are some combination of
law-abidingness,
the fact that we really don't
wanna
get run over by a car,
and that we have some social
compact with the driver
that we're not gonna make
the driver slam on his brakes
and not to hit us.
But, if we know that that
car is not gonna run over us
because it can't
or else its company is
gonna go out of business
and they haven't gone
and basically made it
a capital crime to jaywalk
and the person sitting there
is reading the newspaper
in the backseat and it's
raining.
- The car, on the one hand,
doesn't want to run you over.
'Cause that would be bad.
On the other hand, for
them to be taken up,
it has to think about the
passenger.
And there have been all
these studies done by now.
Should the car swerve and kill
the driver
in order to save a poor
child standing on the street?
And people say yes.
But then they say, "What if it's
my car?"
Then they're kind of like no.
(laughter)
So, what's gonna happen?
This is my forecast is
there will have to be laws
that say the car has to
stop to save the people.
But then there'll be hacks.
You'll buy them in the
aftermarket.
You hot rod your car to not
stop.
'Cause people will still be
people.
With or without self-driving
cars.
- I sort of get that.
And so, it's another plug
for our session in two weeks.
There are all of these ethical
puzzles
that we've learned in
college in our ethics class.
And, back then, they were just
puzzles.
Do you push the fat person off
the bridge
to stop the trolley from
running some people over?
- Trolleyology, it's called now.
There's the whole field of
debate.
And it's called trolleyology.
- But guess what?
Once you're programming
this stuff into the car,
it's in there.
Whether you know it or
not is another question.
Like whether the machine learned
it.
But it's programmed in there.
- Here's something I think is
gonna be
very interesting about
this process.
Computers, at the end of
the day, are computers.
In the process of making
computers be ethical,
we're actually gonna have to
figure out
what we think about our ethics.
We could start with, oh, we're
gonna give them
the three laws of robotics.
Popular notion.
But people forget that every
one of Isaac Asimov's stories
in that series was about how
they failed.
Those laws are too general, too
generic.
So, encoding the rules of
ethics is not gonna work.
You could say,
"Well, let's take the
machine learning approach
and have them learn to be
ethical by observing us."
But then they're gonna get very
confused.
(laughter)
Because we don't follow
our own rules of ethics.
Very far from that.
So, that's gonna force us to
confront
what are ethics really?
So, there's a lot that we've
been getting away with so far
that we won't when we have to
program the machines to be
ethical.
- So, I think I agree with you.
What we'll probably see,
not so distant future,
is going to be robotic
semis on the freeway.
Because the freeway is already
off limits to pedestrians.
It already has much more
straightforward
kind of rules of the road.
And it's much easier to change
those rules of the road.
And so, you heard it here.
Buy up property around the
freeway exits near big cities.
Buy it now.
Because guess what?
That's where all the robot
trucks
are gonna have to stop.
And then everything's gonna have
to get
transferred over to the local
trucks.
Which are gonna need
people anyway right now
to unload and load the trucks
wherever they're going.
- I mean, I don't know.
This is an interesting idea.
I don't think they will get
transferred.
But I think the mode will be:
the truck is driven by a human
from its starting point until
the start of the freeway.
Which is only 30 minutes.
And then, the next many
hours on the freeway,
there's a robot driving the
truck.
And then somebody hops on at the
end.
This, I think, is a very
good example.
I think a lot of the
story in the coming years
is gonna be how the division of
labor
between humans and AIs happens.
And a lot of it is gonna be very
subtle.
And it's gonna be worked out
industry by industry, job by
job.
In fact, I think everybody
should be working out
what parts of their own job to automate.
The way to keep your job safe
is to automate it yourself.
And then you can move on to
doing the interesting parts.
And, naturally, do the whole
job better, faster, cheaper,
because you're now having the
AI doing the work for you.
- So, let's not think
about physical robotics.
Let's think about cognitive AI.
What jobs are ready to be
automated away?
Actual jobs.
A lot of my colleagues around
here
talk a lot about the future of
work
and how AI's gonna automate away
jobs.
And I'm having trouble seeing
what exactly are these jobs.
- This is certainly a big meme
these days,
that AI is gonna be a jobs
apocalypse.
And the thing to think about is
we've had that worry every
decade for the last 200 years.
200 years ago, 98% of
Americans were farmers.
And then the McCormick
reaper did with one person
what took 50.
Yet 98% of us are not unemployed
now.
Any change in jobs that AI's
gonna cause
will be insignificant compared
to that.
What happens is that, today,
there's people doing all kinds
of jobs
that you couldn't even imagine
back then.
Like web app developer.
Who could even explain
that to someone circa 1850?
So, I think, number one, there
will be a lot of new jobs
that we don't even imagine.
But, number two, we can learn
from our past experience
that people thought that
ATMs were gonna cause
a huge surge of unemployment
in the banking sector.
Actually, the end result of that
was that
there were more bank tellers
than there were before.
The difference was that people
no longer
went to the bank teller to get
cash.
For that, they have the ATM.
The bank teller could do things
now
like give people advice,
upsell them on other products.
And so, there's actually
even more jobs for that than
ever before.
And tax accountants became tax
advisors.
Now the software actually
does the accounting.
And I think this is gonna be the
case
almost across the board.
That's not to say that
unemployment isn't going to
change.
But this notion that AI will lead
to mass destruction of jobs
goes against basic economics.
I'm puzzled as to why so many people,
even Nobel laureate economists,
are so worried about this.
Because what happens when
you make something cheaper?
You make the complements more
valuable.
And what AI does, at the end of
the day,
is make intelligence cheap.
Machine learning makes
prediction cheap.
There are actually economists
who have studied this.
And so, the complements to that
are jobs that people can do that
maybe were too expensive to do before
but now become feasible.
For example, recommender
systems.
If Amazon needed a human
recommender
figuring out what to
recommend to each one of us,
this just would not be possible.
But, because it is possible,
now there's all the other
Amazon jobs that exist
on top of that one that's
automated.
And I think the main story
of the next 10 or 20 years
is going to be this.
Is that, yes, there will
be a lot of automation.
But, no, that will not
cause mass unemployment.
- So, I'm gonna ask one more
question then maybe open it up
and see if anyone in the
audience has questions.
And that is, on the
other side of the coin,
are there things that
maybe people don't expect
are closer to happening because
of AI?
- Yeah, I'll give you a couple.
And I hope to not break your
heart.
But one of the things that I
keep hearing people say is,
"Yes, we can automate
intelligence.
But there are two things
that cannot be automated
and people will always be better
at.
Or will be better for a long
time.
Emotional, social relationships,
social intelligence,
and creativity."
This is a very common notion.
I would say misconception.
But let's just call it notion.
Now, unfortunately, or
fortunately.
I don't know.
The news is those are actually
easier.
Actually emotions.
We value them very highly.
Because that's what we're all
about.
And cognition exists
in service of emotion.
But emotion is much
simpler than cognition.
It's much easier to have a robot
fake
convincing emotional behavior
than to have it do something
intelligent.
Much easier.
So much easier that, in fact,
it's already the case today.
You know Tamagotchi?
Those little pets?
All they do is
fake that they're hungry
or that they're sad.
And then you're hooked.
No intelligence is required.
I'll give you an even better
example.
A contrast between two bots.
Cortana and Xiaoice.
Do people here know what
Cortana and Xiaoice are?
They are both bots from
Microsoft.
One of them is a huge success.
Another one is mostly a failure.
Which is which?
Cortana is a virtual assistant
that
is supposed to be competent
at answering your questions,
helping you schedule things.
Every time it messes up,
you get annoyed with it.
So, Cortana requires
intelligence
and the AI system we've
been talking about.
Not very successful so far.
Xiaoice is this amazing bot.
It's in China today.
It started out as an interface
to Bing.
It's a text-based interface.
But then they gave it this
persona of a teenage girl.
And then they started
maximizing engagement,
which people are very
good at doing these days
with machine learning.
And so, what happens with
Xiaoice is that
sometimes she kind of
commiserates with you.
Sometimes she gives you a hard
time.
And people get completely hooked
on her.
So, Xiaoice has tens of millions
of users in China today.
A quarter of them have told
her that they love her.
(laughter)
This is today's AI.
It's got all the emotion
you might want to ask for.
Another one is creativity.
People think creativity is this
magical thing that happens.
Foster and I have both
been involved in music
and people think composing a
song
is some piece of deep magic.
Actually, it's not.
And, in fact, there is, today,
already very good music
composed by computer.
Very good paintings made by
computer.
Things in various fields
that anybody would call creative.
Some of the moves of
AlphaGo, or even Deep Blue,
if they had been made by humans,
would have been considered signs
of extraordinary creativity.
Let me just give you one
example.
David Cope is a professor of music,
now emeritus, at UC Santa Cruz,
and a composer.
And this was, I don't
know, 20 years ago or more.
Not even recent.
He got interested in how
well computers compose music.
So, he has a series of programs
that will compose music in the
style
of the composer of your choice.
So, you say, "I want Mozart."
And it will create a Mozart
piece for you
right there on the spot.
I actually saw him give a talk
at AAAI.
I don't know if you were there.
Where he gave a test.
"Play three pieces.
And I want you to tell me which
one is by the real Amadeus,
which one is by a human
composer imitating Mozart,
and which one is by my program."
And then people voted.
Now, the winner was still
Mozart.
That's good.
But, by a large margin, his
program beat the human composer.
And, honestly, if you
listened to that music,
whether it's in the style of
Mozart or Bach or Beethoven,
it's perfectly incredible music.
And, of course, these days
in the music industry,
a lot of the composition,
arrangement, tweaking things
is already done by machines.
And there are these companies
that will, for film,
run the screenplay
through their NLP system.
Their language understanding
system.
It says, "Oh, I think, if
you change this character
from male to female,
the movie would gross
another $200 million.
And, if it happened in this city
and it had that plot twist."
And then they use it.
Of course, the creative types in
Hollywood
hate the guts of these systems.
Not just because they endanger
their jobs,
but because they're the opposite
of what
human creativity is all about.
But, on the other hand,
if you're a musician
and you're not inspired that
day,
maybe having such a system
to help you
would be a great thing.
So, there you go.
- So, with the music
composition,
I remember this.
In AI systems, a lot of times,
you wanna get experts to also
evaluate
how well you're doing.
And so, they had, I think it was
Bach,
and they were basically seeing
whether they could tell the
difference
between Bach and the program.
And they actually couldn't do
this.
Because the experts had
heard everything by Bach.
(laughter)
So, basically, if I
haven't heard it before,
then it must not be Bach.
- What they should have done,
in that spirit, is say, "We've just
uncovered
this beautiful composition by
Bach."
- Alright, so, maybe we'll see
if anybody has any questions.
And I've been told that you
need to use the microphone,
because things are being
recorded.
- [Woman] Hi.
- [Man] Check, check.
- [Woman] Hi, thank you.
I just have two questions.
The first is you mentioned
a third of the matches
are made on the Internet.
I would ask then what
percentage of Internet matches
versus old-fashioned ones
end up in divorce?
Can you do a study of that?
And the second question is.
- So, let me answer your
first question first.
So, it's not just a third of
matches.
A third of marriages
start on the Internet.
And the paper that first said
this
came out over 10 years ago now.
And, at first, people
thought it was a fluke.
But, number one, so there's
been a lot of followup.
As you can imagine.
It's been confirmed that this is
the case.
And, number two, it
leads to fewer divorces.
- [Woman] From the Internet?
- I mean, again, I think
this is all still evolving.
But the evidence so far is that,
actually, finding someone on the
Internet
is better than randomly
finding them in a bar
or through friends or whatnot.
Which maybe is not too
surprising, because.
Again, this, I think, is a good
example
of what AI does for you.
Of course, the algorithms
that are doing this
are still not very intelligent
at all.
On the other hand,
the universe of people
you can be matched with
is orders of magnitude
greater than it was before.
And, when you look at the
trade-off
between these two things,
the greater universe tends to
win.
Also, and this is important,
people are very patient.
They're willing to put in the effort.
I mean, there was this
beautiful cover story
in Wired a year or two ago
of this guy who's a hacker.
And he actually went and
downloaded
the entire contents of OKCupid.
He wasn't supposed to.
But figured out how to do that.
He's a hacker.
And then he went and
applied machine learning.
Clustered the women left and
right.
Did all of this.
And then he went on 100 dates.
Failure, failure, failure,
failure, failure, failure.
The 100th date, give or take,
he found the love of his
life and married her.
So, the story has a happy
ending.
One of the morals of the story
is that people are willing to put
a lot of effort into this.
And, second of all, what he was
doing
is what everyone will be doing
without having to do the
machine learning themselves
in the near future.
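The clustering step he describes can be sketched in miniature. Everything below is invented for illustration (a toy k-means, made-up profile features), not the hacker's actual code:

```python
import numpy as np

def init_centroids(points, k):
    """Deterministic farthest-point initialization, to keep the toy reproducible."""
    centroids = [points[0]]
    for _ in range(k - 1):
        # Each point's distance to its nearest existing centroid.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centroids], axis=0)
        centroids.append(points[int(d.argmax())])
    return np.array(centroids)

def kmeans(points, k, iters=20):
    """Tiny k-means: group feature vectors into k clusters."""
    centroids = init_centroids(points, k)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return labels

# Hypothetical profile features, e.g. [shared-interest score, answer-agreement score].
profiles = np.array([
    [0.90, 0.10], [0.85, 0.15],   # group A
    [0.10, 0.10], [0.15, 0.05],   # group B
    [0.90, 0.90], [0.80, 0.95],   # group C
])
labels = kmeans(profiles, k=3)
```

Grouping similar profiles like this is all the "clustered the women left and right" step amounts to; you can then sample dates from each group instead of scrolling blindly.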
So, the larger universe, that's
a given.
But the algorithms are getting
better.
And, again, the percentage
of people who meet online
is climbing higher and higher.
And, hopefully, the
results will be better;
these are people who are
less likely to get divorced, for
example.
- [Woman] But, for those of
us who don't hack OKCupid?
- I mean, OKCupid.
He used that because
it was the easiest one
to get the data from.
They were kind of more
of an independent company.
This is actually funny.
The guy who was one of
the co-founders of OKCupid
was actually part of this
band called Bishop Allen.
I don't know if you've
ever heard of Bishop Allen.
But then he wrote a
book about what he found
called Dataclysm.
And it's a great read,
because it's all about how
people say they like one thing
but then they like another.
And so on.
- [Woman] And the second
question.
Is Alexa of Amazon considered
AI?
- Good question.
So, Alexa is continually
evolving.
So, the thing that drove
Alexa in the beginning
was good speech recognition,
which is AI
and very much an application
of machine learning.
And the fact that they could do
what is called far-field
recognition,
which is pick out somebody's
voice,
even when they're not right
on top of the microphone.
So, there was a lot of
machine learning involved
at that level.
But, here's the thing,
Alexa, if you've used it,
has no brain.
There is no intelligence there.
Again, a very good example
of what I was talking about.
We see Alexa interact with us
and we start to think there's
some kind of virtual assistant
there.
I'll give you a good example
that my son discovered.
Kids are often the people
who really get into Alexa.
One of the fun things
that Alexa can do for you
is ask you riddles.
She asks you a riddle.
You give the answer.
She says no.
And then, finally, hopefully,
you get the answer.
Now, what happens if you ask
Alexa
one of her own riddles?
She doesn't understand the
question.
(laughter)
It's not just that Alexa
doesn't know the answer.
She doesn't even know what you
asked.
And the reason is that Alexa is
actually
several thousand so called
skills.
Each of which is
independent of the others.
So, Alexa has a skill where
it asks canned riddles
and then has some
semi-intelligent
speech understanding to see
whether you answered it
correctly.
But that's it.
It relates in no way to anything
else that Alexa's doing.
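That skill-per-task structure can be sketched schematically. The skills, phrases, and routing below are invented for illustration, not Amazon's actual Alexa architecture:

```python
# Toy sketch of a skill-based assistant: each "skill" is an independent
# handler, so knowledge inside one skill is invisible to all the others.

def riddle_skill(utterance):
    # Canned riddle plus canned answer-checking; nothing more.
    if "tell me a riddle" in utterance:
        return "What has keys but can't open locks?"
    if "piano" in utterance:
        return "Correct!"
    return None

def weather_skill(utterance):
    if "weather" in utterance:
        return "Sunny and 72."  # canned response for the sketch
    return None

SKILLS = [riddle_skill, weather_skill]

def assistant(utterance):
    # Route the utterance to the first skill that claims it.
    for skill in SKILLS:
        reply = skill(utterance)
        if reply is not None:
            return reply
    return "Sorry, I don't understand the question."
```

Asking the assistant its own riddle falls through every handler, which is exactly the failure described: the riddle skill can check canned answers, but it shares no knowledge with anything else.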
So, right now, Alexa doesn't
have much of a brain.
But the folks at Amazon
understand that
this is an important thing to
have.
And there's very much an all out
race
between them and Google
and Apple and Microsoft.
So, Alexa, so far, is
not very intelligent.
But these assistants are
going to rapidly get more
intelligent.
There are a couple of great test
cases
of how far and fast AI is
progressing.
One of them I already talked
about is self-driving cars.
But another great one
is virtual assistants.
In terms of the economic value
and how much money these companies
are willing to plow in,
it requires really
fundamental AI research.
It's not something you can just
engineer.
But the value of that is
enormous.
Because, if you create the first
truly smart virtual assistant,
you will be the linchpin
of the world economy.
'Cause everybody will use it.
I think there will come a point
at which we won't know how we
ever lived
without virtual assistants.
If you already feel that way
about your smartphone today,
this is gonna be a different
order of magnitude.
And, conversely, everybody who
wants to sell you something
is gonna have to sell to
your virtual assistant first.
'Cause your virtual assistant
is the gatekeeper to you.
You know how drug companies
promote drugs to doctors
because they're the ones
who make the decisions?
Everybody's gonna have to go
through your virtual assistant.
Like Siri.
You can pick up
your iPhone, for example,
and say, "Siri, what
is the weather today?"
- The question is what
a virtual assistant will be.
I mean, take me as an example.
I get so much email
that, if I read my email,
I would do nothing else all day
and I still wouldn't read all my
email.
So, I have a great way to deal
with that.
I just ignore it.
And, when I finally realized
this, oh my god.
My life changed completely.
I was finally free of this
thing.
When I think back on it,
it's impossible for me to
actually do this thing.
So, why was I ever
actually worrying about it?
Basically, anybody can send me
stuff
with no cost whatsoever to them.
And, somehow or other, I should
be obliged
to then respond to it?
However, this leads to problems.
Like, for instance, if your
funding agency sends you an email
and, somehow or other,
you just ignored it.
And so, I need this assistant to
be able
to actually solve that task.
I'd do it my way.
But it would be better
if I actually had a virtual
assistant.
- So, virtual assistants,
to answer your question,
will start out doing really
simple things.
And then become more and more
competent.
Another example that
you may have encountered
is the Gmail autocomplete
function.
Where you start to type an email
and then it finishes it for you.
And you realize 90% of your
emails
really don't require
the use of your brain.
So, instead of the email
just being ignored,
a reply gets sent.
Oh, hey, let's blah blah blah.
And it sounds very affectionate.
By the way, when the guys
did this project at Google,
in the beginning, things
were not working very well,
because there was a particular
phrase
that it just put in every email.
You know what it was?
I love you.
(laughter)
I love you.
I love you.
I love you.
"I love you" was always a good solution.
And then they had to cut back on
that.
So, there you go.
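The "I love you" failure has a simple toy analogue: if a reply model just maximizes training-set frequency, one generic phrase wins every time. The data and function here are invented for illustration, not Google's actual system:

```python
from collections import Counter

# Invented training data: generic affectionate sign-offs dominate,
# so the single most frequent reply is "i love you".
training_replies = [
    "i love you", "i love you", "i love you",
    "see you monday", "sounds good", "thanks so much",
]

def suggest_reply(prompt, replies=training_replies):
    # Degenerate "model": ignores the prompt entirely and returns
    # the most common reply seen in training.
    return Counter(replies).most_common(1)[0][0]
```

Whatever you were asked, the highest-probability reply is the same content-free phrase; cutting back on that, as they did, means penalizing replies that are probable but say nothing.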
- One more question.
Gentleman right here in the
middle
had been waiting for a while.
So, could we get the mic up
there?
- [Man] Based on your earlier
statement
about artificial intelligence
being a competitive advantage.
And if you've reflected upon
socioeconomic environments,
business, political systems, data
sharing.
And this is just playing devil's
advocate.
Looking at smart cities,
Singapore is one of the most
advanced.
Looking at artificial
intelligence,
where China is gaining an
advantage.
Does it mean that command
economies
are better at artificial
intelligence
and will have a competitive
advantage
compared to free economies?
- This is a great question.
And I think our thinking of this
has evolved dramatically
in the last 10 years.
You know, mea culpa, right?
10 years ago, if you asked me,
I'd have said AI, machine learning,
these are the greatest
tools ever for democracy.
The representatives can
become more responsive to you.
They can ask your model.
They can ask which policies do
you prefer?
What do you want to do in the
city today?
In fact, there's a lot
of great experiments
like this happening today.
So, I think AI has enormous
potential
to make democracies better.
Today, the number of bits per
year
that you communicate with your
representatives is like 100.
This is ridiculous.
This is 19th century.
All of this should change.
However, the progress of
making democracy better with AI
has been remarkably slow.
Partly because there are
a lot of entrenched interests
against it.
So, for example, I was talking
recently
with someone in the European
Parliament
who's one of the leaders
in the whole tech area.
And he was saying, "We should
have an agent like this."
Like Europa.
To ask questions about
governance.
And he said, "Yeah, this
would be really great
for people in the European
Commission
and members of the European
Parliament,
because they would be in better
contact
with their constituents."
And he said, "But they
don't necessarily want to be
in better contact with their
constituents.
They prefer to be able to ignore
them."
So, there are challenges for
democracy.
On the other hand, on
the side of autocracy,
the thing that I think is now
very clear
that we were kind of
neglecting 10 or 20 years ago
is that AI is possibly the
greatest tool
ever invented for the autocrat.
AI is the ultimate surveillance
system.
The ultimate bureaucrat
that has no conscience.
Keeps tabs on everybody.
Doesn't get bored.
If you were an autocrat and you
want to oppress your people,
AI's for you.
No wonder Vladimir Putin is big
on it.
And Xi Jinping, too.
By the way, the way this
was discovered is interesting.
He gives a New Year's address
every year.
And the belief is that they
deliberately put some books
in the background on his
shelf to send a message
that this is what the leadership
thinks is important right now.
And, in China, this really
matters.
And they are going all out with
AI.
Largely for benign reasons.
And we should all be in favor of
those.
Both for the Chinese people
and for everybody else.
There are more people doing good
things.
But they're also going all out
building a surveillance state
based on AI.
With their social rating system,
which is very embryonic
but is really scary.
With their whole network
of surveillance cameras.
AI can be watching everybody all
the time.
And, in fact, to just complete
that thought,
I saw this very good interview.
I forget where it was.
But the journalist interviewed
one of the people who
were running the system.
And the journalist starts with,
"But this stuff of.
Can you recognize people's
faces as they cross the street?
This doesn't really work yet,
does it?"
And the Chinese official very
calmly said,
"No, but that doesn't matter.
All that matters is that
people believe it works
and it does its job."
It's like God.
God is watching everybody.
So, you better behave.
So, to answer your question,
I think we are going to find
out who AI works better for.
'Cause it works for everybody.
It helps everybody.
AI is a tool of immense power.
Every technology gives you
power.
With AI, it's just the sheer
scale of the power it gives you.
Democrats and autocrats will
both
try to make use of it.
We should not prejudge how this
will end.
What I think and hope, but we
have to make that happen,
is that, in the end, democracy
will win.
But it won't be democracy
like we have today.
It will be democracy transformed
to work better with AI.
And, in the meantime,
there is a race going on.
And we should face that fact.
Exactly, yeah.
I don't know if you guys saw
this recent book by Kai-Fu Lee.
He says, here are the key
advantages China has in AI.
One of them is they have all the
data.
They're not worrying about
privacy.
They can experiment
with whoever they want.
And, in fact, I often
worry that we, in the West,
are in the process of
hamstringing our tech companies
because we're worried about
them becoming too powerful.
Which is a genuine worry.
But then they're gonna go up
against the Chinese tech
companies
and those have no such problems.
And they don't have the option
of not collaborating with the
government.
Here, the workers at Google say,
"No, we don't want you to take
part
in defense related projects.
We refuse to collaborate."
So, yeah, autocrats.
If you look at history,
things always start out
looking better for the
autocrats.
If you go back to 100 years ago,
things were looking very
good for the autocrats.
Not very good for the
democracies.
Now what will happen?
You know, one dark thought is
that
maybe autocracy will prevail.
Not because it's better,
but because China's bigger.
Some people say that democracy
prevailed in the 20th century
not because it was a better
system.
But because America, who was
the most powerful country
with the biggest industry
happened to be democratic.
I don't buy that.
I think America was powerful
largely because of democracy
and private ownership and all
that stuff.
But we'll see.
- I'm gonna call an end to it.
Since we're over time already.
Thank you so much for coming.
(applause)
Thank you, everyone, for being
here.
A small token of our
appreciation.
- Thank you.
- Thank you, everyone.
See you in a couple weeks.
(applause)
