JEFFREY BROCK: Hello.
Welcome everyone.
I'm Jeff Brock, the director
of the Data Science Initiative.
It's a real pleasure to
see you all here tonight.
And it's also a pleasure
to see so many of you
at the campus launch earlier
today and the roundtables.
I'd like to thank all
of you who participated.
I think we took a
great step forward
in accomplishing our goal
of building a broad campus
network of collaboration.
We had roundtables
led by a broad group
from across the campus, the
Humanity-Centered Robotics
Initiative, the Brown Institute
for Brain Science, the physics
department, the
Joukowsky Institute,
a whole range of leaders.
And they emphasized the essential
interdisciplinary nature
of our effort here
in data science
and indeed, how the rise of data
science is transforming the way
we think about intellectual
inquiry across the academy.
And indeed, the Data Science
Initiative here grew out of--
I like the term, sort of a
transdisciplinary convergence.
We realized that computation,
mathematics, and statistics
needed to be woven
together to address
the fundamental challenges of
data science: data curation
and collection, causal
inference, pattern
recognition, artificial
intelligence, just to name
a few.
Indeed, our core departments
of applied mathematics,
biostatistics, and
computer science
were already deeply
engaged in the data project,
as it were, and the
affiliated group, the Center
for Computational Molecular
Biology, which grew out
of those three departments,
recently became a core partner,
a real triumph of this effort.
If anything, the
pure math department
was the Rip Van Winkle of this
team, waking up to the fact
that suddenly in
Silicon Valley people
were talking about
things like geometry
and topology and
harmonic analysis.
And we thought, well, we
better get to the table.
But deep data exploration
was not new to the campus.
It's been going on
all over the place,
whether it's CLPS,
IBIS, physics, BCBI,
all those acronyms
that you hear,
we just needed an
organizing principle.
And so while our planning
chose to initially emphasize
the core foundations
represented so strongly here,
we knew our effort
needed to expand
outside of these boundaries
and engage the entire campus.
We met with many different
groups in the planning phase
and then spent some
time setting up shop.
With the one-year master's
program just launched,
designed by our top faculty and
being taught by them, a new NSF
TRIPODS institute grant
just funded, and our home
on the ninth floor of
the science library
buzzing with student
activity, I think
we're at last
ready to go; hence
the launch event you're
all here attending.
So the rise of data,
its prevalence,
its analysis, its power to
influence science and society,
is all too evident
in our modern lives.
Our team at Brown felt we
should be playing the long game,
focusing on quality rather than
quick wins in the data war.
We thought Brown should leverage
these quantitative strengths
to build a foundationally rich
research enterprise in data
science along with new curricula
focused on methodologies,
domain applications,
and societal impacts,
which should be our signature.
But another important aspect
of data science at Brown
is the liberal arts
ethos and its emphasis
on helping students
reach their goal
to do good in the world
rather than merely well.
I organized a faculty
forum last May
with my colleagues, Ugur
Cetintemel, Joe Hogan,
and Sohini Ramachandran
entitled "Data
Science as a Liberal Art."
And it's easy to throw
that term around,
but I use the term carefully.
The liberal arts
involve education
that builds our capacity to
understand the world around us.
Indeed, mathematics and
basic science as well as
the social sciences and
the humanities and the arts
form the core of
the liberal arts.
We believe that Brown
should play a defining
role in providing this lens of
liberal arts and data science
and the fact that there has been
such enthusiasm for engaging
with these data challenges
across the campus
gives us all the more evidence
that Brown is the right place
to do this.
And indeed, here
we sit in the nerve
center of the arts at Brown,
where next semester we'll
be celebrating a joint
hire between the Brown Arts
Initiative and the Data Science
Initiative, a deep scholar
and artist I've
known for many years.
So I also want to thank
the Arts Initiative
for hosting us here, and I
want to give them a chance
to just make a
couple of remarks,
so I'm going to
welcome Chira DelSesto
to the podium for a minute.
CHIRA DELSESTO: Good
afternoon, everybody.
As Jeff said, I'm
Chira DelSesto.
I represent one of these
other acronyms here,
the BAI, the Brown
Arts Initiative.
We're very thrilled to have
you here in-- what is it,
a transdisciplinary convergence?
Is that what you called it?
So we've got another one again.
Super happy to have you here.
The Brown Arts Initiative is
another new initiative at Brown
we launched just in March.
And since that time we
have been reaching out,
not only to other
arts departments,
but to groups like DSI to
bring interdisciplinarity
back here to Granoff.
So as Jeff mentioned,
we are going to partner
on bringing Ali Momeni, who
is the data science and arts
professor of the practice,
who will join us in January.
And we're really looking
forward to having Ali based here
in the Granoff in studio working
with data science and art
students.
I wanted to bring my best wishes
from our director, Butch Rovan,
who really wanted
to be here tonight,
but is in the air at the moment.
But he asked me to convey
his best wishes to you
for a really successful
launch and a wonderful day.
Some people might wonder how
data science and the creative arts
are able to work together.
We at Granoff like to
say that we are not just
a home for creative artists,
but for creative thinkers.
Many people might not think
that data scientists are
creative thinkers, but I think
we all know in this room,
you are among the most
creative thinkers.
So again, really welcome.
So happy to have you.
So happy to be working with you.
And good luck.
JEFFREY BROCK: Thanks, Chira.
Just one or two
more quick remarks.
I would be remiss if
I failed to mention
that as novel as the idea of
the data revolution appears,
Brown's legacy in this area
goes back many generations.
And the late Ulf
Grenander, former professor
of applied mathematics
here at Brown,
was the architect of the
subject of pattern theory.
By understanding complex
patterns and images
from a combinatorial,
statistical, and probabilistic
point of view, he
laid the foundation
for computer vision, medical
image analysis, and robotics.
And if only he could see it now,
with powerful new computational
systems and architectures
and statistically robust
algorithms, new methods
from deep learning
relate speech to text, kinetic
motion to brain activity,
climate data to
new climate models,
to say nothing of the
triumph of AlphaGo.
The tools are now
teaching the experts,
and yet the experts agree that
deep learning, as a theory,
is in its infancy.
Why does it work?
The models of the
previous century,
largely driven by calculus,
were great for building bridges
and ships and getting
us to the moon,
but we're asking new questions.
Can we design models
of intelligence?
How do they function?
What is their reliability?
Will our workforce
become obsolete?
And how can we ensure that
the data used to train them
are free of our own biases?
As Google, Amazon, Facebook,
and Twitter marshal data
at their scale and
formidable computing power,
how can we engage
these resources
in pursuit of open science
and transparent intellectual
inquiry?
So our goal is to create
a space for this dialogue.
I'd like to transition
quickly to the main event,
but first I'd like to
extend my gratitude
for the extraordinary support
of Provost Rick Locke, who
wanted to be here
today but could not,
and President Christina
Paxson and the rest
of the administration.
Their encouragement and
blessing of our interest
in working collaboratively
has provided
a remarkable opportunity
for all of us
to explore beyond the walls of
our normal disciplinary silos.
I'm confident that
this vision will
be one of President Paxson's
most enduring gifts to Brown,
and I'm thrilled to have
her here with us tonight.
PRESIDENT CHRISTINA
PAXSON: So thank you, Jeff,
and good afternoon.
Welcome, everybody.
It's great to see you here.
This is a big event.
This is a big deal.
This is the launch of the
Data Science Initiative.
It has the feel
of something that
is potentially
transformative for Brown,
and I'm excited about it.
I hope all of you are too.
Before I say anything else, I
want to recognize just some--
and I know I can't get them
all-- some of the people
at Brown who made this possible.
I recall sitting in my
office around a table
a couple of years ago, and
it wasn't that long ago,
with chairs of four of
Brown's strongest departments,
then chairs: Jeff Brock for
math; Ugur Cetintemel from CS;
Constantine Gatsonis in
biostatistics; and Björn
Sandstede in applied math.
And this core group had
come together with this idea
that they were going
to work together
across departments and in
collaboration with colleagues
from around the university to
develop this new initiative.
And the first step
at Brown is, OK, you
want to do this, put
together a great proposal.
And they went off and very
quickly and very efficiently--
maybe you'd done it
already, I don't know--
came back with a great proposal.
And so we were off.
This core group, again, in
collaboration with many people
from around the university, has
developed and conceptualized
what I think is a
tremendously exciting program.
I do want to thank
Jeff especially
for taking on the
leadership of this program.
His work has resulted
in a very bold vision,
and it's a shared vision and
one that I'm very excited about.
I also want to thank Andrew
Moore, whom I will introduce
shortly, for being here
today to deliver the keynote.
All of us are
looking forward
to hearing your remarks,
so thanks for coming.
So just a few words about
what this initiative
means for Brown.
And I think back to a cabinet
retreat that we had over
the summer where we had leaders
of different initiatives,
academic leaders at
Brown, come and talk
about what was happening
in their worlds,
and I can see many people
here in this room who
spoke to us that day.
And Jeff made this
dazzling presentation
on the DSI, which builds, again,
on these longstanding strengths
at Brown but is integrating them
and taking them to a new level.
And he highlighted just
the enormous potential
of data-enabled
research and learning,
and he described some
of DSI's early progress
and accomplishments so far.
So we learned, for
example, about ongoing work
using algorithms trained from
large data sets of CT images
to develop 3D mappings of
traumatic brain injury.
We learned about
remarkable research
taking place at ICERM, which
is our NSF-funded Institute
for Computational and
Experimental Research
in Mathematics directed
by Brendan Hassett
and founded by our VP
of research, Jill Pipher.
And the data science research
that's being conducted there
is extraordinary, and
the work just last summer
is likely to yield seven new
publications in the field.
We learned about the
new master's program
in data science,
and I can tell you
I've been talking to
a lot of students,
Brown undergraduates,
who are now
eyeing this as a fifth-year
option, so watch out.
They're coming, and
they're excited about it.
And we were reminded of
the successful recruitment
of people like Ali Momeni,
a gifted scholar and artist
who uses machine learning
to explore aesthetics
in sound design and
musical invention.
And one of my great
excitements is just
watching the types of people,
the types of great scholars
and researchers
and teachers who we
can attract to Brown
because of the excitement
of this initiative.
So no question.
This has already hit
the ground running.
It feels actually in some
sense, like a launch.
You've been doing a lot.
But we should have a launch.
It's the right thing to do.
And I think it's
important to note
that strengthening Brown's
capacity in this area,
taking advantage of
the data revolution
to advance knowledge
and discovery
is central to Brown's strategic
plan, Building on Distinction.
It's written about in there
and in very few sentences,
but you can really see the
seeds of it having been embedded
from faculty discussions that
happened four or five years ago
and are now coming to fruition.
The Data Science Initiative
is Brown's definitive response
to a seismic change in how we
confront data in the world.
And this isn't just
happening at Brown,
it's happening everywhere.
And it matters.
It matters a lot
because the advances
it will introduce
in how we gather
and curate and understand
and deploy data are exploding
the research landscape
in higher education,
and they are
transforming industry.
And they're creating
new opportunities
for research and industry
to work together,
which is also very exciting.
The mathematical and
computational tools
that we're unleashing on every
kind of intellectual inquiry,
and the breadth
is extraordinary,
hold the promise of new
methods of research and
order-of-magnitude insights
into pretty much anything,
when you think about it.
At Brown, these
insights are guiding us
in developing a computerized
framework for identifying
and cataloging potential
therapeutic uses of plants,
thereby encouraging
biodiversity.
They're enhancing efforts
at the Large Hadron Collider
to discover new particles
of the early universe
by using algorithms
to classify signatures
in high-rate collisions.
And again, blending
theoretical and experimental
physics, another strength
at Brown, with data science
is really exciting.
Outside of Brown, but with
some Brown involvement,
there's work on divining
the patterns and provenance
of Shakespearean plays via
feature frequency profiling,
which is a text mining method.
So data science and the
humanities, very much something
that is generating
exciting work.
So all this is going on.
What in an overarching way
does DSI mean for Brown?
What do I think it's going
to do for this university?
And I think there
are three things
that I want to highlight.
The first is that it expands the
capacity of Brown scholarship
to have impact in the world.
And impact is something
that we care a lot about.
We're doing work that is going
to catalyze work in brain
sciences that helps us analyze
the massive amounts of data
that are coming out there and
helps us move on to treatments
and cures for conditions
like ALS and Alzheimer's.
If we can do that,
that's impact.
And it's this kind
of powerful outcome
that really propels, I think,
a lot of people in this area
to do their work.
Second, the DSI
affirms and confirms
Brown's collaborative approach
to advancing knowledge,
and Jeff talked about
this a little bit.
We have a number of these hubs
of knowledge, research, and education
across the university,
and this one
will focus on foundational
methods in data science
and their application
in virtually any field.
So the ability of
this initiative
to connect and collaborate
across the university
is really quite outstanding.
And then finally, I do have to
bring it back to our students.
So third big impact, data
fluency is really important.
And the DSI ensures that
data fluency will certainly
be more than just a simple
regimen of tool-based skills
that students learn.
Those are important
skills, but they're
going to depreciate
quickly over time.
But I think what they're doing
is inspiring students to think
about how the
applications of data
can be really a core competence
for innovation, for discovery,
for application in areas
ranging from medical research
to areas like public
policy and civic engagement.
And this is something
that I know excites Brown
students very much.
So my hope is that every
student who comes into Brown
takes one of these great courses
that you're developing, maybe
in connection with another
department at Brown,
and leaves this university
really understanding
the power of data and the need
to understand data science
methods to some degree.
So let me wrap up.
This really is a milestone
day for the university.
I think that this
initiative positions Brown
for leadership in the field.
It opens up opportunities
to connect us
to industry going
forward, and it
is the essence of
what we want out
of the Building on Distinction
integrative themes: marrying
rigor with implementation,
enriching our very distinctive
liberal arts-based approach
to research and education,
and improving the prospects
for Brown scholarship
to have a meaningful
impact on the world,
and that's what
we're here to do.
So I'm very happy
about this effort.
So with that, let me
introduce our keynote speaker.
I had a lovely conversation
with him this afternoon.
He is, in a word,
passionate about the impact
of technology and computer
science and data on the world.
Andrew Moore is a distinguished
computer scientist
with expertise in machine
learning and robotics.
His research has included
improving the ability of robots
and other automated systems
to sense the world around them
and respond appropriately,
a theme that I
hope he will touch on today.
He's dean, so he still does
research, but he's also a dean.
That's pretty cool.
He's dean of Carnegie Mellon's
School of Computer Science.
And in that role, he presides
over this growing suite
of hubs that continue to
mainstream data science
and forge new applications.
So these hubs include
the Robotics Institute,
the Language Technologies
Institute, the Machine Learning
Department, and
the newest, CMU AI,
which provides a platform for
ongoing research and education
in the field of
artificial intelligence.
Now, his presence at Carnegie
Mellon, and in Pittsburgh
more generally, has
accelerated the transformation
of Pittsburgh into a
magnet for tech growth
and entrepreneurship.
And speaking
personally as someone
who grew up in Pittsburgh during
the very dismal, dark days
of a dying steel industry,
seeing what's happening there
now is just amazing.
So it's remarkable to
see what CMU has done.
It's quite an accomplishment.
And Andrew's been
a big part of that.
His experience as founder of
Google's Pittsburgh office
and now a mentor of
tech talent has really
helped to cultivate deep and
productive economic links
between the university and the
greater Pittsburgh community,
which is quite remarkable.
In a New York Times
piece this past July,
Andrew said of the milieu at
CMU's School of Computer Science
and around Pittsburgh's
very dynamic tech
sector that it is, quote,
"like being at Hogwarts."
Now, I don't remember seeing
many computers at Hogwarts.
I did see a lot of
magic, and I hope
that's what you
were referring to.
And interesting and novel
ideas emerge routinely
as entrepreneurs and engineers
and computer scientists
and artists are working together
in unprecedented ways that
benefit the city and the region.
So I couldn't think of
anybody more appropriate
here to help us launch our
data sciences initiative.
So with that, I'd like
you to please join me
in welcoming our keynote
speaker, Professor Andrew
Moore.
Thank you.
ANDREW MOORE: Thank you
very much, President Paxson.
And I am, of course, honored
and delighted to be here.
I think what you're doing
is, indeed, very important.
It's important for
the world as a whole
that you are training an
entire generation of students
to be comfortable and
competent with data.
And also, one of the reasons
I'm so proud to be here
is because when I was
in industry and looking
desperately to recruit people--
and I will talk more about
the talent wars later on--
we definitely had
this experience
that when you hired
someone from Brown,
you didn't just get a
smart, genius-like person,
you got people who
could work with other folks
and take on the really quite
frightening responsibilities
that you have these
days if you're
working in a major internet
company, health care company,
transportation company.
We need our engineers
who are running the world
to be good, well-educated,
general well-balanced folks,
because else, we're
all in trouble.
So I have very great respect for
what this institution is doing
to educate its students.
All right.
So what I wanted
to talk about today
was the landscape of what seems
to be going on in the world
right now around
the use of data.
And it is becoming
very competitive,
and there's a lot at stake, both
between countries, certainly
between companies.
And actually, if we
handle this very badly,
it can lead to a greater
breakup in society
when the rich technologists,
like ourselves,
educate our kids to be the next
generation's rich technologists
and the have-nots,
who are not getting
a decent educational
opportunity, end
up not joining in our world.
So there's many places right now
where there's a lot at stake.
And I'm going to
just try to give
a map of what we at Carnegie
Mellon see happening here.
So that's going to be the talk.
I'm first going
to talk about how
the world of computer science
views the role of data science.
And then we'll use
that as the backdrop
for discussing these important
areas about the talent wars,
the question of what
happens with computation,
the importance of
privacy in data science,
and then, perhaps
most exciting of all,
what this could actually mean
for the natural sciences.
And needless to say, I
am very, very welcoming
of interruptions with
questions or comments.
So it's much more
fun, I think, for me
and for everyone else, if
I say something stupid,
call me out on it.
We'll have a very
entertaining experience.
So please do not be shy.
We don't have to wait till
the end for questions.
All right.
So here's a backdrop with a very
Carnegie Mellon centric view
of the world.
In 1965, Allen Newell
and Herb Simon,
two of the heroes in the
history of Carnegie Mellon,
were very interested in now
that we have compute engines,
does this mean that we can
start to simulate intelligence,
use simulated intelligence
for the human good,
and maybe even
understand intelligence?
And the culture at
CMU at the moment,
with 240 faculty in the
School of Computer Science,
is still being brought together
by this important notion
that Allen and Herb gave to us.
And the succeeding
generations of heroes,
people like Raj
Reddy and Takeo Kanade
and Tom Mitchell
and Jaime Carbonell,
really helped us grow this area.
All right.
So here's where they stood.
They looked at humanity, a
good example of intelligence,
and they said, what's
really going on?
One, we perceive stuff.
Two, we decide what to do
about what we perceive.
Three, we act on it.
So this is what you might
call artificial intelligence
pre-101.
This is this big thing, which
has driven thousands of people
in their careers over
the last half century,
understanding this stuff.
And for the most
part, it's gone well,
but it went through a
very circuitous route.
And the circuitousness of
its route is, in fact,
what makes me so excited
that you guys are launching
this Data Science
Initiative today
because it is the most
important component.
All right.
So let's talk about this.
AI was bubbling along
pretty well, even back
in the mid '90s when
computers really
sucked in terms of
computational power.
Many new ideas
had come along
to help this understanding
of how to simulate and maybe
actually take advantage
of simulated intelligence.
And one of the
key points I think
was in '97, when Kasparov
was defeated in chess.
This made us all think
AI is really here now.
We're in good shape.
So that was good.
It's good for the computers,
I guess, not necessarily
for the humans.
But that was 20 years
ago, and we're not
living in an AI-controlled
world, so what
actually was going on here?
During this time
period, many of us
were really focused on
this decide question of how
we actually search, and
with something like chess,
we can exactly model
what's
going to happen if I try to move
a pawn forward.
So chess is one of the class
of problems closer to puzzles
where we have a precise
model of the world.
Of all the things that the
computer has to worry about,
it doesn't have to worry about
predicting what happens next.
And so of course, we
have wonderful algorithms
and hardware to just say, if I
do this, then this will happen,
but if I do that, then this
other thing will happen,
and so on.
Back in the late '90s, we would
do that about 50 million times
during a decision process
for a game of chess,
but that was really
what the algorithm was.
And many of us in
computer science
worked very hard to disguise
from the rest of you
how easy this stuff is.
We just program our computer
to ask lots of questions
and then choose the best.
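[As a concrete illustration of "ask lots of questions and then choose the best," here is a minimal depth-limited minimax sketch in Python; the helpers legal_moves, apply_move, and evaluate are hypothetical placeholders, not any real chess engine.]

    # Minimal game-tree search for a game with a perfect world model.
    # legal_moves, apply_move, and evaluate are hypothetical callables.
    def minimax(state, depth, maximizing, legal_moves, apply_move, evaluate):
        """Try every move, recurse on the results, and choose the best."""
        moves = legal_moves(state)
        if depth == 0 or not moves:
            return evaluate(state), None
        best_value = float("-inf") if maximizing else float("inf")
        best_move = None
        for move in moves:
            value, _ = minimax(apply_move(state, move), depth - 1,
                               not maximizing, legal_moves, apply_move,
                               evaluate)
            if (maximizing and value > best_value) or \
               (not maximizing and value < best_value):
                best_value, best_move = value, move
        return best_value, best_move

[Real chess programs of that era added alpha-beta pruning and hand-tuned evaluation on top, but the skeleton is this exhaustive lookahead.]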
But to everyone's
disappointment,
this didn't continue to work.
And of course, the killer was
that, unlike chess, in most parts
of the world, when we're
automating a decision,
we don't know that if I do this,
then that will happen,
and if I do this other
thing, then that will happen,
and so forth.
We do not have this model
of the world to use.
And so that was, not
just a small problem,
it was a huge problem.
Initially, the way
that we were trying
to implement this predict the
next step was to hire software
engineers to write things
often called rule-based systems
or expert systems
to allow us to have
a reasonable job of
predicting what happens next.
And in a few cases, like laying
out a microprocessor circuit
or scheduling inventory
for the space shuttle,
you actually did
know pretty much
exactly what was going to happen
and so you could use this.
But so many interesting
things, especially anything
involving being
useful to people,
we were not able
to predict exactly
what was going to happen.
And so that's when lots
of us in the area of AI,
we were almost like the
passengers on a cruise ship
when someone shouts
there's a whale over there.
Everyone rushed to the
question of how we
derive models
which allow our AIs
to predict the
effects of their actions
if we can't hand-code them.
And that was the huge
growth for machine learning.
It was so important, actually,
that pretty much everything
else was kind of stuck
until we solved this.
And so many of us
jumped into this
and spent at least a decade
sometimes up to two decades
really trying to
figure that stuff out.
And fortunately, as
the years went by,
progress was being made here.
And that is what's allowed us
now to start having systems
like a system which decides
which web results to show you
when it's choosing between--
usually it chooses
between 100 million
to a billion different options.
And for all of
them, it predicts.
Given this person
asked this question,
what is the probability
that they'll
find this result useful?
And there are no
handwritten rules there;
it's been learned from data.
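[A hedged sketch of the simplest version of that idea: score each candidate result with a logistic model of the probability that this person finds this result useful. The features and weights below are invented for illustration; production rankers are vastly more elaborate.]

    import math

    def p_useful(weights, features):
        """Logistic model: predicted probability the user finds this useful."""
        z = sum(w * x for w, x in zip(weights, features))
        return 1.0 / (1.0 + math.exp(-z))

    # Hypothetical per-result features: (text match, popularity, freshness).
    candidates = {
        "result_a": [0.9, 0.2, 0.7],
        "result_b": [0.4, 0.8, 0.1],
    }
    weights = [2.1, 0.7, 0.3]  # in practice, fit to logged usage data

    ranked = sorted(candidates,
                    key=lambda r: p_useful(weights, candidates[r]),
                    reverse=True)
    print(ranked)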
And throughout the
transportation world,
the health world, the
financial world, the military,
this trick of
embedding something
which can predict what happens
next, trained on training
data, has allowed us to
do a much more effective
perceive-decide-act
system where the decide is
driven by machine learning.
So when we at CMU have to
design curricula or design a way
to organize ourselves
so that we are covering
every aspect of research
that we need to,
we try to understand
how to break up
the different disciplines that
students need to know about,
and we need to make sure that
we're making moves forward.
And of course, in
the technology world,
we talk about these
things as a stack.
That is, there's a
piece of technology
which helps other
pieces of technology
in the stack above it.
And this piece of technology
is helped, in turn,
by stuff below it.
And the thing which had been
missing for most of us in AI
has been these layers
related to learning from data.
At the very bottom are the
devices which power
all of this, including
the networked sensors, which
allow us to do much more
effective perception now
than we were ever
able to do before.
And then on top of all of
this is act and decide.
And for this real dream of
useful autonomous systems,
which are trying to either
simulate
intelligence or
actually be used to make
important decisions
in the world,
we need all of these components.
And whether you're
working at one part
of the stack or
another, you really
need the other bits in the
stack above and below you.
That means everyone
needs each other
if you're going to build
useful things here.
You can't get away with a bunch
of lone-wolf geniuses trying
to do their own thing.
Like many other
areas of technology,
we are now so complex
that we have to rely
on other folks in the stack.
So what I'm going to do now
is quickly introduce you
to the components of this stack.
Many of you will
recognize your own layer,
but as we talk
about it, the theme
I'm going to try
to really push is
how we have to be helping
the people above and below us
on the stack.
And at the
very top of the stack
are the important use
cases, and your president
mentioned some important
use cases of faculty working
in the DSI; those are
what gives real meaning
to everyone working
further down in the stack.
So let's begin at
the device layer.
The push there is to be able to
sense the state of the world.
The more you can sense it,
the more information there
is for the decide component.
A good example of this
is, with computer vision
and other sensors, a system that now
covers about four square miles
of Pittsburgh.
Our traffic lights are
watching every car individually
instead of just
counting traffic.
And then the way
that we schedule
traffic lights is an optimization
problem where we can actually
plan for the movement of
every car individually.
So at any given time, the search
space for these traffic lights
is now a few thousand cars all
throughout a region of the city
to work out what's
best to figure out
what to do to help all them
based on predictions of where
we think they're likely to go.
This, so far, has,
at this part of town,
reduced drive times by about
20% and emissions by about 23%.
So those are both nice
results out of this.
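[To give a feel for "scheduling as optimization," here is a toy Python sketch that picks green phases to minimize predicted waiting. The intersections, queues, and exhaustive search are invented for illustration; the deployed system plans incrementally over a rolling horizon.]

    from itertools import product

    # Hypothetical predicted queue (cars wanting each phase) at 3 lights.
    predicted_queue = [
        {"north_south": 12, "east_west": 4},
        {"north_south": 3,  "east_west": 9},
        {"north_south": 7,  "east_west": 7},
    ]

    def total_wait(plan):
        """Cars facing a red phase under this plan wait one cycle each."""
        wait = 0
        for green, queues in zip(plan, predicted_queue):
            for phase, cars in queues.items():
                if phase != green:
                    wait += cars
        return wait

    plans = product(["north_south", "east_west"], repeat=len(predicted_queue))
    best = min(plans, key=total_wait)
    print(best, total_wait(best))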
Another example, which is
going to come to a theme
to do with privacy and a
potentially Orwellian future,
but which is so cool is the fact
that, thanks to improvements
in cameras and other
optical technology,
we can do things which
just a decade ago,
we never tried to do.
So the question of
looking at someone's
iris, that sounds like a
reasonably conventional thing.
Some of our faculty are working
on the gimbaling system
of a camera, which is able
to move the camera very, very
quickly to point exactly
at one particular spot,
now 80 feet away,
zoom in, and get an accurate
picture of a person's iris.
This is now under trial
on military vehicles
for being able to just look
at everyone within about
50 feet of a military vehicle
and identify them by iris.
That's exciting,
Orwellian, but exciting,
and it's very
interesting technology.
There, the reason
we can do this,
again, is improvements
in sensor technology.
The frame rate of these
cameras, the ability
of the gimbaling to get
exactly to the thing
that you need to focus on.
There's a picture of
this beautiful thing.
At the moment,
it's rather large,
but shrinking it is one of the
next things we're going to try.
Other examples, of
course, are in 3D mapping,
an area where Brown has
very strong faculty as well.
Here, part of the
trick was primarily
to do with the data
processing layer,
the ability to perform this
operation called particle
filter localization
on small compute
instead of having to send
it to a server in the cloud
to run on a big computer.
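[A minimal one-dimensional particle filter localization sketch in Python, purely illustrative: the robot moves along a corridor and ranges off a wall at a known position. All the numbers are made up.]

    import math
    import random

    WALL = 10.0  # hypothetical wall position in a 1-D corridor

    def likelihood(particle, measured):
        """How well this particle explains the noisy range reading."""
        expected = WALL - particle
        return math.exp(-((expected - measured) ** 2) / (2 * 0.5 ** 2))

    def pf_step(particles, moved, measured):
        # 1. Motion update: shift every particle, plus a little noise.
        particles = [p + moved + random.gauss(0, 0.1) for p in particles]
        # 2. Measurement update: weight by agreement with the sensor.
        weights = [likelihood(p, measured) for p in particles]
        # 3. Resample particles in proportion to their weights.
        return random.choices(particles, weights=weights, k=len(particles))

    particles = [random.uniform(0, 10) for _ in range(500)]
    particles = pf_step(particles, moved=1.0, measured=6.0)
    print(sum(particles) / len(particles))  # position estimate, near 4.0

[The point in the talk is that this whole loop is now cheap enough to run on small onboard compute rather than a cloud server.]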
And a final example
of this is something
which has led to-- whoops.
Let me get this going.
I'm having trouble
finding my mouse.
There we go.
It's a technique known as
inverse reinforcement learning.
So you have one of the
giants of reinforcement
learning on your
faculty, Michael Littman.
Inverse reinforcement learning
is trying to undo all his work.
And what it is, is
actually figuring out,
for all the players in a scene
on a street or anywhere else,
what their likely
intentions are.
So you can figure out what is
the most likely path they're
going to take, and
then there are many use
cases to do with
noticing people who
are doing very
surprising things given
what we know about them so far.
And also taking over control
of vehicles in situations
where there's likely
to be an accident.
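[A drastically simplified, illustrative stand-in for that idea in Python: score candidate goals by how consistently an observed trajectory makes progress toward each. Real inverse reinforcement learning recovers a reward function; every name and number here is hypothetical.]

    import math

    def progress_score(trajectory, goal):
        """Sum of per-step reductions in distance to the candidate goal."""
        def dist(p):
            return math.hypot(p[0] - goal[0], p[1] - goal[1])
        return sum(dist(a) - dist(b)
                   for a, b in zip(trajectory, trajectory[1:]))

    trajectory = [(0, 0), (1, 0), (2, 1), (3, 1)]        # observed positions
    goals = {"crosswalk": (5, 1), "parked_car": (0, 5)}  # candidate intents

    likely = max(goals, key=lambda g: progress_score(trajectory, goals[g]))
    print(likely)  # the intent this path makes the most progress toward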
So that's nice.
We feel very happy at
the perception level,
but perception by itself is
not usually very interesting.
You've got to do
something with it.
And that's where
operating systems, people
who design data centers,
people who design chips,
are incredibly important.
And they are doing essential
data science research
when they're doing this.
Here's an embarrassing story.
At Google, one of my jobs
was to help build the big
machine-learning engines, the
things which actually
had to deal at the
scale of tens of billions
of events every day.
And we were very proud of
the system we architected.
And it was well worth it.
We spent a lot of money on
our machine-learning system,
but it was extremely
good for the users,
and it was very good
for Google's revenue.
And then, having built
this amazing system,
I came back to Carnegie
Mellon, very proud of myself,
and discovered that
someone had built something
with the same
computational capabilities
with one hundredth
of the compute power,
and this is billions of dollars
worth of computer power.
And it was a perfect
example of why--
and it's me who
looks stupid here.
It's a perfect example of why
industry cannot survive just
by pushing on by itself,
throwing money at a problem,
and not looking at the
crazy ways of finding some
alternative way to
solve the problem.
In this case, in a nutshell--
I'm not going to try to
read this big slide--
the key insight is that when
you're doing big machine
learning and you're doing all
the stuff with distributed
data, you don't have to
obey the normal rules
of a sensible computing
system to deal
with putting the correct
locks to make sure
that the right message
has gone to the right node
at the right time.
If you don't bother
to do that and you
start to violate
all the things we
learned not to violate
in operating systems,
then the wonders of the
stochasticity of the training
algorithms that we're working
with allow you to actually show,
by the beauty
of linear algebra,
that as long as you don't
make too many mistakes,
you're still going
to converge, and each
of your atomic operations can
be dozens of times faster.
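[A toy Python sketch of that lock-free style of training, in the spirit of the "Hogwild!" line of asynchronous SGD work (my label, not the speaker's): worker threads update shared weights with no locks, tolerate stale reads and racy writes, and still converge on this least-squares problem. Python's GIL interleaves the threads rather than running them truly in parallel, so this only illustrates the absence of locking.]

    import random
    import threading

    # Shared weights for fitting y = 3x + 1, deliberately lock-free.
    w = [0.0, 0.0]
    data = [((x, 1.0), 3.0 * x + 1.0) for x in range(10)]

    def worker(steps=20000, lr=0.005):
        for _ in range(steps):
            (x0, x1), y = random.choice(data)
            err = (w[0] * x0 + w[1] * x1) - y  # possibly stale read: tolerated
            w[0] -= lr * err * x0              # unsynchronized write: tolerated
            w[1] -= lr * err * x1

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(w)  # ends up close to [3.0, 1.0] despite the races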
So that was a very interesting
and important lesson
that low down in
the stack, you have
to have people always looking
out for ways to improve that.
Having this ability to
train much more cheaply
is not only important
for the small companies that
could never have
afforded to build
a Google-sized infrastructure.
Within the huge
organizations like the NSA,
like Google, like
Facebook, it means
that folks who are trying
to invent new things
can suddenly run a
hundred experiments a week
where previously they
could only run one.
And so the whole rules
of the game change.
I have great respect for the
operating systems part of data
science.
Now, as we move up,
many people in this room
are familiar with
machine learning.
So what I wanted
to remind you here
is that there's still big, wide
areas of our modern-day lives
which are just waiting
for someone to get around
to trying out machine-learning
systems sensibly with them.
This was one of the really novel
ones which took me by surprise
about 18 months ago.
Above this, as we
go up the stack,
we come into many parts
of statistical analysis.
Many people will arrange
this differently.
If you look at
machine-learning algorithms
as the brute-force thing
where we write down a
generative model of some kind,
and then we say, optimize
the heck out of that,
find me the maximum
likelihood set of parameters
for the states, or do a big
particle filter or MCMC
simulation and figure out
something about these things:
that's machine learning.
Then you bring in the
statistical expertise
on top of that to help make
fine-grained decisions.
One example: here, based
on very accurate tracking
of the patches of
skin on a face,
you can work out what the
muscles underneath the face
are doing, and then
you can take that back
to areas in psychology dealing
with facial action units, which
demonstrate certain
kinds of emotions.
For example, we have an
involuntary contempt action
where the sides of our lips
pull back very slightly if we
have a feeling of contempt.
And we can't usually even
detect that consciously,
but you can, with a high-frame
rate camera, get that.
So one of the outputs from
this was the ability now,
when we are looking at
a cohort of patients
who are being treated for
depression, in a Science
article 12 months
ago, the authors
were able to show
that they can detect
whether the treatment
for depression
is having an effect about six
weeks before the physicians
or the patients can, in
terms of indications
from this very fine-grained
analysis of facial action
units.
So that's interesting.
Now, we're going to
quickly go up the stack.
When we come up to the
decision-support layer,
this is the really interesting
place where you're not just
dealing with a room full of
statisticians and computer
scientists; you are now bringing
in the social scientists,
the physicians,
the law enforcement agencies,
to help make sure
that we have a system in
place for making decisions
like who we are going
to interview when
a crime has been committed.
And I know that
many of the things
I'm talking about sound like
frightening, life-and-death-type
situations, but that is
the way the world is going.
We folks who are
helping with data
are also taking
responsibility for helping
life-and-death decisions
make the world better.
In this particular example,
it was an implementation
by one of the master's students
in one of our own data science
master's programs, who got very
interested in the question
of facial recognition in escort
ads on the big search engines,
combining that with
some other information
to do with linking
together pieces
of evidence in a network.
And this is now being deployed
to a few hundred police
stations around the country.
At the moment they're
credited with about a hundred
rescues every year of
folks who are involuntarily
part of some sex
trafficking or escort rings.
So that's both an
exciting outcome,
and once again, it's
not just this gang
of people who built that layer.
They are building
on technologies
further down the stack.
So Eric Xing, who helped with
the very fast implementations
of big deep
learning, he can enjoy
some of the credit for
this excellent use case
because that's what allows
the statistics to run
at a reasonable rate.
Now, I'm actually going to
move outside the Data Science
Initiative now.
And of course, it doesn't
really matter to anyone
whether something is officially
inside or outside it.
But I want to really
emphasize how--
data is wonderful,
but data helps
us model the world
in a way which then
allows us to make decisions.
And at the moment,
we've been talking
about humans making decisions,
like law enforcement agents.
But of course, many
of us and the dream
of Herb Simon and
Allen Newell was
to have automated
systems making decisions.
And there, when we look at
how we structure education
and how we hire faculty,
we break this top area
into the infrastructure
technologies, which are usually
to do with searching and making
inferences over the models
that we've constructed and
then the end use cases.
And it's important that
we distinguish these two
classes of end use case.
The one which people usually
think of when they think of AI
is autonomy, and we
have plenty of examples
of situations where you have
to build an autonomous system.
If you've only got
a quarter of a second
to make a decision
on how you're going
to manage an inevitable
collision in a car,
then the human consciousness
or the human brain
does not have enough cycles to
actually do anything useful,
whereas a computer can easily.
If you eventually find you
need billions of robots
around the world helping
with farming and agriculture
because we have to make
10x more effective use
of the available soil
remaining in the world,
then again, with a
billion robots, you
can't have them
controlled by people.
They have to be able
to act autonomously.
But then there's the
other part, which
is the case where AIs
are just helping humans.
The simple examples that
we're all familiar with
are things like using GPS
to help us navigate around,
or OK, Google, I need
a new pair of sneakers.
Those kinds of things.
So let's talk about
what's exciting in the AI
infrastructure piece first.
These are some of the
active areas which actually
cross among many use cases
and provide that layer down
to the predictive models.
Optimization is-- it's funny.
Optimization is cool again.
If you look at what goes on
with retailers like Amazon,
even folks like Google who are
involved in logistics and then
certainly things
like ride-sharing or
transportation-sharing
systems, suddenly everyone
is looking to hire the
folks who are actually
comfortable building very large
linear programs or integer
programs.
So dust off those 1970s
textbooks on numerical analysis
because they're cool again.
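[For flavor, here is a toy logistics linear program in Python via scipy.optimize.linprog; the warehouses, stores, costs, and capacities are invented, but this is the shape of problem those retailers solve at enormous scale.]

    from scipy.optimize import linprog

    # Variables: x = [w1->s1, w1->s2, w2->s1, w2->s2], cost per unit shipped.
    cost = [4, 6, 5, 3]

    # Each store's demand must be met exactly.
    A_eq = [[1, 0, 1, 0],   # shipments arriving at store 1
            [0, 1, 0, 1]]   # shipments arriving at store 2
    b_eq = [80, 70]

    # Each warehouse can ship at most its stock.
    A_ub = [[1, 1, 0, 0],   # shipments leaving warehouse 1
            [0, 0, 1, 1]]   # shipments leaving warehouse 2
    b_ub = [100, 90]

    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * 4)
    print(res.x, res.fun)  # optimal shipping plan and its total cost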
The next thing which is very
important, and it's an area
where we are investing
because we think
that both the commercial
world and the military world
are under-invested
so far, is safety.
And here's one example.
These are parts of
a military system
called the TARDEC system,
which is a moderately clever AI
system.
It's only moderately
clever because it
has been due to go into
deployment for the last three
years, so it's not the
most advanced thing.
What it's doing is
intelligent convoying.
So you still have a human
driver in the first truck.
In the remaining trucks,
they are autonomous.
And that, of course, is
a less difficult problem
than general autonomous driving.
The thing which has
held us up from release
is it's virtually
impossible to test
using any sort of industrial
or military test procedures.
Why?
Because it's built on modules
which are learning over time.
And at the moment,
we have almost no
known technology,
certainly not in academia,
for coming up with performance
guarantees for something which
is going to be learning
to improve itself.
This is a threat.
I've tried to emphasize
carefully how so much
of what we're doing to
design the next steps
of artificial intelligence
depends on learning,
and yet, in safety-critical
systems up to now,
we've always insisted that
our engineers and architects
and biomedical folks come
up with some kind of proofs
of the safety or
efficacy of that system.
So this is a growth area
for academics to work on.
Some are doing it by
statistical approaches
to carefully designing
experiments to try deliberately
to fool the
machine-learning system.
That gives you one
kind of test guarantee
which is more of a
statistical guarantee.
And the other group are
going back to an area,
which many of us remember
from the past, formal program
proving methods, nowadays
including the geometry and
dynamics of
a learning system,
to show that in the worst case
something bad can't happen.
This image is related to work
by Andre Platzer, who came up
with a proof system for showing
that the new FAA collision
avoidance system, which is
going into production this year,
can never cause a collision.
He tried to prove
it, was unable to,
and then discovered
a whole set of bugs
which have now been fixed.
The next area where
we're still worried
in the world of
artificial intelligence
is what we call the
knowledge network.
When IBM won Jeopardy, it was
based on scraping Wikipedia
and a bunch of other general
knowledge sources for facts
where you can identify
the entities involved
in the question and then have a
reasonably good chance of being
able to look up any question
about the relationship
between pairs of entities.
That was very exciting.
That is exactly the architecture
inside Cortana and Siri
and Google Now at the moment.
But we, outside of
those corporations,
are almost unable to do
anything useful at the moment
because we do not have the big
databases of all the entities
and relationships between
them that we need.
So a big push--
this is actually happening
in a couple of weeks--
in DC is a group of companies
and academics and then
some government departments
that are getting together
to try to take what Google
and Amazon have done
for their respective
knowledge graphs
and start to build them for all
the other parts of the economy,
financial, government
regulations, health
care, education.
So one of the goals here
is that instead of just
being able to ask,
OK, Google, where
is a good place to eat tonight
three miles from here, which
Google can answer
very well for you,
to be able to answer a
question like OK, Google,
is my mother going to be covered
for her pancreas treatment
if she switches to
this new health plan?
And to do that, there's
very creative work
needed to take each of
the entities involved,
put them into a structured
form from which we can then
build the answers.
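[A minimal sketch of the underlying structure: a knowledge graph as (subject, relation, object) triples with a lookup of the kind a question answerer needs. All of the facts and names below are hypothetical.]

    # Tiny triple store; real knowledge graphs hold billions of these.
    triples = {
        ("PlanX", "covers", "pancreatitis_treatment"),
        ("PlanX", "requires", "prior_authorization"),
        ("mother", "enrolled_in", "PlanY"),
    }

    def objects(subject, relation):
        """All objects linked to `subject` by `relation`."""
        return {o for s, r, o in triples if s == subject and r == relation}

    # "Would my mother be covered for her pancreas treatment under PlanX?"
    # Resolve the entities, then check the relation between the pair.
    print("pancreatitis_treatment" in objects("PlanX", "covers"))

[The creative work the talk describes is in building and curating those triples for whole sectors of the economy, not in the lookup itself.]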
Finally, I wanted to
mention game theory.
A similar kind of breakthrough
to AlphaGo happened in February
this year when Tuomas Sandholm's
algorithm managed to beat
the world champions at
no-limit Texas hold 'em poker,
which was unprecedented.
And it put paid once
and for all to the idea
that it's something to do with
assessing human emotions which
you need to play poker well.
Other areas where game
theory is making a big impact
and it's based on
machine-learned models
are in various forms of
exchanges, such as the kidney
exchange, by Tuomas, which are
able to rationally put together
people with different needs in
a way where every participant is
happy that they have the
right solution for them.
This is currently known to have
saved about 300 lives since it
was introduced by far
more efficiently dealing
with the question of, if I
want to donate to my daughter
but I'm not compatible, can
I do a deal with someone
else with a similar problem?
And if that doesn't work, can
we do three, four, five, six,
seven, eight-linked deals
cleverly invented in such a way
that no participant will want
to do a different deal in order
to get a better solution?
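[An illustrative toy in Python: model each incompatible donor-patient pair as a node, draw an edge A -> B when A's donor can give to B's patient, and look for cycles, which are exactly the linked deals described. Fielded kidney exchanges solve this with large integer programs; the compatibility data here is made up.]

    compatible = {
        "pair1": ["pair2"],
        "pair2": ["pair3"],
        "pair3": ["pair1", "pair2"],
    }

    def find_cycles(max_len=3):
        """Brute-force cycle search; rotations of a cycle appear separately."""
        cycles = []
        def extend(path):
            for nxt in compatible.get(path[-1], []):
                if nxt == path[0] and len(path) > 1:
                    cycles.append(list(path))  # every donor gives and gets
                elif nxt not in path and len(path) < max_len:
                    extend(path + [nxt])
        for start in compatible:
            extend([start])
        return cycles

    print(find_cycles())  # e.g. the 3-way swap pair1 -> pair2 -> pair3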
Another important
example: Amy Greenwald,
a faculty member in your
computer science department,
I doubt she even knows
this, but her paper
on how to combat the rather
dubious behavior of things
called penny
auctions, which were
a big scourge on the
internet for a while,
was extremely important as
a game-theoretic analysis
to help figure out better
rules of the game, which
are less predatory.
Because I don't want to
be coughing and wheezing
all over you guys
for too long, I'm
going to quickly finish
up with the top of this
and then jump to our
final conclusions.
Here's a few things.
Autonomy is very
exciting, and this
is an area where there are very
few people with the skills.
The kind of thing which goes
on with autonomy at the moment
is at this level.
You build a system,
in this case,
it's controlling one
side of this soccer team.
And at this point, the
system has made the decision
to send that defender
to that position
in order to get ready
to take another goal,
and it shoots and scores.
So here's what's
fun about this area.
We've got a few hundred
people around Pittsburgh
who know how to build these
systems, and it's an art.
There is no science.
There's no version of software
engineering methodology
for doing this.
Folks know that you compose
an autonomous system
into a series of cascaded
behaviors,
which often talk to
behaviors below them.
And we all mix these
systems together
based on our
experience in the past.
So the big challenge here
is how to systematize this
so that we can have two or
three orders of magnitude
more people able to participate
in building autonomy.
The final area is things
involving humans' lives
being made directly better
by good advice from AIs.
And of course, this
is the big competition
which Google, Microsoft,
and Amazon are fighting over
with each other for their
spoken dialogue systems
to turn out to be definitively
better than the others
so that they can start to be the
entry point for people who want
to use artificial intelligence.
Among these kinds of things
are these awful-looking snake
robots, which are now
actually performing
surgery where they
insert themselves
through pretty much any orifice
they choose to very quickly get
to a place where
they actually do
need to manipulate the
tissue, usually to take parts out
or implant things.
Jeff Huang, another of
your computer scientists,
has been leading the way
for understanding eye
tracking so that as we're
building systems where we're
trying to make people's lives
better through the way they're
using technology,
we can actually
try to figure out, for example,
how the mouse, their finger,
their eyes, and I hope
eventually their expressions
can help us more
quickly optimize
the performance of these things.
Now, what I want to
do now, I'm going
to finish up with comments on
these four fronts, the things
which keep me awake at night.
So first one, in talent.
There is a desperate shortage
of people with the skills
to work inside this large stack.
So really, godspeed to
you guys for building
this new master's program,
and my recommendation
is don't hesitate.
Grow this as quickly as you can
to a thousand students a year.
Here's one of the
issues, and it's just
in one area: network security.
When we're in competition
with the other major powers--
and of course, China
is the one that we're
most concerned with--
it is very asymmetric
at the moment.
We have probably 2%
or 3% as many folks
trained to manually deal
with finding exploits,
jumping in and taking
over systems, and defending
ourselves from that happening,
as the other side of the Pacific.
Literally, we're training
hundreds of people
a year to be experts
in this, where
we believe China is training
tens of thousands of people.
So the place where we're
putting our big bet
is in the automation of this.
We're never going to
win as a numbers game,
so what we're trying to do
with our much smaller number
of engineers is
build the systems
which are automating the
behavior of these other folks.
So that is part of what's
happening in these talent wars:
we've got far
fewer resources,
and so we're trying
to deploy them
on an extra level of
autonomy and automation
in what we're doing.
Next, I want to show you this
example of a nice product.
It's pretty simple.
It's a nice climbing wall system
where it watches your progress,
and as time goes
by, it gives you
challenges, which
makes it fun to try
doing different
things on the wall
to exercise different skills.
So it's good.
It's one of hundreds of exciting
new products using computer
vision and some AI planning.
The punchline of this, why
this is an important example,
is this was built by a set of
students who couldn't program
12 weeks before.
And one thing that
we're learning
is that we don't have to
rely on people taking
seven or eight
years of training
through undergraduate
and graduate studies
in order to participate in this.
Thanks to the design
of a stack, which
has component technologies,
nowadays if you say--
let me show you another
example of this.
I can talk while I'm showing it.
Someone says it would
be interesting to try
to train people in sign
language the same way
that we train people to
dance in Dance Dance Revolution.
They don't have to
do a PhD to do that.
They can download
the limb tracker
and the front-end systems
for putting the computation
into the cloud and
build this, again,
after modest
amounts of education
in how to build these systems.
So this is another, I think,
inspiring thing when we're
talking about the talent wars.
The fact that we
have this stack means
that you don't have to
be an expert at everything
in order to design and
create new products.
And really the
punchline that I'm
trying to really push
heavily in this talk
is that we all
depend on each other.
This is why it's
so nice that you've
taken superstar departments
within your university
and actually started to
have them working together
because it's a complicated
world when you have
some people dealing with
kernels of operating systems,
other people doing sociological
questions of how you
make decisions, and other people
building autonomy on top of it.
No one can do all of this.
You have to have
multi-disciplinary
or as was described
earlier, the double rainbow
of transdisciplinary
awesomeness or whatever it was.
There's one important
question which
we don't know the
answer to:
how should we be training
folks in this talent war?
Some universities have taken
the very rational approach
of saying, let's do
horizontal slices in here.
We're going to have a
master's in computational data
science, which is
all about those lower
layers of the stack.
And so maybe Phillipe
does that while Subha
works on the infrastructure.
On the industry's side,
when you're building a team,
you usually want to have a team
of only four or five people
to build any piece of
software technology.
You really get diminishing
returns after that.
But when you choose
your team, you really
want people who can work
together and bring together
complementary skills.
Part of the problem
with this model
is that unless your
team is very large,
you're not going to
cover the whole stack.
Maybe they can go and take
off-the-shelf components
from it, but this is not
necessarily the ideal shape.
This is producing a
bunch of specialists.
Another approach is to try to do
big courses or master's programs
which cover the whole stack,
a little bit of everything.
Then you get folks
who can work together,
but you don't
necessarily, especially
if you're fighting
against another company
or another government
with very strong people,
you don't necessarily have
the skills ready for that.
So this, which I'm sure many
of you have seen before,
this notion of
T-shaped skills is
what we on the software
management side love to see.
If we can find people who
have been trained enough
that they can respect
the layers of the stack
above and below them, willing
to talk their language
and work with them, but have
their own specialization,
that is wonderful.
If I have to put
together a computer
scientist and a statistician
who don't really
know about each other and
maybe the statistician thinks,
well, that person just hacks
up stuff but doesn't really
understand the math.
And the computer scientist
says, this person
just talks in MATLAB.
How can I even deal
with them? That's bad.
If on the other
hand, you have folks
who understand enough
to respect each other,
take advantage of each
other's strengths,
then you can build a good team.
Next question.
A very interesting one, and if
you look at the VC investment,
you will see this phenomenon.
We're not sure
whether the next wins
near the bottom of the
stack are going to come
from hardware or algorithms.
For a long time, it was
algorithms winning each round.
And algorithms include
the use of category theory
and other very nice
mathematical constructs
from graph theory and topology.
What we've discovered
now though is
that with the birth
of deep learning,
we actually have the problem
that despite everything we've
tried, it's the fast
hardware which is winning.
So we have to see
how that one ends up.
Moving onwards to privacy, I
can't encourage us enough
to work on this.
One of our faculty
members, Matt Fredrikson,
has shown in a series of
very careful studies of FDA
experiments that the extent
to which we increase privacy
controls is good
for privacy, but it
is costing, in some
cases, thousands of lives.
And so the analysis of
the actual privacy tradeoffs,
rather than just assuming
that we want to have 0%
privacy or 100% privacy,
is an important question
which is underinvestigated
at the moment.
And finally, I didn't have
time to do this topic justice,
but the question of using these
autonomous and AI assistants
in the natural sciences
is very exciting.
Part of the reason
this is so important
is that between them,
the internet companies
have invested tens of billions
into the question of how
to most efficiently
gain market share,
get people targeted to
the right advertisements,
and create advertising campaigns
which have good outcomes.
That is greater than the
investment by the National
Science Foundation
and NIH in dealing
with scientific
experiments and how
to most efficiently
optimize those questions.
If we take the
sorts of technology
that so many of us in industry
have used for optimizing
campaigns and apply that
to optimizing compounds,
we're very excited about what
the future is going to be like.
So that's the whole story.
I was trying to
show you the map,
show you how we've joined
everything together,
really emphasize the beauty of
what you're doing here at Brown
to bring a bunch of disciplines
together, and express my
appreciation for that work.
These references to
everything you've seen
will be available on my
website on the FAQ page
if you want to follow
up on any of this stuff.
But in the meantime, that's it.
Thank you very much.
JEFFREY BROCK: I
don't know if you have
time for a question or two.
ANDREW MOORE: If the audience has
time, we might as well move on
to questions.
JEFFREY BROCK: So if
there's any quick questions.
Yeah, Chris.
AUDIENCE: You'll repeat it?
JEFFREY BROCK: Yes.
AUDIENCE: So you
have a stack there.
And at the bottom of the stack was
basically data acquisition,
and there were metrics that
you could choose to look at.
It's almost like the
subconscious of the system.
How do you make sure that
the subconscious is representative
enough of what's out there?
That kind of depends upon what
people choose to look at, what
sorts of things they choose
to look at, and what metrics
actually get designed.
So how do you--
JEFFREY BROCK: Do you want
me to repeat that, Chris?
AUDIENCE: Well, [INAUDIBLE]
ANDREW MOORE: This is a
really important question.
That is, if we work so
hard on decide and act,
but our perception is actually
tracking the wrong things,
then we could be making
very serious mistakes.
A great example of this is
actually in the web industry
where so much of our work
was based on the assumption
that if people
click on something,
we've shown them
something useful.
And that was a perfect example:
the one thing you can easily
get out of web analytics
is which URLs people
have clicked on.
So you can optimize with that.
But it didn't take long before
we started to realize, wait,
a short click followed
by a Back button
is probably a really bad thing.
So we should figure out
what to do about that.
But then we have to say, but
is it a really bad thing?
How short is bad?
Then we have to use machine
learning at this lower level,
taking what we can observe
in user experience testing
to try to figure out what
makes folks happy or unhappy.
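To make that concrete, here is a
minimal sketch of learning click
quality from dwell time; the
features, labels, and numbers are
hypothetical, not any company's
actual signals.

    # Learn "was this click satisfying?" from simple behavioral features
    # instead of hard-coding a dwell-time threshold.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(2)
    n = 1000
    dwell = rng.exponential(30.0, size=n)       # seconds on the page
    scroll = rng.uniform(0.0, 1.0, size=n)      # fraction of page scrolled
    X = np.column_stack([dwell, scroll])

    # Proxy labels, as if from a user-experience study: satisfied clicks
    # tend to dwell longer and scroll further.
    y = (dwell + 20.0 * scroll + rng.normal(0, 5, size=n) > 25.0).astype(int)

    model = LogisticRegression().fit(X, y)
    print(model.predict_proba([[3.0, 0.05]])[0, 1])   # short click, quick Back
    print(model.predict_proba([[90.0, 0.80]])[0, 1])  # long, engaged read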
In the big companies, which now
include transportation companies,
car companies, and heavy
equipment companies, there are
usually groups called user
happiness groups or user
experience groups, devoted
entirely to figuring out what is
making our customers happy
and want to return.
And in some cases, even the
question of how you make customers
return frequently to your service
is not necessarily about doing
what the customer needs.
AUDIENCE: So [INAUDIBLE] That's
the cause and effect access
to the internet and
actively using the internet.
There's probably a whole
pyramid of folks below them
that didn't have
access, so you weren't
compromising [INAUDIBLE]
with them necessarily.
So that's the sort of
thing that you wonder
about when you're building.
That's the hard-based
stuff that [INAUDIBLE].
ANDREW MOORE: Yes.
So it would really help a lot
of our products and services
if we could get an actual
measure of how much serotonin
is in your brain right now.
And it would really help me in
designing useful talks to be
able to, afterwards,
analyze a video of everyone
and say, this
person was smiling,
but they were smiling
with pain at this point.
And so getting that extra
data would be great.
A hellish possible future is
one where the large corporations
and the governments have that
data, and the rest of us don't.
AUDIENCE: Andrew, the project
that you mentioned, this open
knowledge graph, I think
that's super exciting.
And there's this piece that you
will be able to accomplish by
using information from the public
domain, where you build this
[INAUDIBLE] to the relationships
at a very large scale.
But in order to do
machine learning right,
you also need the
private information,
which is something that
Facebook and Google have,
and we don't have.
So is there a plan to somehow
get some sample of private
information so that you
can put the two pieces together
and build something?
ANDREW MOORE: Yes.
So for a real knowledge
network, every record
has to contain provenance
information about who owns it
and what their data
rights are for using it.
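As a sketch of what such a record
might look like (the field names
and rights vocabulary here are
illustrative assumptions, not the
actual design):

    # Every fact carries its owner and the rights under which it may be used.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Fact:
        subject: str
        predicate: str
        obj: str
        owner: str                        # who contributed this record
        rights: frozenset = frozenset()   # e.g. {"public", "research"}

    def usable(fact: Fact, purpose: str) -> bool:
        """A query layer would check rights before returning a fact."""
        return purpose in fact.rights

    allergy = Fact("PeanutBar", "contains_allergen", "peanut",
                   owner="RetailerX", rights=frozenset({"public", "research"}))
    sales = Fact("PeanutBar", "weekly_units_sold", "1432",
                 owner="RetailerX", rights=frozenset({"commercial"}))

    print(usable(allergy, "research"))   # True: openly publishable
    print(usable(sales, "research"))     # False: commercially restricted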
A frequent comment from the web
companies is, we would like to
supply such and such information.
For example, for every food
product that Amazon sells, it
would love to have an open way
to publish all the allergy
reactions and other information
like that, which can be
for the public good.
On the other hand, there's
commercially important information
which folks have suffered and
spent a lot of money to obtain,
such as artful images of flowing
dresses that you're about to buy,
where folks are still going
to want to retain data rights.
Then there is probably other
information which is very
valuable, such as how often
14-to-18-year-olds visit this
page three times in a
one-day session.
That's really important
commercially, but it's probably
not the main thing needed for
the big knowledge graph
types of questions.
So primarily, I think there are
the basic facts, and then we hope
there is an industry of folks
selling proprietary versions of
entities decorated with
further facts.
AUDIENCE: Yeah, [INAUDIBLE]
AUDIENCE: [INAUDIBLE]
So I wanted to ask you, what do
you see in the foreseeable
future for safety?
Are we going to be getting
answers primarily from empirical
testing, or are we going to be
able to start saying that
something that was built via such
and such a predictive model,
et cetera, has a chance?
In other words, we don't
understand how neural networks,
I mean, deep learning,
actually work.
I mean, we do, but we don't.
There's no way to predict whether
a system that I have is
even monotone.
So it could go off the deep end
at any time, but theoretically,
we don't understand any of it.
I see this in
radiology, and everybody
is scratching their heads.
I mean, I think the thing you
mentioned with airplanes, testing
them and so on, is how people
are going at it.
But that's a very high-dimensional
situation.
Are we going to get some kind
of theoretical understanding
of some of this,
and if it doesn't
look like it's in the
cards for the next 5 years
or 10 years, what
is good enough?
ANDREW MOORE: That's a
really difficult question.
I would love to spend
15 minutes answering.
Let me just mention a few
quick things about it.
If you can simulate the world of
training examples, for example,
with a good physics-based
generator of scenes for testing
out a self-driving car, is it able
to notice a pothole in the road?
That is a case where the physics
is good enough that someone can
write a full testing simulation
to generate images to stress
the system.
And of course, there are many
adversarial methods where someone
can write a network to try to
come up with plausible images
which are most likely to confuse
the primary learning system.
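Here is a minimal sketch of that
adversarial idea, shown on a
simple linear model rather than a
deep network; the FGSM-style step
nudges each input coordinate in
the direction that most increases
the model's loss. The weights and
input are synthetic.

    # Fast-gradient-sign-style attack on a logistic model.
    import numpy as np

    rng = np.random.default_rng(3)
    w = rng.normal(size=20)          # stand-in "trained" weights
    x = 4.0 * w / (w @ w)            # an input the model scores near 1
    y = 1.0                          # its true label

    def p_hat(v):
        return 1.0 / (1.0 + np.exp(-(w @ v)))   # predicted P(y=1)

    # For logistic loss, the gradient with respect to the input is (p - y) * w.
    grad_x = (p_hat(x) - y) * w
    epsilon = 0.5
    x_adv = x + epsilon * np.sign(grad_x)   # worst-case nudge per coordinate

    print("clean  P(y=1):", round(float(p_hat(x)), 3))
    print("attack P(y=1):", round(float(p_hat(x_adv)), 3))

On a deep network, the same recipe
applies with the gradient supplied
by automatic differentiation.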
So I think that in an environment
where we're able to say we can
model the set of possible worlds,
I'm pretty happy.
But if you are
doing something, say
on the internet at
the moment to predict
who's going to buy
what, for example,
it's actually not
uncommon for some big news
event or a hurricane
or Halloween happening
on a Saturday to
give a completely
different set of data than
anyone has seen before.
That can result in
poor predictions,
but that's not such a bad thing.
But it can also result in the
stochastic gradient updates
starting to go crazy under some
circumstances, and then you
actually have a big problem.
That's OK if your
website humorously
suggests a stupid
product for a few days.
It's absolutely not OK if
suddenly a whole bunch of
vehicles' emergency braking
systems misbehave across
the country.
So basically there's
an open question there.
In the world of non-safety-critical
systems, the big internet
companies have very good empirical
methods for detecting and
diagnosing when a machine-learning
system is going crazy.
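As an illustration of that kind
of empirical monitoring, here is
a minimal sketch that compares a
day's prediction scores against a
trusted baseline with the
Population Stability Index; the
distributions and the alert
threshold are illustrative
assumptions.

    # Alarm when today's score distribution drifts from the baseline.
    import numpy as np

    def psi(baseline, recent, bins=10, eps=1e-6):
        """Population Stability Index between two score samples."""
        edges = np.quantile(baseline, np.linspace(0, 1, bins + 1))
        b, _ = np.histogram(baseline, edges)
        r, _ = np.histogram(recent, edges)
        b = b / b.sum() + eps
        r = r / r.sum() + eps
        return float(np.sum((r - b) * np.log(r / b)))

    rng = np.random.default_rng(4)
    baseline = rng.beta(2, 5, size=10000)    # scores on ordinary days
    normal_day = rng.beta(2, 5, size=1000)
    weird_day = rng.beta(5, 2, size=1000)    # e.g. Halloween on a Saturday

    for name, day in [("normal", normal_day), ("weird", weird_day)]:
        score = psi(baseline, day)
        flag = "ALERT: investigate/rollback" if score > 0.25 else "ok"
        print(f"{name}: PSI={score:.3f} -> {flag}")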
I would claim that they are
several years ahead of us in
academia at doing that, but no
one, to my knowledge, has been
able to come up with a proof of
safety even for linear regression,
let alone a deep network.
JEFFREY BROCK: With that--
yes, so I'd like to thank
Andrew Moore once again.
And I'd also like to
thank the Arts Initiative
for allowing us
to have this here
and to host the reception
up in Studio 1, which
is just upstairs.
And lastly, I'd like to thank
Joe Hogan and the organizing
committee for putting
together a great event today.
And I'm really grateful to
them for all their work.
So let's thank Andrew again.
