
SPEAKER: This is CS50.
[MUSIC PLAYING]
DAVID MALAN: Hello world.
This is the CS50 podcast.
My name is David Malan.
And I'm here with CS50's
own Colton-- no, Brian Yu.
BRIAN YU: Hi everyone.
DAVID MALAN: So Colton could
no longer be here today.
He's headed out west.
But I'm so thrilled that
CS50's own Brian Yu is, indeed,
now with us for our discussion
today of machine learning.
This was the most asked about
topic in a recent Facebook
poll that CS50 conducted.
So let's dive right in.
Machine learning is certainly all
over the place these days in terms
of the media and so forth.
But I'm not sure I've really
wrapped my own mind around what
machine learning is and what its
relationship to artificial intelligence
is.
Brian, our resident expert, would you
mind bringing me and everyone up to speed?
BRIAN YU: Yeah, of course.
Machine learning is
sometimes a difficult topic
to really wrap your
head around, because it
comes in so many different
forms and different shapes.
But, in general, when I think
about machine learning, the way

I think about it is how a
computer is performing a task.
And usually when we're programming
a computer to be able to do a task,
we're giving it very explicit
instructions-- do this.
And if this is true, then do
that or do this some number
of times using a for loop, for example.
But in machine learning, what we do
is, instead of giving the computer
explicit instructions
for how to do something,
we, instead, give the
computer instructions for how
to learn to do something on its own.
So instead of giving it instructions
for how to perform a task,
we're teaching the computer
how to learn for itself
and how to figure out how to
perform some kind of task on its own.
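The distinction Brian draws can be sketched in a few lines of Python. The spam rule and the toy numeric data below are invented for illustration, not from the episode: the first function encodes a rule the programmer wrote by hand, while the second learns a rule (a threshold) from labeled examples.

```python
# Explicit instructions: the programmer hard-codes the rule.
def is_spam_explicit(subject):
    # We decide ourselves exactly what counts as spam.
    return "free money" in subject.lower()

# Machine learning: we instead give the computer a procedure
# for learning a rule from labeled examples.
def learn_threshold(examples):
    """Learn a threshold separating two classes of numbers.

    examples: list of (value, label) pairs, label True/False.
    Returns the midpoint between the largest False value and
    the smallest True value -- a tiny one-dimensional "classifier".
    """
    falses = [v for v, label in examples if not label]
    trues = [v for v, label in examples if label]
    return (max(falses) + min(trues)) / 2

data = [(1, False), (2, False), (3, False), (7, True), (8, True), (9, True)]
threshold = learn_threshold(data)
print(threshold)      # 5.0 -- the rule was learned, not written by us
print(6 > threshold)  # True -- a new value classified with the learned rule
```

The point is only the shape of the two approaches: in the first, the knowledge lives in the code; in the second, it lives in the data.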
DAVID MALAN: And I do feel like
I hear about machine learning
and AI, artificial intelligence,
almost always in the same breath.
But is there a distinction
between the two?
BRIAN YU: Yeah, there is.
So artificial intelligence, or AI,
is usually a little bit broader.
It's used to describe any situation
where a computer is acting rationally
or intelligently.
Machine learning is a
way of getting computers
to act rationally or intelligently
by learning from patterns

and learning from data and being
able to learn from experiences.
But there are certainly forms of AI--
of being able to act intelligently--
that don't require the computer to
actually be able to learn, for example.
DAVID MALAN: OK.
And I feel like I've certainly heard
about artificial intelligence, AI,
especially for at least 20 years, if
not 30 or 40, especially in the movies
or anytime there's some
sort of robotic device.
Like, artificial intelligence has
certainly been with us for some time.
But I feel like there's
quite the buzz around machine
learning, specifically these days.
So what is it that has changed in
recent months, recent years that
put this at the top of this poll,
even among CS50's own students?
BRIAN YU: Yeah, so a couple of
things have changed, certainly.
One has definitely been just an
increase in the amount of data
that we have access to-- the big
companies that have a lot of data
from people on the internet that are
using devices and going on websites,
for instance.
There's a lot of data that
companies have access to.
And as we talk about
machine learning, you'll
soon see that a lot of the way that
these machine learning algorithms work
is that they depend upon
having a lot of data

from which to draw understanding
from and to try and analyze
in order to make predictions or
draw conclusions, for example.
DAVID MALAN: So, then-- and I
have more familiarity myself
with networking and hardware
and so forth-- is it fair to say
that because we just have so much
more disk space available to us now,
and such higher CPU rates at which
machines can operate, that's partly
what's driven this-- that we now
have the computational abilities
to answer these questions?
BRIAN YU: Yeah, absolutely.
I would say that's a
big contributing factor.
DAVID MALAN: So if we
go down that road, like,
at what point are the algorithms
really getting fundamentally
smarter or better, as opposed to
the computers just getting so darn
fast that they can just
think so many steps ahead
and just come up with a compelling
answer to some current problem quicker
than, say, a human?
BRIAN YU: Yeah, it's a good question.
And the algorithms that we have
right now tend to be pretty good.
But there's a lot of
research that's happening
in machine learning
right now about like,
trying to make these algorithms better.
Right now, they're pretty accurate.
Can we make them even more accurate,
given the same amount of data?
Or even given less data--
can we make our algorithms

able to perform tasks
just as effectively?
DAVID MALAN: OK, all right.
Well, so I feel like the
type of AI or machine
learning that I grew up with
or knew about or heard about
was always related to, like, games.
Like, chess was a big one.
I knew Google made a big splash
with Go some years ago-- the game,
not the language-- and then
video games more generally.
Like, if you ever wanted to play
back in the '80s against the "CPU,"
quote, unquote, I'm pretty sure it
was mostly just random at the time.
But there's certainly
been some games that
are ever more sophisticated
where it's actually
really hard to beat the computer or
really easy to beat the computer,
depending on the settings you choose.
So how are those kinds
of games implemented when
there's a computer playing the human?
BRIAN YU: Yeah, so this is an area
of very rapid development
in the last couple of decades.
30 years ago, it was probably unimaginable
that a computer could beat a
human at chess, for example.
But now, the best computers can
easily beat the best humans.
No question about it.
And one of the ways that you do this
is via a form of machine learning known
as reinforcement learning.

And the idea of this is just letting
a computer learn from experience.
So if you want to train a
computer to be good at chess,
you could try to give
it instructions-- thinking
of strategies yourself, as the human,
and telling them to the computer.
But then the computer can only
ever be as good as you are.
But in reinforcement
learning, what we do is,
you let the computer play
a bunch of chess games.
And when the computer loses, it's
able to learn from that experience,
figure out what it did, and then, in
the future, know to do less of that.
And if the computer wins, then whatever
it did to get to that position,
it can do more of that.
And so you imagine just having
a computer play millions
and millions and millions of games.
And eventually, it starts to build
up this intelligence, so to speak,
of knowing what worked
and what didn't work.
And so, in the future, it's
able to get better and better
at playing this game.
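The play-many-games, do-more-of-what-won loop Brian describes can be sketched as a toy simulation. The "game" below is invented: each move wins with a hidden probability the learner doesn't know, a stand-in for chess positions it can't evaluate directly.

```python
import random

random.seed(0)

# Hidden win probabilities -- the learner never sees these directly.
WIN_PROB = {"aggressive": 0.7, "defensive": 0.3}

def play(move):
    """Play one game; return True on a win."""
    return random.random() < WIN_PROB[move]

wins = {m: 0 for m in WIN_PROB}
plays = {m: 0 for m in WIN_PROB}

# Let the computer play many games, trying both moves, and keep
# score of what worked and what didn't.
for _ in range(10_000):
    move = random.choice(list(WIN_PROB))
    plays[move] += 1
    if play(move):
        wins[move] += 1

# "Do more of that": prefer the move with the best observed win rate.
best = max(WIN_PROB, key=lambda m: wins[m] / plays[m])
print(best)  # with enough games, this converges on "aggressive"
```

Real game-playing systems are far more elaborate (they evaluate positions, not just whole games), but the core feedback loop is the same: play, observe the outcome, shift toward what won.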
DAVID MALAN: So is this all that
different from even the human
and the animal world
where, like, if humans
have tried to domesticate animals
or pets where you sort of reinforce
good behavior positively and negatively
reinforce, like, bad behavior?
I mean, is that essentially what
we're doing with our computers?
BRIAN YU: Yeah, it's
inspired by the same idea.

And when a computer does something
right or does something that works,
you give the computer a reward-- so to
speak-- which is what people actually call it.
And then there's the penalty if the
computer isn't able to perform as well.
And so you just train the computer
algorithm to maximize that reward,
whether that reward is the result of
like winning a game of chess or a robot
being able to move a
certain number of paces.
And the result is that
with enough training,
you end up with a computer that
can actually perform the task.
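The reward-and-penalty training Brian describes can be sketched as a simple value-estimating learner. The actions and reward probabilities below are hypothetical (a toy robot, not any real system): the computer keeps a running estimate of each action's reward and mostly picks the best one, while still exploring occasionally.

```python
import random

random.seed(1)

def reward(action):
    """+1 reward for a successful step, -1 penalty otherwise.

    The success probabilities are made up for this sketch.
    """
    p = {"step_forward": 0.8, "spin": 0.2}[action]
    return 1 if random.random() < p else -1

value = {"step_forward": 0.0, "spin": 0.0}  # learned reward estimates
count = {"step_forward": 0, "spin": 0}
EPSILON = 0.1  # explore 10% of the time, exploit otherwise

for _ in range(5_000):
    if random.random() < EPSILON:
        action = random.choice(list(value))    # explore
    else:
        action = max(value, key=value.get)     # exploit best so far
    r = reward(action)
    count[action] += 1
    # Incremental average: nudge the estimate toward the new reward.
    value[action] += (r - value[action]) / count[action]

print(max(value, key=value.get))  # the action with the higher learned value
```

This is the "maximize that reward" idea in miniature: the estimates for the two actions drift toward their true average rewards, and the learner's behavior follows.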
DAVID MALAN: Fascinating.
So I feel like another
buzzword these days is, like,
smart cities, where, somehow, cities
are using computer science
and using software in more sophisticated ways.
And I gather that you can even
use this kind of reinforcement
learning for, like, traffic
lights, even in our human world?
BRIAN YU: Yeah.
So traffic lights traditionally
are just controlled by a timer
that after a certain number of
seconds, the traffic light switches.
But recently, there's been growth in,
like, AI-controlled traffic lights
where you have traffic lights that
are connected to radar and cameras.
And that can actually
see, like, when the cars
are approaching in different places--
what times of day they tend to approach.

And so you can begin to,
like, train an AI traffic
light to be able to predict,
all right, when should I
be switching lights and maybe even
having traffic lights coordinated
across multiple intersections
across the city to try
and figure out what's the best
way to flip the lights in order
to make sure that people are able to
get through those intersections quickly.
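One naive way a light might act on such observations can be sketched as follows. The car counts are made up, and real adaptive signals use far more sophisticated prediction and coordination; this only shows the basic idea of letting observed demand, rather than a fixed timer, set the timing.

```python
# Observed cars waiting at each approach -- a stand-in for the
# radar and camera data an adaptive traffic light might collect.
observed = {"north_south": 12, "east_west": 3}

def green_seconds(counts, cycle=60):
    """Split a fixed signal cycle in proportion to observed demand."""
    total = sum(counts.values())
    return {direction: cycle * n / total for direction, n in counts.items()}

print(green_seconds(observed))  # {'north_south': 48.0, 'east_west': 12.0}
```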
DAVID MALAN: So that's
pretty compelling,
because I'm definitely in
Cambridge, been, like, in a car
and stopped at a traffic light.
And there's, like, no one around.
And you wish it would just
notice either via sensor or timer
or whatever that, like, this is
clearly not the most efficient use of,
like, anyone's time.
So that's pretty amazing that it could
adapt sort of seamlessly like that.
Though, what is the
relationship between AI
and the buttons that we
humans push to cross the street--
which, according to
various things I've read,
are actually placebos and don't
actually do anything, and, in some cases,
aren't even connected to wires?
BRIAN YU: I'm not actually sure.
I've also heard that
they may be placebos.
I've also heard that, like, the
elevator close button is also
a placebo-- that you press it,
and it sometimes doesn't actually work.
DAVID MALAN: Yes, I've read
that, even-- though from a not
necessarily authoritative source.
There is, like, a photo
where someone showed

a door close button had fallen off.
But there was nothing behind it.
Now, it could have been Photoshopped.
But I think there's evidence
of this, nonetheless.
BRIAN YU: It might be the case.
I don't think there's
any AI happening there.
But I think it's more just psychology
of the people and trying to make people
feel better by giving
them a button to press.
DAVID MALAN: Do you push the button
when you run across the street?
BRIAN YU: I do usually push the button
when I want to cross the street.
DAVID MALAN: This is such a big scam,
though, on all of us it would seem.
BRIAN YU: Do not push the button?
DAVID MALAN: No, I do,
because just, what if?
And actually it's so
gratifying, because there's
a couple places in
Cambridge, Massachusetts
where the button legitimately works.
When you want to cross the
street, you hit the button.
Within half a second, it
has changed the light.
It's the most, like,
empowering feeling in the world
because that never happens.
Even in an elevator, half the time
you push it, like, nothing happens,
or eventually it does.
And it's very good positive reinforcement
to see the traffic light change.
I'm very well behaved at
traffic lights as a result.
OK, so more recently,
I feel like, computers
have gotten way better at some
technologies that kind of sort of
existed when I was a kid,
like, handwriting recognition.

There was the Palm
Pilot early on, which was,
like, a popular PDA, or personal digital
assistant, which has now been replaced
with Androids and iPhones and so forth.
But handwriting recognition is a
biggie for machine learning, right?
BRIAN YU: Yeah, definitely.
And this is an area that's
gotten very, very good.
I mean, I recently have
just started using an iPad.
And it's amazing that I can
be taking handwritten notes.
But then my app will let me,
like, search through them by text--
it will look at my
handwriting and convert it to text
so that I can search through it all.
It's very, very powerful.
And the way that this
is often working now
is just by having
access to a lot of data.
So, for example, if you
wanted to train a computer
to be able to recognize handwritten
digits, like, digits on a check
that you could deposit
virtually now, like,
my banking app can
deposit checks digitally.
What you can do is give the
machine learning algorithm
a whole bunch of data, basically
a whole bunch of pictures
of handwritten numbers
that people have drawn
and labels for them associated
with what number it actually is.
And so the computer can learn from
a whole bunch of examples of here

are some handwritten ones, and
here are some handwritten twos,
and here's some handwritten threes.
And so when a new handwritten
digit comes along,
the computer just learns from
that previous data and says,
does this look kind of like the ones,
or does it look more like the twos?
And it can make an assessment
as a result of that.
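The "does this look more like the ones or the twos" comparison Brian describes is, at its simplest, nearest-neighbor classification. A toy sketch in Python, with made-up 3x3 bitmaps standing in for scanned handwritten digits:

```python
# Tiny labeled "images": 3x3 bitmaps, invented here for illustration.
ONES = [
    (0, 1, 0,
     0, 1, 0,
     0, 1, 0),
    (0, 0, 1,
     0, 0, 1,
     0, 0, 1),
]
TWOS = [
    (1, 1, 1,
     0, 1, 0,
     1, 1, 1),
    (0, 1, 1,
     0, 1, 0,
     1, 1, 0),
]
labeled = [(img, "1") for img in ONES] + [(img, "2") for img in TWOS]

def distance(a, b):
    # How many pixels differ between two images.
    return sum(x != y for x, y in zip(a, b))

def classify(img):
    # "Does this look more like the ones or the twos?" --
    # take the label of the closest training example.
    return min(labeled, key=lambda pair: distance(img, pair[0]))[1]

new_digit = (0, 1, 0,
             0, 1, 0,
             0, 1, 1)   # a slightly sloppy vertical stroke
print(classify(new_digit))  # "1"
```

Production systems like the check-depositing apps Brian mentions typically use neural networks rather than raw pixel comparison, but the supervised setup is the same: labeled examples in, a judgment about new examples out.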
DAVID MALAN: So how come
we humans are sometimes
filling out those little CAPTCHAs--
the little challenges on websites
where they're asking us, the humans,
to tell them what something says?
BRIAN YU: Yeah.
Part of the idea is that the CAPTCHAs
are trying to prove to the computer
that you are, in fact, human.
They're asking you to
prove that you're a human.
And so they're trying to give you a
task that a computer might struggle
to do, for instance, like, identify
which of these images happened to have,
like, traffic lights
in them, for example.
Although, nowadays, computers
are getting pretty good at that--
using machine
learning techniques,
they can tell which of
them are traffic lights.
DAVID MALAN: Yeah, exactly.
I would think so.
BRIAN YU: And I've also
heard people talk about this--
I don't know, actually,
if this is true-- that you
can use the results of these CAPTCHAs
to actually train machine learning

algorithms that when you are
choosing which of the images
have traffic lights in them,
you're training the algorithms that
are powering, like,
self-driving cars, for instance,
to be able to better assess whether
there are traffic lights in an image,
because you're giving
more and more of this data
that computers are able to draw from.
So we've heard that too.
DAVID MALAN: It's interesting
how these algorithms
are so similar to presumably
how humans work, because,
like, when you and I learned how to
write text, whether it was in print
or cursive, like, the teacher just shows
us, like, one canonical letter A or B
or C. And yet, obviously,
like, every kid in the room
is probably drawing that A or B
or C a little bit differently.
And yet, somehow, we humans just kind
of know that that's close enough.
So is it fair to say, like, computers
really are just kind of doing that?
They are just being
taught what something is
and then tolerating variations, thereof?
BRIAN YU: Yeah, that's
probably about it.
One of the inspirations
for machine learning
really is that the types of
things that computers are good at
and the types of things
that people are good at
tend to be very, very different.

But, like, computers can very easily
do complex calculations, no problem,
when we might struggle with it.
But a problem like, say,
identifying whether, in a picture,
there is a bird in the
sky or not, for example?
That's something that for a long time,
computers really struggled to do,
whereas, it's easy for a child to be
able to look in the sky and tell you
if there's a bird there.
DAVID MALAN: Oh, I was just going
to say, I could do that probably.
OK.
So if this is supervised learning,
and handwriting recognition's one,
like, what other types of
applications fall under this umbrella?
BRIAN YU: Yeah, so
handwriting recognition
counts as supervised learning,
because it's supervised in the sense
that when we're providing
data to the algorithm,
like the handwritten numbers, we're also
providing labels for that data, like,
saying, this is the number
one-- this is the number two.
That way, the computer is
able to learn from that.
But this shows up all over the place.
So, for instance, like,
your email spam filter
that detects automatically
which emails are spam
and puts them in the spam mailbox,
it's trained the same way.
You basically give the computer a
whole bunch of emails-- some of which
you tell the computer these are
real emails that are good emails.
And here, these are some other
emails that are spam emails.

And the computer tries to learn the
characteristics and the traits of spam
email so that when a
new email comes about,
the computer is able to
make a judgment call about,
do I think this is a nonspam,
or do I think it's a spam email?
And so you could get it to
classify it in that way.
So this kind of classification
problem is a big area in supervised learning.
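The spam filter Brian describes is a classic supervised classifier. A toy Naive Bayes version of the idea (a sketch for illustration, not the episode's code, and nothing like a production filter) just counts how often words show up in labeled spam versus good mail:

```python
from collections import Counter
import math

def train(labeled_emails):
    """Count word frequencies per label from (text, label) pairs."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()
    for text, label in labeled_emails:
        counts[label].update(text.lower().split())
        totals[label] += 1
    return counts, totals

def classify(text, counts, totals):
    """Return the label with the higher Naive Bayes log-probability."""
    vocab = len(set().union(*counts.values()))
    scores = {}
    for label in counts:
        # Prior: how common this label is overall.
        score = math.log(totals[label] / sum(totals.values()))
        n = sum(counts[label].values())
        for word in text.lower().split():
            # Laplace smoothing so unseen words don't zero out the score.
            score += math.log((counts[label][word] + 1) / (n + vocab))
        scores[label] = score
    return max(scores, key=scores.get)

emails = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting notes attached", "ham"),
    ("lunch tomorrow with the team", "ham"),
]
counts, totals = train(emails)
print(classify("free prize money", counts, totals))       # spam
print(classify("team meeting tomorrow", counts, totals))  # ham
```

Real filters use far richer features (senders, links, headers), but the shape is the same: labeled examples in, predictions out.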
DAVID MALAN: And is that what
is happening if you use Gmail,
and you click on an email
and report it as spam,
like, you're training Gmail to
get better at distinguishing?
BRIAN YU: Yes.
You can think of that as a form of
reinforcement learning, the computer
learning from experience.
DAVID MALAN: Good boy.
BRIAN YU: You tell the
computer that it got it wrong.
And it's now going to try and
learn to be better in the future
to be able to more accurately
predict which emails are spam
or not spam based on what you tell it.
And Gmail has so many
users and so many emails
that are coming to the inbox every
day that you do this enough times.
And the algorithm gets pretty good at
figuring out whether an email's spam
or not.
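The report-as-spam feedback loop David asks about can be pictured as a tiny incremental update: fold a flagged message's words into the spam-side evidence so the next prediction shifts. (A toy sketch; Gmail's actual pipeline is not public, so everything here is invented.)

```python
from collections import Counter

# Toy word counts a filter might have learned so far.
counts = {"spam": Counter({"free": 3, "prize": 2}),
          "ham": Counter({"meeting": 4, "notes": 2})}

def report_spam(text, counts):
    """User clicked 'report spam': count this message's words as spam evidence."""
    words = text.lower().split()
    counts["spam"].update(words)
    for w in words:
        # If the filter had previously credited a word to ham, discount it.
        if counts["ham"][w] > 0:
            counts["ham"][w] -= 1

report_spam("free crypto prize", counts)
print(counts["spam"]["crypto"])  # 1
print(counts["spam"]["free"])    # 4
```

Done across millions of users, this kind of update is what lets the filter keep improving.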
DAVID MALAN: It's a little creepy that
my inbox is becoming sentient somehow.
OK, so if there's supervised
learning, I presume
there's also unsupervised learning.
Is there?
BRIAN YU: Yeah, there absolutely is.
So supervised learning
requires labels on the data.
But sometimes, the data
doesn't have labels.
But you still want to be able to take
a data set, give it to a computer
and get the computer to tell you
something interesting about it.
And so one common example of this is
for when you're doing consumer analysis,
like, when Amazon is trying to
understand its customers, for instance,
Amazon might not know all the
different categories of customers
that there might be.
So it might not be able to
give them labels already.
But you could feed a whole bunch
of customer data to an algorithm.
And the algorithm could group customers
into similar groups, potentially,
based on the types of products
they're likely to buy, for example.
And you might not know in
advance how many groups there are
or even what the groups are.
But the algorithm can get
pretty good at clustering people
into different groups.
So clustering is a big example
of unsupervised learning.
That's pretty common.
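The customer grouping Brian describes is classically done with k-means clustering. A stdlib-only sketch, on made-up two-feature "customers" (orders per month, average spend):

```python
def kmeans(points, k, iters=20):
    """Plain k-means: assign points to the nearest center, then re-average."""
    # Real implementations use random or k-means++ starting centers;
    # the first k points are fine for this toy example.
    centers = [tuple(map(float, p)) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Index of the closest center, by squared Euclidean distance.
            nearest = min(range(k),
                          key=lambda i: sum((a - b) ** 2
                                            for a, b in zip(p, centers[i])))
            clusters[nearest].append(p)
        # Each new center is the average of the points assigned to it.
        centers = [tuple(sum(xs) / len(xs) for xs in zip(*c)) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Two obvious groups of "customers": (orders per month, average spend).
points = [(1, 10), (2, 12), (1, 11), (9, 90), (10, 95), (8, 88)]
centers, clusters = kmeans(points, k=2)
print(sorted(len(c) for c in clusters))  # [3, 3]
```

Note that nobody told the algorithm what the two groups mean; it only discovers that the points fall into two clumps.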
DAVID MALAN: So how different is that
from just, like, exhaustive search
if you sort of label every customer
with certain attributes-- what they've
bought, what time they've bought
it, how frequently they've bought it
and so forth?
Like, isn't this really just
some kind of quadratic problem
where you compare every customer's
habits against every other customer's
habits, and you can,
therefore, exhaustively
figure out what the commonalities are?
Like, why is this so intelligent?
BRIAN YU: So you could
come up with an algorithm
to say, like, OK, how close together are
two particular customers, for instance,
in terms of how many things that
they've bought in common, for instance,
or when they're buying
particular products?
But if you've got a
lot of different users
that all have slightly different habits,
and maybe some groups of people share
things in common with other groups but
then don't share other characteristics
in common, it can be tricky to be
able to group an entire user base
into a whole bunch of different
clusters that are meaningful.
And so the unsupervised
learning algorithms
are pretty good at trying to
figure out how you would actually
cluster those people.
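The pairwise "how close are two customers" measure David raises does have simple versions, e.g. Jaccard similarity on purchase sets (hypothetical data below); Brian's point is that turning millions of such pairwise comparisons into meaningful groups is the hard part:

```python
def jaccard(a, b):
    """Overlap of two purchase sets: size of intersection over size of union."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

alice = {"laptop", "mouse", "usb_hub"}
bob = {"laptop", "mouse", "monitor"}
carol = {"yarn", "needles"}

print(jaccard(alice, bob))    # 0.5
print(jaccard(alice, carol))  # 0.0
```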
DAVID MALAN: Interesting.
OK.
And so this is true for things I know in
radiology, especially these days, like,
computers can actually not only
read film, so x-rays and other types
of images of human bodies.
They can actually identify
things, like, tumors now,
without necessarily knowing what
kind of tumor they're looking for.
BRIAN YU: Yeah.
So one application of unsupervised
learning is, like, anomaly detection,
given a set of data, which
things stand out as anomalous.
And so that has a lot
of medical applications
where if you've got a whole bunch of
medical scans or images, for instance,
you could have a computer
just look at all that data
and try and figure out which are the
ones that don't quite look right.
And that might be worth
doctors taking another look,
because potentially, there
might be a health concern there.
You see the exact same type of
technology in finance a lot
when you're trying to detect,
like, which transactions might be
fraudulent transactions, for instance.
Out of tons of transactions,
can you find the anomalies?
The things that sort of stand
out as not quite like the others.
And these unsupervised
learning algorithms
can be pretty good at picking out
those anomalies out of a data set.
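At its simplest, the anomaly detection Brian mentions can be a z-score test: flag values too many standard deviations from the mean (made-up transaction amounts below; real fraud systems use far richer features than a single number):

```python
import statistics

def anomalies(values, threshold=2.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    return [v for v in values if abs(v - mean) > threshold * spread]

# Mostly ordinary transaction amounts, plus one that stands out.
amounts = [12.5, 9.9, 14.2, 11.0, 13.3, 10.8, 950.0]
print(anomalies(amounts))  # [950.0]
```

One wrinkle even in this toy: the outlier inflates the mean and the spread, which is part of why real systems model each customer's own history rather than one global distribution.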
DAVID MALAN: So what kind of
algorithm triggers a fraud alert?
Almost every time I try to
use my credit card for work.
BRIAN YU: That one is, I don't really
know what's going on with that.
I know the credit card will often
trigger an alert if you're outside
of an area where you normally are.
But the details of how those
algorithms are working--
I couldn't really tell you.
DAVID MALAN: Interesting.
Common frustration when we
do travel here for work.
OK, so it's funny, as you
described unsupervised learning,
it occurs to me that,
like, 10, 15 years ago when
I was actually doing my
dissertation work for my PhD, which
was, long story short, about
security and specifically,
how you could, with software, detect
sudden outbreaks of internet worms,
so malicious software that can
spread from one computer to another.
The approach we took at the
time was to actually look
at the system calls--
the low-level functions
that software was
executing on Windows PCs
and look for common patterns of
those system calls across systems.
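The system-call idea David describes is often framed as n-gram profiling: remember the short call sequences seen in normal runs, then flag traces full of sequences never seen before. (A simplified sketch with invented call names, not the dissertation's actual method.)

```python
def ngrams(calls, n=3):
    """All length-n windows over a system-call trace."""
    return {tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)}

def looks_anomalous(trace, normal_profile, n=3, cutoff=0.5):
    """Anomalous if more than `cutoff` of the trace's n-grams were never seen."""
    grams = ngrams(trace, n)
    unseen = sum(1 for g in grams if g not in normal_profile)
    return unseen / len(grams) > cutoff

# Profile built from a trace of a healthy process (invented call names).
normal = ngrams(["open", "read", "write", "close", "open", "read", "close"])

print(looks_anomalous(["open", "read", "write", "close"], normal))             # False
print(looks_anomalous(["socket", "connect", "send", "send", "fork"], normal))  # True
```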
And it only occurs to me, like, all
these years later that arguably,
what we were doing in
our team to do this
was really a form of machine learning.
I just think it wasn't very
buzzworthy at the time
to say what we were doing
was machine learning.
But I kind of think I know
machine learning in retrospect.
BRIAN YU: Yeah, maybe.
I mean, it's become so common nowadays
just to take anything and just
tack on machine learning to it to
make it sound fancier or sound cooler
than it actually is.
DAVID MALAN: Yes.
That, and I gather statistics is now
called data science, essentially,
perhaps overstating, though.
So certainly all the rage,
though, speaking of trends
is, like, self-driving cars.
In fact, if I can cite another
authoritative Reddit photo--
and this one I think actually
made the national news.
What is it about AI that's suddenly
enabling people to literally sleep
behind the wheel of a car?
BRIAN YU: Well, I don't think people
should be doing that quite yet but--
DAVID MALAN: But you do eventually.
BRIAN YU: Well.
So self-driving technology is
hopefully going to get better.
But right now, we're in sort
of a dangerous middle ground
that cars are able to do more
and more things autonomously.
They can change lanes on their own.
They can maintain their
lane on their own.
They can parallel park by
themselves, for example.
The consumer ones, at least,
are certainly not at the place
where you could just ignore
the wheel entirely and just
let them go on their own.
But a lot of people are almost
treating cars as if they can do that.
And so it's a dangerous time, certainly,
for these semi-autonomous vehicles.
DAVID MALAN: And it's funny
you mentioned parallel parking.
In a contest between you and a computer,
who could parallel park better,
do you think?
BRIAN YU: The computer would
definitely beat me at parallel parking.
So I got my driver's
license in California.
And learning to parallel park is
not on the California driving test.
So I was not tested on it.
I've done it maybe a couple times
with the assistance of my parents
but definitely not something
I feel very comfortable doing.
DAVID MALAN: But I feel like when I
go to California and San Francisco
in very hilly cities, it's
certainly common to park diagonally
against the curb so not parallel, per
se, partly just for the physics of it
so that there's less risk of cars
presumably rolling down the hill.
But I feel like in other
flatter areas of California,
I have absolutely, when
traveling, parallel parked.
So, like, how is this not a thing?
BRIAN YU: [LAUGHS] I mean, it's definitely common.
People do parallel park.
It's just not required on the test.
And so people invariably
learn when they need to.
But pretty soon after I
got my driver's license,
I ended up moving across the country
to Massachusetts for college.
And so once I got to
college, I never really had
occasion to drive a whole lot.
So I just never really
did a lot of driving.
DAVID MALAN: I will say,
I've gotten very comfortable
certainly over the
years, parallel parking,
when I'm parking on the
right-hand side of the road,
because, of course, in the
US, we drive on the right.
But it does throw me if
it's like a one-way street,
and I need to park on the left-hand
side, because all of my optics
are a little off.
So I can appreciate that.
So a self-driving car, like a Tesla, is
like the oft-cited example these days.
Like, what are the inputs to that
problem and like, the outputs,
the decisions that are being made by
the car just to make this more concrete?
BRIAN YU: Yeah.
So I guess the inputs are probably
at least two broad categories--
one input being all of the
sensory information around the car
that these cars have so many
sensors and cameras that
are trying to detect what
items and objects are around it
and trying to figure all of that out.
And the second input being presumably
a human-entered destination
where the user probably is typing into
some device on the computer in the car
where it is that they
actually want to go.
And the output, hopefully is
that the computers or the car
is able to make all of the decisions
about when to step on the gas,
when to turn the wheel,
and all of those actions
that it needs to take to get you
from point A to point B. I mean,
that's the goal of these technologies.
DAVID MALAN: Fascinating.
It would just really frighten
me to see someone on the road
not holding the wheel of the car.
This is maybe a little
more of a California thing.
Though, other states are
certainly experimenting with this.
Or companies in various states are.
So my car is old enough that I don't
so much have a screen in the car.
It's really just me and
a bunch of glass mirrors.
And it still blows my
mind in 2019 when I
get into a rental car or friend's
car that even just has the LCD
screen with a camera in the back that
shows you, like, the green, yellow,
and red markings.
And it beeps when you're
getting too close to the car.
So is that machine learning when it's
detecting something and beeping at you
when you're trying to
park, for instance?
Well, I guess you wouldn't know.
BRIAN YU: [LAUGHS] My guess is
that's probably not machine learning.
It's probably just a
pretty simple logic of,
like, try and detect what the
distance is via some sensor.
And if the distance is
less than a certain amount,
then, beep or something like that.
You could try and do it
using machine learning.
But probably, simple heuristics are
good enough for that type of thing--
would be my guess.
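The "simple logic" Brian guesses at really can be plain conditionals, with no learning anywhere; something in this spirit (the distance bands here are invented):

```python
def parking_beep(distance_m):
    """Map a rear-sensor distance to a beep pattern: plain if/else, no learning."""
    if distance_m < 0.3:
        return "solid beep"
    elif distance_m < 0.6:
        return "fast beep"
    elif distance_m < 1.0:
        return "slow beep"
    else:
        return "silent"

print(parking_beep(0.8))  # slow beep
print(parking_beep(0.2))  # solid beep
```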
DAVID MALAN: So how should people
think about the line between software
just being ifs and
else ifs and conditions
and loops, versus,
like, machine learning,
which kind of takes things up a notch?
BRIAN YU: Yeah.
So I guess the line comes when it would
be difficult to formally articulate
exactly what the steps should be.
And driving is a complicated enough
task that trying to formally describe
exactly what the steps should be
for every particular circumstance
is going to be extraordinarily
difficult, if not impossible.
And so then you really
need to start to rely
on machine learning to be able
to answer questions, like,
is there a traffic light ahead of me?
And is the traffic light green or red?
And how many cars are ahead
of me, and where are they?
Because those are questions
that it's harder to just program
a definitive answer to
just given, like, all the
pixels of what the sensor of
the front of the car is seeing.
DAVID MALAN: So is that also true
with this other technology that's
in vogue these days of these
always listening devices,
so like Siri and hey, Google and Alexa?
Like, I presume it's
relatively easy for companies
to support well-defined commands, so
a finite set of words or sentences
that the tools just need to understand.
But does AI come into play or
machine learning come into play
when you want to support
an infinite language,
like English or any
other spoken language?
BRIAN YU: Yes.
So certainly when it comes to
natural language processing,
given the words that I have spoken, can
you figure out what it is that I mean?
And that's a problem
that you'll often use
machine learning to be able to try and
get at some sense of the meaning for.
But even with those
predefined commands, if you
imagine a computer that only supported
a very limited number of fixed commands,
we're still giving those
commands via voice.
And so the computer still needs to
be able to translate the sounds that
are just being produced in the air
that the microphone is picking up
on into the actual words that they are.
And there's usually machine
learning involved there too,
because it's not simple to be
able to just take the sounds
and convert them to words,
because different people speak
at different paces or have
slightly different accents
or will speak in
slightly different ways.
They might mispronounce something.
And so being able to train
a computer to listen to that
and figure out what the words
are, that can be tricky too.
DAVID MALAN: So that's pretty similar,
though, to handwriting recognition?
Is that fair to say?
BRIAN YU: Probably.
You could do it a similar way where
you train a computer by giving it
a whole bunch of sounds and what they
correspond to in getting the computer
to learn from all of that data.
DAVID MALAN: And so why is it
that every time I talk to Google,
it doesn't know what
song I wanted to play.
BRIAN YU: [LAUGHS] Well, this technology
is definitely still in progress.
There is definitely a lot of room
for these technologies to get better.
DAVID MALAN: Those are diplomatic.
BRIAN YU: [LAUGHS] I mean,
Siri on my phone, half the time
it doesn't pick up on exactly
what I'm trying to ask it.
DAVID MALAN: Oh, for me, it
feels even worse than that.
Like, I can confidently set timers,
like set timer for three minutes
if I'm boiling some water or something.
But I pretty much don't use it
for anything else besides that.
BRIAN YU: Yeah, I think timers I can do.
I used to try to, like, if I
needed to send, like, a quick text
message to someone, I
used to try and say, like,
text my mom that I'll be at
the airport in 10 minutes.
But even then, it's very hit or miss.
DAVID MALAN: Well, even
with you the other day,
I sent you a text message verbally.
But I just let the audio
go out, because I just
have too little confidence in
the transcription capabilities
of these devices these days.
BRIAN YU: Yeah.
Like, the iPhone, now, will try to,
like, transcribe voicemails for you.
Or, at least, it'll make an
attempt to so that you can just
tap on the voicemail and see
a transcription of what's
contained in the voicemail.
And I really haven't
found it very helpful.
But it can get like a couple of words.
And maybe I'll get a general sense.
But it's not good enough for me to
really get any meaning out of it.
So it's back to listening to the voicemail.
DAVID MALAN: See, I don't know.
I think that's actually a use case
where it's usually useful enough for me
if I can glean who it's from,
or what the gist of the message is,
so then I don't have to actually
listen to it in real time.
But the problem for me when
sending outbound messages
is, I don't want to look like an idiot.
They were completely incoherent,
because Siri or whatever technology
is not transcribing me correctly.
OK.
But the dream I have, at
least-- one of my favorite books
ever was Douglas Adams'
Hitchhiker's Guide to the Galaxy,
where the most amazing
technology in that book
is called The Babel fish, where it's a
little fish that you put in your ear.
And it somehow translates
all spoken words
that you're hearing into your
own native language, essentially.
So how close are we to being able
to talk to another human being who
does not speak the same language but
seamlessly chat with that person?
BRIAN YU: I think we're
pretty far away from it.
I think Skype has this feature or,
at least, it's a feature that they've
been developing, where they try
to approximate a real-time translation.
And I think I saw a video--
DAVID MALAN: I can't even talk
successfully to someone in English
on Skype.
BRIAN YU: Yeah.
So I think the demo is pretty good.
But I don't think it's, like,
commercially available yet.
But translation technology
has gotten better.
But it's certainly still not good.
One of my favorite types of YouTube
videos that I watch sometimes
are people that will, like, take a
song and their lyrics of the song
and translate it into another language
and translate back into English.
And the lyrics just
get totally messed up,
because this translation technology
can approximate meaning.
But it's certainly far from perfect.
DAVID MALAN: That's kind of like
playing operator in English where
you tell someone something, and
they tell someone something,
and they tell someone-- someone.
And by the time you
go around the circle,
it is not at all what
you originally said.
BRIAN YU: Yeah, I think I played
that game when I was younger.
I think we called it telephone.
You call it operator?
DAVID MALAN: Yeah.
No, actually we probably
called it telephone too.
Was there an operator involved?
Maybe you call operator
if you need a hint.
Or maybe--
BRIAN YU: I don't think you
got hints when I was playing.
DAVID MALAN: No, I think we had a
hint feature where you say, operator.
And maybe the person next to you
has to tell you again or something.
Maybe it's been a long time
since I played this too.
Fascinating.
Well, thank you so much
for explaining to me
and everyone out there a little
bit more about machine learning.
If folks want to learn more about ML,
what would you suggest they google?
BRIAN YU: Yeah.
So you can look up basically any of the
keywords that we talked about today.
You could just look up machine learning.
But if you wanted to
be more specific, you
could look up reinforcement learning
or supervised learning or unsupervised
learning.
If there's any of the
particular technologies,
you could look those up specifically,
like handwriting recognition
or self-driving cars.
There are a lot of resources
available from people
who are talking about
these technologies
and how they work, for sure.
DAVID MALAN: Awesome.
Well, thanks so much.
This was Machine Learning
on the CS50 podcast.
If you have other ideas for topics that
you'd love for Brian and me and the team
to discuss and explore, do just drop us
an email at podcast@cs50.harvard.edu.
My name is David Malan.
BRIAN YU: I'm Brian Yu.
DAVID MALAN: And this
was the CS50 Podcast.