- All right, welcome, everyone.
First lecture of CS 287,
Advanced Robotics.
Let me start by introducing
the course staff,
at least, whoever is here.
I'm the professor of the class, Pieter Abbeel.
I saw Ignasi.
Ignasi, where are you?
Over here, wanna stand up for a moment,
people can see you.
So Ignasi's one of the GSIs.
And Laura Smith in the back.
Do you wanna stand up for a
moment so people can see you?
That's Laura.
And Harry is over there in the back.
So those are the three GSIs,
and we look forward to an
exciting semester with you.
First important thing is there's
a webpage for the class,
and that's the URL.
And let's step through that right now
so you can see what's on there
and get a little bit of
an idea for the structure
of the course.
- [Student] Can you leave the link up
for a little while, maybe?
- Oh, okay, sure.
Here it is.
And then, ah, wrong thing.
So what do we have on the webpage?
We have the information
I already told you,
who is teaching the class, when it's held.
Then, I show also links to past offerings
if you wanna get an idea of
what we're gonna cover now.
There's gonna be a good amount of overlap,
so you can know ahead of time a little bit
what this is gonna be about.
Office hours will start next week.
We'll post them on here.
We'll also post them on Piazza.
You should sign up for
Piazza, because essentially,
all of our communication
that's not happening in
lecture will be through Piazza.
As you can see, our announcements.
We have two announcements here,
but the second one's already saying
there will be no more
announcements on this webpage.
All future announcements
will be on Piazza.
Then, let's see.
Here is the schedule in terms of, I guess,
everything that's not lecture.
So there'll be five assignments,
and you can see on what
topics they're going to be.
And you can see when they'll go out,
when they'll be due.
There'll be a final project.
The proposal will be due early November.
There's a stretch here
where there's three weeks
between assignments rather than two.
And so, that's when your
final project proposal
will be due.
There will be a midterm.
This is new.
We've never had a midterm in this class.
I'm very excited about it.
I hope you're equally excited.
(laughing)
I'll say a little more
about it in a moment.
Then, there'll be final
project presentations
during, well, this week in December.
We'll figure out the exact timing
as the semester progresses.
And then, final project reports,
oh, I think it says Monday,
but it's actually a Sunday.
So it should say Sunday at 12:15,
final project reports are due.
Let me double-click on the midterm thing
because that's maybe the
most surprising thing.
So here's the thinking.
What we cover in the class is gonna be
a lot of very fundamental
intuitions and derivations,
and they're the kind of things that,
if you really know them,
we believe you'll be
better off in the future.
For example, if somebody asks you,
"Can you write out for me
what the LQR derivation is?"
and that is something
you're capable of doing,
that will actually help you.
Or, if somebody says,
"Can you explain it to me
"a policy grade,"
you can just go on the whiteboard
and write it out, this is how
you derive policy gradient.
And so, the midterm is
not really something
to stress out about.
We're actually gonna give you
the questions ahead of time.
We're gonna give you the
answers ahead of time.
And this is gonna show you
exactly what we care about.
Now, we're gonna give you
probably 20 questions,
and we're only gonna
randomly sample a few of them
because, you know, it's a lot
to write out 20 pages of answers.
But the idea is that
every midterm question
will be roughly a half-page
to a page of derivation,
actually given to you ahead of time,
that we think is so fundamental
that you should just know it,
and not something where you'd have to say,
oh, somebody asked me about policy gradient,
I need to go look up some notes
to remember what it is.
And right now, maybe, for some of you
that sounds like,
"Oh, I can always go look things up,"
but think about it.
If somebody asked you,
"What is a derivative?"
you wouldn't have to go back
to your high school textbook
and say, "Oh, it's like
plus-delta, minus-delta,
"divide, and then, actually,
there's this other way
"to do it, and a gradient is
a vector of those things."
You just know that.
And it's a building block you can reuse.
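To make that example concrete, the definitions being gestured at (the "plus-delta, minus-delta, divide" version, the symmetric alternative, and the gradient as a vector of partials) can be written out in one line:

```latex
f'(x) \;=\; \lim_{\delta \to 0} \frac{f(x+\delta) - f(x)}{\delta}
      \;=\; \lim_{\delta \to 0} \frac{f(x+\delta) - f(x-\delta)}{2\delta},
\qquad
\nabla f(x) \;=\; \left( \frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n} \right)^{\!\top}
```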
And so, we're gonna take
a bunch of building blocks
that we feel are important enough
for you to be able to just do on the spot,
and you'll do them on
the spot on the midterms.
But you'll know exactly what's coming.
If you're good at memorization, it's easy.
But the derivations are set up in a way
that they should be very logical.
So, ideally, you'll
actually reason through them
as you study them, rather
than just, you know,
photocopy them into your head,
and then, photocopy them back out.
Since the questions are
given ahead of time,
we'll figure out the
schedule of the midterm later
in the semester.
We'll see how many people are here,
how big a room we need
to have and so forth.
Because it's not like,
if we have two sessions,
that we need to worry about
anybody leaking the questions
because we already leaked the questions
ahead of time to you.
(audience laughing)
So we'll figure it out.
And so, well, if you are
traveling that entire week,
let us know early so we're aware.
But, as long as you're not traveling
that entire week, things should be fine.
There should be a way to make it work,
hopefully, in two or three sessions,
get all of you in.
Any questions about that?
Because that's definitely the
newest part in the course.
Yes?
- [Student] Would that be
closed notes? Because it's,
- Absolutely, closed notes,
I like that question.
(laughing)
With a 20-page cheat sheet, yeah,
you could just hand in three
of your cheat sheet pages.
(audience laughing)
Absolutely.
But it's good you asked the question.
Completely closed notes.
Any other questions?
Okay, so the homeworks will typically
be a mix of you deriving, maybe,
an extension of something
we covered in class.
And that might be like 10%
to 20% of the homework,
and then, 80% or so will
be implementing things
that we covered in class
in some simulated robotics environments.
But for your final
project, you're, of course,
encouraged to also work with real robots.
But, for the homework,
it'll all be in simulation
whenever we have things
that need to be run.
These are, approximately, the topics.
It might change just a little bit
as we go through the semester.
Then, the assignment policy is,
you can collaborate with other students
in the sense that you
are allowed to discuss
what you're doing, what you're working on,
what works, what doesn't work.
But you must write your
own code completely
and you must write up your
own solutions completely.
So all code and all
writeup has to be your own
and you have to be able
to do it independently.
You can't just sit next to somebody
and type it up, copying
what they wrote up.
You have to be able to go sit on your own
and write up your report.
Late assignments, so you have a total
of seven late days for the semester.
So if anything comes
up, you need extra time,
I don't know.
I can imagine, maybe, you
have a conference deadline
or something and it needs all your time
for finishing your paper,
and then, you need a few late days.
That's fine.
You have up to seven.
You can use it however you want.
You don't need to let us know.
But we'll keep track of it.
If you go over,
you'll lose 20 points
on that homework per day you go
beyond the seven days total.
So going five days over is not useful.
Then, also, we'll drop the lowest.
So I know that many of you
have research deadlines
and so forth, and there'll
be a time in the semester
where you'll just need to
get your research paper out.
Or, maybe, you just have
bad luck with the homework.
You thought you solved it
but you actually
completely didn't solve it.
So we'll only score you
on the four highest-scoring homeworks.
Then, final project.
The idea here is that you,
I think I can make this
a little bigger, maybe.
The idea here is that it could
be either of the following.
A theoretical or algorithmic contribution
that extends the current state of the art,
or you could do an implementation
of something we covered in class,
but with real-world data
or, ideally, real robots, and investigate
if it can actually work on
a real robot beyond what,
maybe, you did in a
homework in simulation.
We can help you with
brainstorming project topics,
or you can come up with your own.
Either way is fine.
One or two students per project.
The proposal deadline is November 6th,
then project presentations are
the week of December 9th to 13th,
and then, the final paper is
due on December 15th.
There's no late days
allowed for project proposal
or for a final project
report or for presentations.
They'll all need to be in those timelines.
Prereqs.
Generally, familiarity
with mathematical proofs,
probability, linear algebra, an ability
to implement algorithmic ideas in code.
One thing that we're changing
is the previous edition
of the class was four years ago.
At the time, MATLAB was very popular.
Today, NumPy is very popular.
So the code that you'll
write will be in NumPy.
At least the starter code we'll give you
will be in NumPy.
If you wanna later convert it
into something else, that's fine.
But, if you just wanna
fill in the starter code
we give you, that's gonna
be all Python NumPy.
Then, in terms of enrollment,
priority goes to PhD students.
If you're an undergrad or masters student,
there are codes we can give
out on a one-on-one basis
based on your record,
which is not just your
absolute record, but also
how it compares with other people
trying to take the class,
and whoever seems like the best fit
given the space that we have.
Class goals.
We want you to understand
the math and algorithms
underneath state of the
art robotic systems.
A lot of them are built on optimization
and probabilistic reasoning.
And these actually have many
applications beyond robotics.
So you'll see the tools used
in the context of robotics,
but you'll be able to reuse them also
in other settings.
You'll implement and experiment
with these algorithms.
In terms of other goals,
we want you to be able to understand
any research papers written
in the space of robotics.
So these are, at least,
the ones that are not on the mechanical side.
So all the ones that are algorithmic.
These are the main venues.
And we want to give you some
time to try out extensions
of your own in the final project.
Grading, five assignments.
Drop the lowest.
Open ended final project's 30%.
Assignments are 60%.
And then, midterm is 10%.
And then, here is the syllabus.
Not gonna go through
all of that right now,
but you can see what we plan to cover.
We'll also have some industry guests.
Like just, one of the things, I think,
that's really starting
to happen in robotics
is that it used to be more
of a pure research field,
but now, it's actually
becoming very practical.
There are many companies being built
that are robotics companies.
And so, we're bringing in a
few industry guest lecturers
to give you some notion
of what's happening there
directly from the people doing it.
So I think our first
industry guest lecture here
is by two former Berkeley postdocs,
Jur van den Berg and Sachin Patil,
who used to work on self-driving
cars at Waymo, Otto, and Uber,
but then started their own company, Ike,
So they're gonna be in October.
Then, Adam Bry, got his PhD, I believe,
in Nick Roy's lab at MIT.
CEO and founder of Skydio,
which is a drone company.
So we'll have him.
And then, Drago Anguelov,
who's the research director of Waymo,
the Google self-driving car effort.
We'll have him in November
and there might be one more.
For some topics, we'll also
have dedicated guest lecturers
who are the world-leading experts
on this particular topic.
So, for particle filters,
we'll have Wolfram Burgard,
one of the three authors of the main book
on probabilistic reasoning for robotics.
So he'll come for that lecture.
And, for Sim2Real, we'll
have Josh Tobin from OpenAI,
who's kinda pioneered that field
in the last few years and really pushed it
to the next level.
And then, some related materials
at the bottom of the website.
Just if you wanna read more
on a topic or feel like,
oh, I don't have enough background on this
and it seems to be assumed
that we have a little more
familiarity with this topic,
here is a bunch of references
that you can go check out to learn more.
Any questions about the
website and plan here?
Yes?
- [Student] For the final project,
is it allowed to overlap
with another course?
- So final projects are okay
to overlap with other courses.
We'll just have higher expectations
because you're,
essentially, using it twice,
so you should do something
twice as impressive, hopefully.
But I would encourage that, actually,
because usually, I mean, ideally,
your final project turns
into a paper you can publish.
And that's a lot of work and
you need to move very fast
or you'll get scooped on
what you're trying to do.
And so, it's better to put
all your eggs into one basket
when you're trying to write a paper,
even though, usually,
people don't say that.
Put it all in one basket,
try to get the paper
out, see what happens,
then move on to the next one.
And so, I would encourage
it if it's allowed
in the other class.
I don't know if it's allowed there.
Any other questions?
Let's see.
Then, we've done this slide.
So now, maybe, let's talk
a little bit about why
you might wanna study this class,
and then, we'll go through some, yes?
- [Student] Question, are the
final projects individual?
Or?
- Final projects, either,
alone or in a group of two.
Either way is fine.
Harry?
- [Harry] Is there a video lecture?
- Is there, oh, are
the lectures recorded?
Yes, they are.
So there's somebody in the
back recording lecture and,
I don't know exactly when they go up,
but, yeah, so it should also alleviate
some of the space needs in this room.
Probably, the lecture goes
up later today or tomorrow.
Not sure.
Yes?
- [Student] What's the
size of the project group?
- Oh.
(audience laughing)
Are we sure my mic is working?
One or two people.
- [Student] Oh, thank you.
- Any other questions?
Okay, so why take this class?
Well, maybe, we can take it one level up.
Why study robotics?
I would say there's two main reasons
you might wanna study robotics.
One, it's a fantastic
test-bed for AI research.
And what do I mean by that?
Well, the real world
is a lot less forgiving
than simulation and games.
And so, things that work really well
in Atari or some other
simulated environment
might just not work all
that well on a real robot.
And so, it can give you
a lot of inspiration
for what's missing from
existing algorithms,
such that they can actually be good enough
to work in the real-world environments.
Another reason you might wanna study it,
also from a research perspective, is that
biology actually provides evidence
that intelligence can emerge
in physical environments,
namely, animals and so forth.
They were physically
interacting with the world,
and then became intelligent
somehow in the process.
And so, it's a good proof of concept.
Whereas, if you have, let's say,
a suite of games, it's less
clear how much intelligence
is needed to solve that suite of games,
and then, less clear how far you can get
by solving those games.
You might have some guesses on that but,
with the real world, you're actually sure
that to be fully functional
in the real world,
you need very advanced intelligence.
Then, the other thing that's
pretty exciting about robotics is,
if you make progress, it
can have a direct impact
in the real world.
And this could be, I don't know,
household robots, running robots,
driving, flying, and so forth.
And, in fact, a lot of the
methods are generally applicable.
Why is it a great time to
study this specific class?
I would say, robotic hardware's
getting in great shape.
So a bit more about that soon,
but the real bottleneck at this point
is in algorithms, math, and programming,
not in the hardware.
And so, this class is
exactly focused on that
and you can unlock a lot
of robotic capabilities
by just programming them better.
There are many, many
different robotic systems,
yet a few core techniques
are near-sufficient
to rule them all.
And so, with the things you learn,
we won't have to study
something very specific
to two-legged locomotion,
and then, something about
in-hand manipulation,
and then, something about flying.
It'll be the same paradigms
that apply across all of them.
What are they?
Optimization, probabilistic
reasoning, and learning.
And here are some of
the kind of environments
we'll look at a lot as
representative environments.
But, really, if you have
another type of robot,
it would probably apply, too.
And then, there's also the applicability
of those three thrusts beyond robotics.
Pretty much, any AI field,
optimization, probabilistic
reasoning, and learning
will play a big role.
Let me circle back to the
hardware thing for a moment.
So here, we have a video
that's now 11 years old.
PR-1 robot doing a lot
of the chores many of us
wish robots could be doing for us.
There's a catch.
This robot is tele-operated.
Eric Berger, one of the
students who built the robot
is actually puppeteering
this robot around in tele-op.
But it shows that, hardware-wise,
this robot is actually capable.
If we just had good enough software,
we could have robots in our homes
doing all these kinds of things.
(audience laughing)
Student projects, right?
Now, you might wonder about the cost.
PR-2, successor of the PR-1, $400,000.
A little expensive, so
not everybody's gonna buy
that right away even
if it is very capable.
But then, Baxter came out
from Rethink Robotics,
$30,000, four years later.
So I used to give talks around that time.
I said, "We just need to extrapolate this
"and by 2017, we should
have another factor of 13,
"so a 3,000 robot available to us."
Wasn't happening.
What happened in 2016
was a $100,000 robot,
(audience laughing)
the price went back up.
I mean, there are reasons for that.
It's a great robot, the Fetch Robot.
But the trend was definitely not
exactly going to a $3,000 robot.
So, actually, three years
ago with a few students,
oh, whoa,
(booming music)
Three years ago with a few students
here at Berkeley and a postdoc,
we decided to take it in our own hands
and we built a robot that
has a bill of materials
of only $3,000 to $4,000
depending on the scale of production.
And that's what you're seeing here.
So right now, we are
hitting, I wouldn't say
the $3,000 mark.
Probably, more at the $5,000 mark.
Once we can produce this at scale,
and producing at that scale is not trivial,
there's a lot to be figured out.
But, in principle, we're getting there.
So we need the algorithms now.
And so, what I wanna do in
the rest of this lecture
is share a few robotic success stories.
Things that have already worked.
Of course, there's a lot more to be done.
But things that have already worked,
and highlight which core
ideas are underneath them
that are in the syllabus of our class,
and then, as we progress
through the class,
we'll understand more and
more of those systems,
first in pieces, and then, the entire thing.
I'd say, probably, the kind
of biggest thing happening
in robotics right now
is self-driving cars.
I mean, the number of
people employed in robotics,
the fraction of them in self-driving cars
is very, very high.
Where this new kind of increased interest
in self-driving cars started was in 2004,
when DARPA organized a challenge.
You had to race your
car, a time-trial race,
in the desert, and the best-performing vehicle,
which didn't really win because
it didn't finish the race,
was a CMU vehicle that
drove 7.36 miles out
of the 150 miles that
the course consisted of.
But, in 2005, they did it again.
In 2005, the Stanford team won.
Actually, five teams finished.
And here's a video from that event.
- [Announcer] The first
time it's ever been done.
Autonomous vehicles--
- [Narrator] A storm
breaks over the desert.
Robots prepare to boldly go when no robot
has even gone before.
(tense music)
(cheering and applauding)
(siren wailing)
- [Announcer] And we have
movement from Stanley,
ladies and gentlemen,
the start of the DARPA Grand Challenge.
- [Spectators] Stanley, Stanley, Stanley!
- Berkeley had an entry, too
- After months
- it was a motorcycle.
- of tireless effort
there's a lot at stake.
- Definitely stood out.
- [Narrator] A vision they all share
- Didn't make it to the finish line.
- [Narrator] will now be put to the test.
(clapping)
- [Announcer] Launch
of the sixth bot here,
DAD are we there yet?
- [Narrator] Each one leaves
the chute with confidence,
a far cry from the first Grand Challenge
where many faltered
within sight of the start
and no robot went beyond seven miles.
During the first eight miles of the race,
Highlander gains two minutes on Stanley.
- So when we see two vehicles
pretty close together,
the one in front is the autonomous one.
The one behind is the follow vehicle
and they have a stop button.
But once you stop, you're done.
- [Narrator] Behind Stanley,
Sandstorm is closing the gap.
- Just in case something really bad
were to happen you can stop.
(tense music)
They didn't stop it anytime soon,
they let it go a little longer
just in case.
- [Narrator] Team DAD's laser
comes unbolted from the roof.
(tense music)
Behind it, Stanley is rapidly approaching.
(tense music)
But Highlander recovers on the flats
and for many miles, they
continue neck and neck.
(tense music)
Five hours after leaving
the starting line,
Stanley now leads the pack
and just five robots remain on the course.
To finish, they must wind through
a treacherous mountain pass and Stanley...
(tense music)
A blue dot appears in the distance.
(cheering)
After driving six hours and 53 minutes
at an average speed of 19 miles an hour,
Stanley is about to become
the first vehicle in history
to drive 132 miles by itself.
(cheering)
- So this was pretty amazing progress
from the best vehicle doing seven miles
to five of them finishing.
It's, of course, better if you finish
a few minutes before the other one
because you'd get a lot more fame.
But then, the next year,
CMU came in with two vehicles
second and third, I believe.
And two years later, there was a new version
of the race, the Urban Challenge,
where now there were other vehicles.
So DARPA had their own drivers
in a city-like environment
mixed with the autonomous cars,
and they had to navigate traffic,
do some parking maneuvers and so forth.
And that one was won by CMU.
Here's some footage from then;
at the time, I was a
PhD student at Stanford.
I'm pretty sure I got it
from Sebastian Thrun's lab.
Sebastian led the Stanford team.
But this is kind of what, at the time,
things looked like for these robots.
Essentially, they had
something called a Velodyne,
which is a LIDAR system that
can look in all directions
to map out a point cloud
of what's around them,
and also semantically
interpret what's around them
as well as having the geometry
directly from the LIDAR
and, based on that, you know,
there'd be some classical control
to control it and rules for it to follow
in navigating traffic.
Now, from there, what happened is,
and most people didn't know this.
I didn't know this.
I mean, essentially,
only people who were in the know knew this.
Google actually employed Sebastian Thrun,
to start a self-driving
car project at Google.
In fact, Sebastian started Google X.
I think, the first project within Google X
was the self-driving car project.
He also recruited people
from the CMU team,
many people from the other teams,
and they all worked together.
And it was kind of a big surprise in 2010
when Google revealed that
they'd been working on this
for several years and had
done over 140,000 miles
autonomously on regular
streets, no less.
So that was a pretty big surprise.
Then, a year later was
the first public talk
and footage about this at IROS 2011.
This was Sebastian Thrun, who
had led the Stanford team,
and Chris Urmson, who had led
the CMU Urban Challenge team.
And this really surprised people
because this is all autonomous,
navigating traffic in
San Francisco, no less,
dealing with pedestrians, stop signs,
other cars, toll plaza,
oncoming big trucks.
Keeping it in autonomous
mode, that's a lot of trust.
So this thing's actually
working pretty well
at that time, 2011.
It's very impressive and a huge surprise
that this was happening.
Then, by July of 2015,
they had a million miles
with 14 minor accidents reported,
none of them the fault of the car.
So, I mean, you can imagine,
if you see a self-driving car,
maybe you're distracted
and you'd bump into it
or just, people do accidents all the time.
And so, a million miles,
14 minor accidents.
It's worth remembering,
though, that actually,
there was a result long
before that in Europe.
Ernst Dickmanns from
Mercedes had highway driving
from Paris to Munich, to Odense.
1,758 kilometers with lane changes
and going up to 140 kilometers an hour.
Longest autonomous stretch: 158 kilometers, in '95.
But at the time, nobody really built on it
the way people started to build
on it after the DARPA challenges.
People have also started
doing pretty crazy things.
So parking.
Parking can be annoying at times.
Maybe, there's not enough space.
So here, Zico Kolter
led this back
when he was still at Stanford,
but he's now a professor at CMU.
This is all autonomous.
(wind blowing)
So these are the kind of
things that, actually,
a robot can do better than a human,
well, most humans, at least,
because you can do much
more precise control,
but it's also very hard because, I mean,
the friction forces here
are not easily predictable.
So you need pretty advanced
control to make this happen.
But they made this happen back in 2010.
Around 2011, '12,
actually, a lot of activity
started to emerge in the
self-driving car space.
This is a map that Comet Labs made
with all of the activity in self-driving.
From the full self-driving efforts, to,
I just wanna build the LIDAR,
or I just wanna build a better camera,
or I just want to build entertainment
for people who are in an
autonomous car, and so forth.
So a lot of effort happening.
Massive market.
People have started to kind
of delineate what it means
to be at a certain level,
though, the definitions are
maybe not always super clear
when you're at two versus
three, and so forth.
But there's an attempt, at least,
to understand what the progress is.
So no automation is Level Zero.
Driver assistance is, maybe,
it'll brake or something.
Partial automation, conditional
automation, high automation.
Essentially, it goes from a notion of,
you have to always be there as a driver,
to you can go to sleep in the car.
Most companies right now
that are working on this
are at two/three.
That's where they're operating,
and trying to push it
out to further levels.
You might wonder how
progress is being measured.
Well, it depends.
One way to measure it is by
the number of miles driven,
on average, before you
need a disengagement.
Because all these cars tend
to have a safety pilot in them,
and that pilot, the human person,
will take over.
And, for Waymo, miles per
disengagement is over 11,000,
GM Cruise, 5,000, Zoox is almost 2,000,
Nuro, 1,000, and so forth.
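As a quick sanity check on those figures: miles per disengagement and disengagements per thousand miles (the unit used on the plot discussed shortly) are just reciprocals of each other. A minimal sketch, using the rounded numbers as stated in the lecture:

```python
def per_thousand_miles(miles_per_disengagement):
    """Convert miles-per-disengagement into disengagements per 1,000 miles."""
    return 1000.0 / miles_per_disengagement

# Rounded California figures as quoted in the lecture.
reported = {"Waymo": 11000, "GM Cruise": 5000, "Zoox": 2000, "Nuro": 1000}
for company, miles in reported.items():
    print(company, round(per_thousand_miles(miles), 2))
```

So Waymo's 11,000-plus miles per disengagement corresponds to roughly 0.09 disengagements per thousand miles, which is where it sits on the plot.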
Now, these numbers are for California.
If you test your self-driving
car in California,
you have to report those numbers.
Some companies don't test in California,
so there wouldn't be any such numbers.
Or they might not do the majority
of their testing in California,
then the numbers might
not be representative.
You can also game these
numbers, obviously.
There are certain types of
driving that are easier.
You could say, I'm gonna just
always drive on the highway.
That's probably gonna be easier
than if you are in city traffic.
And so, you can't
necessarily directly say,
this one is that much
better than that one,
but at least it's representative
of progress being made
and who might be in the lead.
Over the years, this is plotted out here,
the number of disengagements
per thousand miles.
So this is on a logarithmic scale here.
That's the same plot on
a non-logarithmic scale.
And back to the logarithmic scale plot.
Humans are here, so humans
have 10 to the minus five
fatal accidents per thousand miles.
Humans have, roughly,
10 to the minus three
injuries per thousand miles.
And somewhere between 10 to the minus two
and 10 to the minus three crashes
per thousand miles.
And then, here is the Waymo car.
Black is Cruise.
Then the other green is Zoox.
Then up there in purple
is Nissan, and so forth.
So it's a downward trend.
People are getting closer.
But still, quite a gap between
human and autonomous cars.
Some people are very bullish.
Some people say, I mean,
you've probably seen
Elon Musk, for example,
say we'll have a whole fleet in 2020,
something along those lines.
It's a very bullish kind of stance.
Probably, the most bullish.
Other people are more conservative.
Recently, there was an
article in the New York Times
saying that we're still
very far from really
being there,
because it's kind of the long tail
of things that happen in the real world.
It's very hard to deal with.
You don't encounter it very often.
And so, it takes a long time
to get it all covered.
It's hard to predict the future, right?
I mean, who knows who's gonna be right?
But there's definitely a wide
range of opinions right now.
We'll have some industry guest lecturers.
So we'll have Drago Anguelov,
research director of Waymo.
Jur and/or Sachin, co-founders of Ike,
come talk to us about
their thoughts on this.
But their lectures won't
just be like predicting
when it's gonna happen.
They're gonna mostly talk
about some of the ideas
we already covered in class
but how they had to take
them to the next level
to make it work, specifically,
for self-driving cars
or self-driving trucks.
Any questions about self-driving cars?
Yeah?
- [Student] How long do
you think it'll take?
- Oh, I don't know how to
predict that very well.
But, yeah, very hard to predict.
So, I think it depends a
lot on how you frame it.
So, dealing with everything
is very, very hard.
If you say, I just wanna do highways,
and whenever there's something
unexpected on the highway,
I'm gonna have, let's say,
cameras monitoring the
situations on the highway.
If there's construction,
it's not a place the car
gets to go autonomously.
If there's an accident,
anything kind of unexpected, the same;
only when there's regular
traffic will the cars go there.
Then I can imagine some of these companies
have that already solved.
I don't know.
I try not to know because
the way you get to know
is by signing NDAs and
then I can't guess here
what might be happening.
But, if you do it fully
it's gonna be hard.
Because there's just a
lot of unexpected things
that can come up.
But Audi has declared they
have, I think, level four.
Which means the car is responsible.
That they have that coming out now
for highway driving up to,
is it 30 miles an hour,
some speed limit.
So whenever you're kinda
stuck in commute traffic
heavy traffic jam,
they're willing to commit
that the car is actually
the one that's responsible.
That's what you see at level four.
And driver can just let go.
So that might be the closest
of kind of something that's
explicitly coming out
where they're willing to put the car fully
as the responsible entity.
Yeah?
- [Student] How important
do you think the LIDAR is?
- Yeah, huge debate around the need
for LIDAR or not LIDAR.
So a few nice things about LIDAR.
I mean, with LIDAR,
you send out a laser beam,
see how long it takes
to come back, and that tells you
exactly how far away something is,
because you clock the emission:
speed of light times the time elapsed,
and then you know how far away it is.
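The time-of-flight arithmetic being described is a one-liner; one detail worth noting is the division by two, since the clocked time covers the beam's trip out and back. A minimal sketch (the microsecond figure is just an illustrative example):

```python
SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def lidar_range(round_trip_seconds):
    """Range from a clocked laser return: half of speed-of-light times round-trip time."""
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0

# An object about 200 meters out returns the beam after roughly 1.33 microseconds.
print(lidar_range(1.334e-6))  # just under 200 meters
```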
It works at night, which,
cameras have more trouble with.
But it also has issues.
It doesn't see very far out. It's hard to see, let's say, 200 meters out, and when you're driving a heavily loaded truck, you need to see far out. So the technology might need to be improved; 200-meter LIDARs are starting to come out, but is it fully there? Not clear.
How much light do you
really get to reflect back?
If the surfaces are very dark, or specular, like a mirror, you have trouble. A mirror reflects things back really well, but typically in a way that doesn't come back to you: if it's at a slight angle from you, your laser beam hits the mirror and then doesn't come back. More matte surfaces have the tendency that at least some of the laser beam comes back to you. And so highly specular is an issue, and very dark is an issue.
And so some people say well,
given that you can't do
those anyway with LIDAR,
maybe you should just use cameras.
And I guess Tesla is fully betting on the camera-only, well, camera-and-radar solution, whereas it seems pretty much everybody else is betting that it's better to have redundancy: even if you can do it with cameras only, it's nice to have the redundancy of LIDAR.
It's more expensive, of
course, to also have it
but some companies in
that map that I showed
are trying to build cheap LIDAR
such that, it's not necessarily
a big extra expense.
I'd say the jury is out on what's gonna happen.
In principle, humans do
it from just eyesight.
So in principle, I mean,
there is a reasoning
behind the notion that
you only need cameras.
But the counterargument to that is, why not be more redundant if you can? If you have two independent modes of failure, the probability of both failing at the same time is much lower than the probability of either one failing at a given time.
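That redundancy argument is just a product of independent probabilities. With made-up per-sensor failure rates (purely illustrative):

```python
# If camera and LIDAR fail independently, the chance that both fail
# at the same time is the product of the individual failure rates.
p_camera_fails = 1e-4  # hypothetical failure rate per unit of driving
p_lidar_fails = 1e-4   # hypothetical

p_both_fail = p_camera_fails * p_lidar_fails
print(p_both_fail)  # four orders of magnitude rarer than either sensor alone
```

The caveat is independence: a failure cause that hits both sensors at once, like heavy fog, doesn't get this benefit.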
Yes?
- [Student] Can we go back
to the stats picture, please?
- This one?
- [Student] Yeah, it seems like in recent years it's, like, flattening out. What's the problem there?
- So I don't know.
Maybe we can ask Drago
because he's somewhat
in charge of this curve
when he is here.
If I had to guess I can see two reasons,
one is that it's just
getting harder to improve
because the difficult
things don't occur as often.
And so it takes more time to encounter them.
The other thing I can imagine is that
if you think about progress,
if you really think about,
not about showing off,
but you think about I wanna make
as much progress as possible,
you kinda wanna encounter
the need for disengagement
as often as possible.
Because whenever you encounter the need
for disengagement, that
means that your system
was not capable yet,
and you're able to teach it something new.
And so there could be something at play there where people deliberately seek out the hard cases. I mean, you don't want to do highway driving, at some level over here, all the time, because you learn very little. You'd rather do city driving, where you're up there on the curve but you learn a lot more.
I'm not sure.
We should ask our visitors
when they're here.
- [Student] Do weather conditions make self-driving harder? Driving at night, driving on rainy days?
- Yeah, so the suggestion is that weather conditions can affect the ability to drive. Absolutely.
I mean there is, I mean, if it's snowing,
it's much harder to see
things also for humans
but ideally a car could see.
And there's some really cool work
at Carnegie Mellon about
seeing through the raindrops
where they have a special kinda camera
where it's like the
raindrops just disappear
from what you're watching.
Maybe, I don't know if that's
possible with everything.
I mean, take fog.
Maybe you can't see through it.
Maybe other sensory
modalities can get through it
at different wavelengths.
I'm not sure.
Good question.
I'm gonna move to the next topic.
I'm happy to answer more
questions after lecture.
So another thing, actually, that we forgot to emphasize here is the techniques at play: Kalman filtering, optimal control, LQR, mapping, and terrain and object recognition are key things you need for self-driving cars.
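As a taste of the first item on that list, here is a minimal one-dimensional Kalman filter sketch; the process noise Q, the measurement noise R, and the readings are all made-up numbers:

```python
# Minimal 1-D Kalman filter: track a scalar x with uncertainty P,
# fusing noisy measurements z of a roughly constant quantity.
def kalman_step(x, P, z, Q=0.01, R=1.0):
    P = P + Q              # predict: uncertainty grows by process noise
    K = P / (P + R)        # Kalman gain: how much to trust the measurement
    x = x + K * (z - x)    # pull the estimate toward the measurement
    P = (1 - K) * P        # the update shrinks the uncertainty
    return x, P

x, P = 0.0, 1.0  # initial guess and its variance
for z in [1.2, 0.9, 1.1, 1.0]:  # noisy readings of a true value near 1.0
    x, P = kalman_step(x, P, z)
print(round(x, 2), round(P, 2))
```

The estimate moves from the prior of 0.0 toward the readings near 1.0 while the variance P shrinks; real self-driving stacks run the multivariate version of exactly this predict-update loop.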
Let's take a look at
autonomous helicopters.
So we'll dive a lot deeper
into this in a future lecture.
(machine whining)
But what we see here is inverted takeoff.
Hover.
And then it's gonna do
a half roll followed by a half loop
which is called a split S,
it's a way to change direction.
A snap roll, stall turn is
another way to change direction.
Loops, so these are all kinda tricks
that only extremely advanced
helicopter pilots can do.
Loops with a pirouette at the top.
Stall turn coming out backwards, then hurricane, which means a fast backward flying circle. I think it went up to maybe the maximum speed this helicopter achieved, maybe 15 miles an hour.
Inverted hover.
And actually some of the hardest
things are happening now.
It might look less dynamic
but because the helicopter is maneuvering
in its airflow it just generated,
that airflow is much more turbulent
and much more difficult to deal with
and so harder to control the helicopter.
Flips, tic toc, like a clock.
Not a very efficient
way to stay in the air.
And then inverted hover
which is a very efficient way
to stay in the air.
So techniques that go into this,
Kalman filtering,
model-predictive control,
LQR, system ID, and trajectory learning.
And we'll look at all of those
throughout the course of the semester.
We'll also have, this result was actually
from my PhD work at Stanford
11 years ago, so 2008.
Since then there's been
a lot of drone startups.
And we'll have Adam Bry,
founder and CEO of Skydio
present to us later this semester.
Legged locomotion.
It's actually also a DARPA project.
They wanted to see
if robots can traverse
complicated terrains.
Some of the ideas that went into it
shown at the bottom there.
Value iteration, receding horizon control,
motion planning, inverse
reinforcement learning.
Let's take a look at what happens
when you have no learning
in the loop here.
So what does that mean?
The learning here was about
learning a good reward function
that describes what's a good behavior.
And so here it's just gonna plan
against a very simple reward function
which is try to get to the other side
with no cost associated, any pose is fine.
But it doesn't have a good dynamics model
and if you have a bad dynamics model
and your reward function
doesn't keep you away
from treacherous positions,
well you see what happens.
And this is something you'll often see in real-world robotics: even though in theory you might think of reward and dynamics as completely independent things, in practice your dynamics model will not be very precise, and you'll use your reward function to avoid going to places where your dynamics model is not very good.
(class chuckles)
It tried its best.
Now here, here's what happens
when it has been trained
from human demonstrations
what a good reward function could be.
It's using that and you see
even though it's still
slipping and sliding
at times it's never getting stuck
and it's nicely getting across.
Another thing we'll look at,
and Wolfram Burgard, one
of our guest lecturers
is one of the world experts on this,
is mapping and this is some of his work.
Just a moment.
Here is what you get if
you do naive mapping.
What I mean by naive is just kind of a baseline. You have a wheeled robot with two big wheels, and you have odometry data, so you know how much each of your wheels moved.
You know how you've
moved, you have a LIDAR,
and you just keep track of the points
that come back from your LIDAR scan
and you map out based on that
what the building looks like.
You can see that that doesn't really work
and the reason it doesn't
work is because the odometry
from the wheels is not precise.
The wheels slip.
You might not have the exact measurements for how big the wheels are; the tires might be inflated a bit more or a little less, so you don't know the exact diameter, and so forth.
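To see how quickly a small calibration error adds up, here is a back-of-the-envelope sketch; the wheel sizes, tick counts, and the 1% error are all hypothetical numbers:

```python
import math

# Dead reckoning with a slightly wrong wheel diameter: the distance
# estimate drifts in proportion to how far you've driven.
true_diameter = 0.50      # meters
assumed_diameter = 0.505  # off by 1%
ticks_per_rev = 1000
ticks = 200_000           # encoder ticks over a long run (200 revolutions)

true_dist = ticks / ticks_per_rev * math.pi * true_diameter
est_dist = ticks / ticks_per_rev * math.pi * assumed_diameter
print(round(est_dist - true_dist, 2))  # meters of drift from diameter alone
```

And that's before wheel slip, which is not systematic and therefore even harder to calibrate away.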
Here's what happens when you use something called FastSLAM, which knows how to incorporate the LIDAR directly into the positioning and the mapping, rather than just the mapping,
and you see a few different
hypotheses at play here,
different particles in
the particle filter,
But this is from the same data, and this is what actually comes out if you do it right.
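The particle filter at work there can be sketched in miniature. This is a deliberately stripped-down one-dimensional toy, with invented positions, measurement, and noise scale, just to show the weight-and-resample cycle:

```python
import math
import random

random.seed(0)

# Particles are hypotheses about the robot's position along a hallway.
particles = [random.uniform(0.0, 10.0) for _ in range(500)]

z = 4.0      # a range measurement suggesting the robot is near position 4
sigma = 0.5  # measurement noise scale

def weight(p):
    # Higher weight for particles that explain the measurement well.
    return math.exp(-((p - z) ** 2) / (2 * sigma ** 2))

# Weight every particle, then resample in proportion to the weights.
weights = [weight(p) for p in particles]
particles = random.choices(particles, weights=weights, k=len(particles))

mean = sum(particles) / len(particles)
print(round(mean, 1))  # the particle cloud has collapsed around the measurement
```

FastSLAM carries a map hypothesis inside each particle, so the survivors of this cycle are joint pose-and-map hypotheses, which is how the positioning and the mapping get coupled.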
We had a question from here?
- [Student] Yeah, you said
that the learning robot
had like learned from a
previous human interaction.
- Mm-hmm.
- [Student] Some operation to--
- Well, what we did there,
for the legged robot,
we essentially gave it foot placements.
We gave it a bunch of terrains
and we said you wanna now move
this foot to that location
and then it would run motion
planning under the hood
to actually place it there.
And then we repeat.
- [Student] Okay.
- And so it would learn about
high-quality foot placements
through that process
and use that reward function related
to high-quality foot
placements to then plan
on its own on a new terrain
it hadn't been on before.
So it did not get any
demonstrations in the terrain
that we saw it walk over.
That was a test terrain, kept out.
Yes?
- [Student] It seems like a good foot placement reward would lead it away from places where it could possibly get stuck.
- Correct.
- [Student] So it'll find it easier.
- That's exactly what happened, mm-hmm.
Here is a project,
I would say this was
maybe the first project
in recent times that revisited the notion
that maybe we can start building,
or at least in the research lab,
start building towards home robots
that do a bunch of chores
for us, or office robots.
This is the STAIR project.
Morgan Quigley, who has since been a founder of, and is still at, the Open Source Robotics Foundation, also built ROS, which actually came out of this project.
ROS is the robot operating system.
Let's take a look at this video,
here.
Actually, we want more
sound for this one I think.
- [STAIR] I will go get
the stapler for you.
- Oh we missed the beginning.
- [Morgan] STAIR, please fetch
the stapler from my office.
- [STAIR] I will go get
the stapler for you.
- So what you see, the bottom left here
is the map the robot was
making as it's navigating.
Well, actually, it's localizing against a map that was made ahead of time using SLAM.
Navigating to Morgan's office,
it's acquired a strategy
for opening doors.
Now it's scanning to find a stapler.
This is before the success of deep neural nets, this is around 2008, 2009, but it was still possible, if you were looking for specific objects you wanted to find, to build a dedicated system that could find them.
It's picking it up.
Going back.
- [STAIR] Here is your stapler.
(class chuckles)
Use it wisely.
(clicks)
- So, what went into this?
SLAM, which is simultaneous
localization and mapping
to build a map of the
building ahead of time
and then localization against that map.
Motion planning for navigation
and motion planning for grasping,
grasp point selection,
visual category recognition,
speech recognition, and speech synthesis.
We won't cover the last two in this class.
Then, let's see.
Here's a project we did out of Berkeley
a little later.
This is a robot that's
supposed to organize laundry.
Let's see.
Video is a little bit sped up.
I think this video was sped
up 50 times or something.
So at 50 times speed-up, we could fill, I think, two full lectures watching this in real-time. But we're not gonna watch it in real-time, obviously.
But, actually it's super reliable.
This was in 2010.
Jeremy Maitin-Shepard was
a PhD student leading this.
And it reliably folded
(class laughing)
in a very distinct and perfectionist way,
50 towels in a row.
Can you imagine how long
Jeremy was sitting there,
each towel takes about 20
to 25 minutes times 50.
Extremely, extremely dedicated.
The robot also very dedicated.
(class laughs)
It's trying to find a corner,
trying to find another corner
that neighbors the first corner.
The one at the bottom
is not the one you want.
You want a neighboring corner.
And it checks, like,
is there some kind of
twist at the top or not,
let's make sure it's nice and straight.
And then you can go to the other table,
do your folding session.
(class laughs)
Now it has a separate stack for the small towels.
What's underneath?
Localization, motion
planning for navigation
and for grasping, grasp point selection,
and visual recognition
of where the corners are,
whether there's a twist
line, and the shape
when it's folding.
Then a little later, we start looking at
if robots can learn skills on their own
rather than directly programming them.
So these are actually the guided policy search results that Sergey Levine, now a professor here at Berkeley, pioneered.
He pioneered them as a
PhD student at Stanford
in simulated environments
and then came to Berkeley,
started getting it to work on real robots
as a postdoc here and of course,
got many more things working since.
But this was the first, I would say,
learning to manipulate
result in the modern era
of deep learning and show
that it's really possible
in a reasonable amount of time
to learn basic manipulation skills.
And here's a range of skills.
Oh, that didn't play.
A range of skills that was learned.
Here we're watching from
the robot's point of view.
It had been trained to hang coat hangers
and this is a test scene,
the previous one was a training scene.
It's learned to put the block
in the matching opening.
A demo we still run today, pretty often.
Get the claw of the hammer
underneath the nail.
Actually, the learning was very, very fast. If it was just motor skill learning, it could learn a skill in 20 minutes, sometimes even 10 minutes. And if it was visuomotor skill learning, so vision system and control system together, it could learn it in a little over an hour. Which is very, very fast.
We'll see more details about
what's underneath of that.
Here's another thing
that is starting to become possible.
Is to have, kind of
very general approaches.
So what we'll see here
is a policy gradient approach in action.
That was the first kind of big deep reinforcement learning success achieved here at Berkeley. The first deep RL success was the Atari results from DeepMind in 2013.
At the same time here at Berkeley
we're having similar results
in robotics environments
and this was the first
kind of big one there.
John Schulman, Sergey
Levine, Philipp Moritz
worked on this.
And you can actually see that this thing
can learn to run.
But of course, the beauty is
not just that it learns to run,
the beauty is that this
is a general approach.
You can run the same
algorithm on a new robot,
it'll learn to control a new robot.
And so it's a very general
way of solving problems.
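To make the "same algorithm, new robot" point concrete, here is a minimal policy-gradient (REINFORCE) loop on a toy two-armed bandit of my own invention; it is the same style of update rule, just on the smallest possible problem:

```python
import math
import random

random.seed(0)
theta = 0.0   # single policy parameter: the logit of picking arm 1
alpha = 0.1   # learning rate

def p_arm1(theta):
    return 1.0 / (1.0 + math.exp(-theta))  # Bernoulli policy

for _ in range(2000):
    a = 1 if random.random() < p_arm1(theta) else 0  # sample an action
    r = 1.0 if a == 1 else 0.0                       # arm 1 pays off
    # REINFORCE update: step along reward * grad log pi(a | theta).
    grad_log_pi = a - p_arm1(theta)
    theta += alpha * r * grad_log_pi

print(p_arm1(theta) > 0.9)  # the policy has learned to prefer arm 1
```

Nothing in the loop knows what the actions mean, which is exactly the generality being claimed: swap the bandit for a simulated runner and the same estimator applies.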
This takes about two weeks if you were to do it in real-time. So it's not unreasonable. MuJoCo runs faster than real-time, but if you think through, okay, if I were to run this in real-time, what would it take?
Now you might have some
hyper-parameter tuning
and after a few weeks you
go again, again, again.
But once you have the
right hyper-parameters
if the real world is not too much harder
than a simulated world for the scenario
then it would take about two weeks.
And then here's a different robot,
four legged, exact same algorithm running,
no change, and it learns
to control this robot.
This one is based on reinforcement learning, specifically policy gradients and value function approximation.
And the beauty is that you can also
do other things with the same robot.
So here's Ant running fast, actually faster than is realistic, probably; it sometimes finds a bug in the simulator.
(class laughs)
And here the reward is for getting the head close to standing head height: the closer to standing height, the more reward you get. And so sitting is much better than lying on the ground, but standing would be even better. And it also figures that out.
(class laughs)
Very dynamic way of getting up.
At the time, NASA was doing
a project with this robot.
Kind of a crazy robot.
Look at it.
It's just a bunch of bars and cables.
Actually, Berkeley has a project
designing some of these robots also.
They're called tensegrity robots.
And Professor Agogino in
mechanical engineering
is designing similar
robots here at Berkeley.
Now, to control this
robot, the way you do it
is by lengthening or
shortening the cables.
It'll change the shape of the robot,
the center of mass will shift,
it'll tumble, and then you go again.
But it's very hard to get
it to do that reliably.
And so we started working
together with NASA
on this project.
Marvin Zhang, Young Geng were the ones
leading it here at Berkeley.
And we showed it was possible
to actually fully train in simulation,
learn a policy in simulation,
that can then in real-world,
control this robot.
So this is a real-time video.
This is after training simulation testing
in the real world and it works.
So we'll look at that quite a bit
towards the end of the semester.
Sim-to-real aspects: how do you get something that works in simulation to also work in the real world?
Actually, one aside about this robot,
this robot is actually meant
for planetary exploration.
And so you might wonder,
why not just a Mars rover
that's a car, rather than a SUPERBall?
The thinking here is
that, if you have a car,
the size of the wheels might determine
what kind of obstacles you can get over.
And so if your wheels are a certain size,
well the obstacles can only be so big.
If you have a sphere, then the whole robot
is the size of your wheel,
'cause you can roll in
any direction in principle
and this is an approximation of a sphere.
The other thing that's
nice about this robot
is that even though it's big now,
you can lengthen the cables to relax it,
put it in a very small package,
a bit like a tent.
A tent is big when you set it up
and when you package it
to put it in your backpack
or your car it's small.
Same thing's happening here.
And the volume you take up in a rocket
comes at a high price,
so it's nice to be able
to have something that's small volume
when you transport it.
Usually when these, well, robots
land on a different planet
they have airbags to dampen
the shock on landing.
And something like this has some built-in springiness, so it might need less airbag damping when it lands.
Here's another example
I wanna share with you.
This one illustrates imitation learning,
also function approximation
and trajectory optimization.
Trajectory optimization
we'll see pretty soon.
And function approximation also; imitation learning will come a little later.
And we're gonna watch
here is a robot learning
to tie knots from demonstrations.
So what we're gonna have here
is a single demonstration.
And this robot has not
learned anything before.
Like, it starts just with the
imitation learning algorithm.
It'll get one demo of
what it's supposed to do.
And so it's a demo in phases.
There's a snapshot of the scene
then there is some robot motion,
another snapshot of the
scene, some robot motion.
And now it's gonna have to do it on its own for a different initial shape of the rope. It takes a snapshot, then executes the first phase, which is not exactly the same motion as in the demo; the exact same motion would not succeed. But it has an adaptation mechanism, based on the type of function approximation used here, to adapt that motion to the current scene.
Now it takes another snapshot, and again adapts the motion.
And so essentially it has a way of warping old situations onto new situations. So you can imagine the rope is in some shape, the old shape, and now there's a new shape. You can 3D-warp the old shape onto the new shape, and then if you extrapolate that 3D warp to all of 3D space, not just the rope itself, the extrapolated warp can be used to define the new motion of the robot.
And that's exactly what's happening here.
Requires a bit of machinery to do that
but that's the intuition.
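A heavily simplified version of that warping idea, using a one-dimensional affine fit in place of the full non-rigid 3D warp; all the point coordinates here are made up:

```python
# Fit a map from the old rope points to the new rope points, then apply
# that same map to the old gripper trajectory.
old_rope = [0.0, 1.0, 2.0, 3.0]
new_rope = [1.0, 3.0, 5.0, 7.0]  # same rope, shifted and stretched

# Least-squares fit of an affine warp x -> a*x + b (exact fit here).
n = len(old_rope)
mx = sum(old_rope) / n
my = sum(new_rope) / n
a = sum((x - mx) * (y - my) for x, y in zip(old_rope, new_rope)) \
    / sum((x - mx) ** 2 for x in old_rope)
b = my - a * mx

old_traj = [0.5, 1.5, 2.5]                # where the gripper went in the demo
new_traj = [a * x + b for x in old_traj]  # where it should go now
print(new_traj)  # [2.0, 4.0, 6.0]
```

The real system uses a non-rigid registration that can bend space, not just scale and shift it, but the extrapolate-the-warp-onto-the-trajectory step is the same idea.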
More complicated knot.
Double overhand knot.
We actually bought a book
on knot tying for this one
to see if we could do
a wide range of knots.
With object in the scene.
This work was led by John Schulman
who also led the work on the
policy gradients learning
to run that you saw two slides ago.
All right.
Another thing that is
definitely starting to happen
in many places is that well,
when you have robots
learning, unlike children,
like, you have a child,
you teach it something.
You have the next child, you
gotta start from scratch.
Like, you can't just say your brother
or sister already learned
this, now you do it too.
But with a robot you can actually do that. You say, hey, your brothers and sisters already know how to do this; just use their brain and learn together.
And that's what's happening here.
So, actually, Sergey Levine, now a professor here at Berkeley, after he finished his postdoc here, spent a year at Google. He spent that time doing many things, of course, including setting up a large-scale robot learning setup with many, many robots sharing the same brain, collecting data for each other, becoming better and better much more quickly than you can with one robot.
Somebody's pinging every now and then.
Now, a lot of what we talked about
in the robotics setting here
kind of alludes to the home-space.
When I'm looking at manipulation,
let's say, folding towels,
maybe playing with toys,
but it's pretty likely
that in the robotic manipulation space,
the near term commercializations
will not be in the home.
Because the home is
really, really difficult.
Always different for everyone.
And there's people in the mix
and people can make things much harder
than when it's just objects.
The more likely space
we're gonna see a lot of,
kind of, deployment of robots
that are doing manipulation
is manufacturing and logistics.
And so one example many of you may have already heard of: Amazon has this challenge, or had this challenge, where essentially they wanna automate some of the picking in their warehouses. So they put out a challenge, and I have a video here from one of the teams. I think this is the team that came in second, but they had kind of a clean video.
(quirky music)
and so the challenge task here
is can you reliably pick
up one object at a time
and maybe, depending
on which object it is,
move it to the left bin or the right bin.
And that's a really hard problem.
And I think it's deceiving, of course, 'cause in the videos you see things are successful. Just like when somebody demos a self-driving car to you and you look at the website, the car is successful, but nobody actually has a fully self-driving car, even though the videos look good.
Same thing happens here.
It's all about the long
tail of real world scenarios
that makes it hard.
Like, yeah, it might do a hundred picks successfully, but can it do 10 million picks successfully?
It's a whole different challenge.
And so that's a space
where there was a lot of activity.
Not just Amazon, but
other places that say,
okay, this has become
more and more important
to push this forward
and it's simpler than,
you know, organizing a home.
It's just from bin to bin.
But it's still super,
super hard to do this
in a commercially viable way.
Okay, any questions about lecture so far?
Yes?
- [Student] Why suction cup but not hand?
- So, yeah.
I mean, it just turns out that the objects
in that challenge can be picked
with a suction cup it seems.
And if that's possible, all you need to find is a target point; you don't need to find two contact points.
It's a simpler motion.
And I think that's also why,
if you look at manufacturing
and warehousing type environments,
you can put more structure
on to what you do.
You might say, I don't know,
the robot only gets presented with objects
where suction works.
If it doesn't work, we'll
send it to somewhere else.
Or maybe have one robot that has suction,
another robot has gripper,
and depending on the object,
it might be routed one
way or the other way.
And so you have a lot more
control in those environments
and that's why you can kind of
cut off some of that
long tail a little bit,
compared to being a completely
unstructured environment.
Yes.
- [Student] By a long tail, do you mean
things that happen more infrequently?
- So what I mean with long tail, what really matters is a long and heavy tail.
So the notion that if you look at
the distribution over things you encounter
that even though the most frequent thing
is much more frequent than
the least frequent things
there's so many infrequent
things that together
they add up to a lot of
probability mass still.
And the fact that they together add up to a lot of probability mass means you really need to deal with them reliably. It's always something different, but as a fraction of the number of things you see, the tail still happens a lot. You can't ignore it just because it's far out. So really I should say long, heavy tail.
And so the graph I have in mind, to make it more clear, is essentially this: on the horizontal axis you lay out scenarios, and where you put each one on that axis depends on how frequently it occurs.
And so you can kinda see things going further and further out. Something that occurs, I don't know, once every thousand days is all the way out there; once every ten days is out here. But the things that only occur once every thousand days together add up to a lot of probability mass. So you need to be able to deal with them.
I would say that's the big
challenge in self driving cars.
Like, something unexpected happens, then something else, then something else. Because if you had seen it before, you could find a solution to it: either you hard-code a solution for it, or you get more data of that type and train a network to figure it out. But it's the things you've never seen before that you might not generalize to.
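The "adds up to a lot of probability mass" point is easy to check numerically. Here's a toy calculation with a Zipf-like scenario distribution; the scenario counts are invented:

```python
# Zipf-like frequencies: the scenario at rank r occurs proportionally to 1/r.
N = 100_000  # distinct scenarios
freqs = [1.0 / rank for rank in range(1, N + 1)]
total = sum(freqs)

head = sum(freqs[:100]) / total  # the 100 most common scenarios
tail = sum(freqs[100:]) / total  # the 99,900 rare ones

print(round(head, 2), round(tail, 2))
```

Each individual tail scenario is vanishingly rare, yet in this toy distribution the tail as a whole carries more than half the mass, which is the self-driving difficulty being described.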
Yes.
- [Student] I like to see the
robot in the previous slide
is able to, you know, recognize
different types of things
that it's grabbing.
Like, if you're grabbing a glass, you have to move it carefully, right? You might use low strength. But when you're grabbing, like, a stick or a rock, whatever, you could just grab it using strength.
- I don't know the answer for that robot.
I suspect that that challenge did not
have anything that
breaks when you drop it.
I suspect everything is strong enough
to be just dropped.
- [Student] So if it was
something like what I said before,
- Well I think, I mean,
you can try to solve any type of problem.
I mean, whether it's solvable
today is the big question
I would say.
But you're asking the right questions.
It's not just about geometry,
it's also about material,
and it could come up
as you go to something,
if you push into it,
turns out if it's a soft material
you can push into it no problem,
actually makes it easier.
See, grasping something soft
might actually be easier
'cause you can just like go right into it
and grab it as a wad.
Whereas something rigid, you push into it
it'll jump away from underneath
your suction cup or gripper
and so there's different
considerations there
of what makes things hard or easy
and it's at this point, I would say,
not a solved problem.
Because there's just so many axes
of variation to deal with.
- [Student] Yeah because I just saw
that you demonstrated the task
grabbing a bunch of flowers, yeah.
- Oh, with the Blue robots?
So that was done in tele-operation.
So the Blue robot video
was done in tele-op.
Yeah, thanks for asking
that clarification question.
- [Student] I think it's a
pretty elegant work to do that.
Grabbing something really easy to damage.
- I agree, yeah.
- [Student] What happens
today with the Blue robots?
- So what's the status of the Blue robots?
So the status of the Blue robot is that
we are able to build them in house,
but it's a lot of work.
So we can't build them at scale in house.
So we know we have a bill of materials
that's low enough in terms of cost
that we should be able to sell them
for $5,000 per robot.
But we can't build a
thousand of them in house.
So the current months and
next, probably half year
we're trying to figure out
manufacturing processes.
Not necessarily automated,
but outsourced manufacturing
such that it can be scaled up
and we can actually start
selling them for 5,000.
Now there's a lot of challenges there.
Manufacturing is not a, I mean,
I'm sure a lot of you follow Elon Musk
because he does a lot of cool stuff.
And I mean, he has mentioned many times
you know, setting up
manufacturing for Tesla car
is not that easy.
I mean, we don't have to manufacture a whole car, but any manufacturing is actually pretty non-trivial if you wanna do it well. Making sure people build it exactly right, to spec, takes time to figure out.
So hopefully in the next three to six months we can figure that out. More realistically, maybe six to nine months.
And we have some robots at Berkeley.
I mean, if you ever wanna
come and check it out
we can demo them to you, absolutely.
So, today was a first
lecture, intro lecture
to kind of give you an idea of,
you know, what's the space of robotics,
what are the kind of
things we'll understand
throughout the course of this semester.
But most lectures will not be structured
as a sequence of exciting videos.
They'll be very different.
So usually we'll have some
kind of problem setting
that's somewhat mathematical
that underlies a lot of
problems that we've seen today
but now it's more abstract setting.
We'll have some key idea or intuition
of how we might wanna solve
that, go through that.
And then we'll have some
kind of clean derivation
of the algorithm for a simple setting.
And this will form much of the core
of the kind of things
that will be on the pages
that you'll study for your midterm.
It'll be like, okay, what's the simplest derivation of policy gradients, the derivation of LQR, value iteration, and so forth.
And so it won't be
exactly true every lecture
but the idea is that
pretty much every lecture
we'll have one core idea
that we really want to
understand in detail
and for that idea we'll work through it
on the white board or I'll use an iPad
depending on what's easier to kinda read.
Maybe this room, work on
iPad and projecting it there
will be easier than white board to see.
We'll see.
And actually strongly encourage you
to work through those with me.
So I strongly encourage you to bring a pen
and a sheet or two of paper
to kinda write through those derivations
with me because that's
gonna internalize it
a lot more than if you're just watching.
Because in any derivation
going from one line
to the next line it's always simple.
I mean, that's by definition.
A derivation should be such
that every step is simple
but as you write through it,
work through the whole thing,
you're forced to see the whole derivation,
the full picture, and think through,
well, did I write this
correctly, what did I miss?
You learn a lot already by just doing that
compared to when you're watching.
And this will be the main source
of those midterm exam questions.
But again we'll make it
a very specific catalog.
You don't have to ask for every lecture
like, is this one gonna
be one of the questions.
We'll make a catalog
at least a few weeks before the midterm
with things that say,
these are the questions
and these are the answers
and you'll know exactly what they are.
Then typically what we'll do
is we'll take our time to
really get these details right.
Then we'll look at extensions
which will go pretty fast.
So it'll be all kinds of extensions, like: in fact, we did this for the linear case; if you wanna do the non-linear case, here's the tweak you make, and now it works for non-linear. For those we'll go pretty fast
such that throughout the
course of one lecture
we cover a lot of material
but will be kind of a deliberate imbalance
between kinda going slow and very detailed
on the core of the material
and then much faster on the other things.
The other things will be in your homework,
project, and so forth
but usually it's not worth it
to go through everything at
the same level of detail.
And usually we'll repeat just one or two key takeaways from each lecture, a few formulas, that we hope you really understand and are able to derive on your own going forward.
Any questions about this?
Yes.
- [Student] Will these
slides be up before lecture?
- Yes, so slides will go up online
the night before on Piazza.
So they were up last night,
but you did not get an announcement
that there was a Piazza, so, I mean,
it was maybe not
that useful last night.
I think there were 30 people
on Piazza last night,
not the 150 or so
that are in the class.
But they'll always go up the night before.
Yes?
- [Student] So will the assignments be
like application based
or will they be like,
write down the derivation
for this specific topic?
- So, there'll be two types of
questions in the assignments.
A bunch of them will be application based.
So it might be, maybe we
see value iteration in class
and then we might give you an environment,
an MDP which is the environment type
that you use for value iteration
and say okay, now
implement value iteration
and run it on this environment.
Report on the results.
And then we might ask about variation,
we might say, well, what
if you run it this way
or that way, how is the result different?
So that's one type.
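As a rough sketch of what that first type of question might involve, here is a minimal tabular value iteration, written as a hedged illustration rather than the course's actual assignment code. The two-state, two-action MDP (`P`, `R`) below is entirely made up for the example:

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-8, max_iters=10_000):
    """Tabular value iteration for a finite MDP.

    P: transition tensor, shape (S, A, S'), with P[s, a, s'] = P(s' | s, a)
    R: reward matrix, shape (S, A)
    Returns the optimal value function V (shape (S,)) and the greedy policy.
    """
    S, A, _ = P.shape
    V = np.zeros(S)
    for _ in range(max_iters):
        # Bellman backup: Q[s, a] = R[s, a] + gamma * sum_{s'} P[s, a, s'] V[s']
        Q = R + gamma * P @ V
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    policy = Q.argmax(axis=1)  # greedy action in each state
    return V, policy

# Hypothetical two-state, two-action MDP, just for illustration.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # transitions from state 0
    [[0.0, 1.0], [0.5, 0.5]],   # transitions from state 1
])
R = np.array([
    [0.0, 1.0],   # rewards in state 0 for actions 0 and 1
    [2.0, 0.0],   # rewards in state 1 for actions 0 and 1
])
V, policy = value_iteration(P, R)
```

A "variation" question in the assignment could then be as simple as rerunning this with a different `gamma` or stopping tolerance and reporting how `V` and the policy change.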
Another type would be extensions.
Where we say okay, I mean I don't know
if this is gonna be the case or not,
but there might be something where we say
we derived maybe, I don't know,
we derived value iteration
for the canonical case.
The normal case.
But now in your assignment
maybe you have to derive on your own
the maximum entropy
version of value iteration.
And we'll give you some
setup to say okay,
this is the general framework,
this is what you optimize
in the max-ent version of value iteration.
Can you come up with
the equations yourself
that are the updated equations for it?
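To make that concrete, here is one hedged sketch of what such a maximum-entropy ("soft") value iteration could look like: the hard max over actions is replaced by a temperature-scaled log-sum-exp, and the resulting policy is a Boltzmann distribution over Q-values. This is an assumption about the kind of derivation meant here, not the assignment's actual solution, and the MDP is the same made-up example as before:

```python
import numpy as np

def soft_value_iteration(P, R, gamma=0.95, temp=1.0, tol=1e-8, max_iters=10_000):
    """Maximum-entropy (soft) value iteration for a finite MDP.

    Identical to standard value iteration except that the max over
    actions is replaced by a soft max:
        V(s) = temp * log sum_a exp(Q(s, a) / temp)
    As temp -> 0 this recovers the hard max.
    """
    S, A, _ = P.shape
    V = np.zeros(S)
    for _ in range(max_iters):
        Q = R + gamma * P @ V                                # same backup as before
        V_new = temp * np.log(np.exp(Q / temp).sum(axis=1))  # soft max over actions
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    Q = R + gamma * P @ V
    # Max-ent optimal policy: a Boltzmann distribution over Q-values,
    # pi(a | s) = exp((Q(s, a) - V(s)) / temp).
    policy = np.exp((Q - V[:, None]) / temp)
    return V, policy

# Same hypothetical MDP as in the standard value iteration sketch.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.0, 1.0], [0.5, 0.5]],
])
R = np.array([
    [0.0, 1.0],
    [2.0, 0.0],
])
V, policy = soft_value_iteration(P, R)
```

Note the policy is now stochastic: each row of `policy` is a distribution over actions rather than a single greedy choice. (For larger Q-values you would want a numerically stable log-sum-exp; the naive version is fine at this scale.)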
That's it for today.
Your main to-dos are
checking out the webpage
and signing up on Piazza.
And if you have any more questions
just come up front now
or hit us up on Piazza.
Thank you and see you on Tuesday.
(class murmuring)
