I'm researching robotics, and specifically how we can use deep learning to build end-to-end robotic systems directly from raw perception data. We're trying to learn controllers that have no prior understanding of the world: they don't know what a lane marker or a road boundary is, but they can learn all of this just by interacting with their environment inside a virtual simulation engine that we've built.
Most autonomous car companies take a pipelined approach: the first thing they do is split the problem into many sub-problems. One of those sub-problems might be taking all of your sensory data and fusing it together. Once you have a representation of all that data, you need to detect objects in the world: understand where those objects are, what they are, and maybe whether they're moving or static. Then, once you have a representation of where everything is around you, you need to understand where you are with respect to everything else. Only then can you start to plan your path around that world.
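To make that decomposition concrete, here's a minimal Python sketch of how those stages typically chain together. Every function, type, and value here is an illustrative assumption, a stand-in rather than any real company's stack:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Object:
    kind: str                          # "car", "pedestrian", ...
    position: Tuple[float, float]      # (x, y) in the fused frame
    moving: bool

def fuse_sensors(camera, lidar):
    # 1. Combine raw sensor streams into one representation.
    return {"image": camera, "points": lidar}

def detect_objects(frame) -> List[Object]:
    # 2. Find and classify objects in the fused representation (stubbed out).
    return [Object("car", (12.0, -1.5), moving=True)]

def localize(frame, hd_map) -> Tuple[float, float, float]:
    # 3. Estimate the ego pose (x, y, heading) with respect to a map.
    return (0.0, 0.0, 0.0)

def plan_path(pose, objects):
    # 4. Plan a trajectory around everything that was detected.
    return [(pose[0] + i, pose[1]) for i in range(10)]

def drive_step(camera, lidar, hd_map):
    frame = fuse_sensors(camera, lidar)
    objects = detect_objects(frame)
    pose = localize(frame, hd_map)
    return plan_path(pose, objects)
```

Each stage hands a hand-designed intermediate representation to the next, which is exactly the structure the end-to-end approach removes.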
What we're doing here is radically different: we don't split up the problem at all. The entire problem is solvable end to end, from pixels (the raw sensory information) to the steering wheel angle. That means we learn a single machine learning system in the middle that transforms the sensory inputs directly into the control outputs we want the car to produce, without any pipelining.
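A minimal sketch of that single system in PyTorch might look like the following; the architecture, layer sizes, and input resolution are assumptions for illustration, not our actual network:

```python
import torch
import torch.nn as nn

class EndToEndDriver(nn.Module):
    """One learned function from raw pixels to a steering command."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(           # convolutional encoder over raw pixels
            nn.Conv2d(3, 24, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(24, 36, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(36, 48, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Sequential(               # regress a single control output
            nn.Flatten(),
            nn.Linear(48, 64), nn.ReLU(),
            nn.Linear(64, 1),                    # steering wheel angle
        )

    def forward(self, pixels: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(pixels))

model = EndToEndDriver()
frame = torch.rand(1, 3, 120, 160)               # one normalized camera frame
steering = model(frame)                          # shape (1, 1): the control output
```

There are no object lists, poses, or planned paths in between; whatever intermediate representations exist are learned inside the network.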
Our simulation is built from hundreds of hours of driving data. We train the vehicle across all of these driving conditions, and in many different scenarios within each condition. What makes our simulation engine so unique is that it's data-driven: it's built from real data of the road, which means the more data we have, the richer our simulation engine becomes and the stronger our vehicles will be when we transfer them into the real world.
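As a very rough illustration of the data-driven idea, here's a toy simulator whose "world" is just a log of recorded frames. The sideways image shift and the lane-keeping reward are loud simplifications I'm assuming for the sketch, not how our engine actually synthesizes viewpoints:

```python
import numpy as np

class DataDrivenSim:
    """Toy data-driven simulator: the 'world' is a log of real frames.

    A hypothetical sketch. A lateral deviation from the logged trajectory
    is crudely approximated by shifting the recorded image sideways.
    """
    def __init__(self, frames: np.ndarray, max_offset_px: int = 40):
        self.frames = frames               # (T, H, W, 3) recorded camera frames
        self.max_offset_px = max_offset_px
        self.t = 0                         # position along the recorded log
        self.offset = 0.0                  # agent's lateral deviation, in pixels

    def observe(self) -> np.ndarray:
        # Render the agent's current viewpoint from the recorded frame.
        shift = int(np.clip(self.offset, -self.max_offset_px, self.max_offset_px))
        return np.roll(self.frames[self.t], shift, axis=1)

    def step(self, steering: float):
        # Integrate the control, advance the log, reward staying near the lane.
        self.offset += 5.0 * steering
        self.t = (self.t + 1) % len(self.frames)
        reward = 1.0 - abs(self.offset) / self.max_offset_px
        done = abs(self.offset) >= self.max_offset_px   # drove off the data
        return self.observe(), reward, done
```

The key property survives even in this toy version: every observation the agent ever sees is grounded in real recorded road data, so more data directly means a richer simulated world.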
We try to make incremental progress, taking incremental steps toward that full deployment stage. We train on all of these hundreds of hours of data; it takes our autonomous vehicle maybe ten hours in simulation to go through all of it. So it learns the controller, and can then apply it in the real world, within maybe ten hours of training.
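Putting the two sketches above together, a bare-bones training loop in that toy setup might look like this. The REINFORCE-style update and every hyperparameter are assumptions for illustration, not our actual training procedure:

```python
import numpy as np
import torch

# Drive the end-to-end policy inside the toy data-driven simulator and
# nudge it toward actions that kept it on the road.
sim = DataDrivenSim(np.zeros((100, 120, 160, 3), dtype=np.float32))
policy = EndToEndDriver()
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)

for episode in range(100):
    sim.offset, sim.t = 0.0, 0                  # reset the toy episode
    obs, log_probs, rewards, done = sim.observe(), [], [], False
    while not done and len(rewards) < 200:      # cap episode length
        pixels = torch.from_numpy(obs).permute(2, 0, 1).unsqueeze(0)
        mean = policy(pixels)                   # predicted steering angle
        dist = torch.distributions.Normal(mean, 0.1)
        action = dist.sample()
        log_probs.append(dist.log_prob(action).sum())
        obs, reward, done = sim.step(action.item())
        rewards.append(reward)
    # Reinforce the episode's actions in proportion to its total return.
    loss = -sum(rewards) * torch.stack(log_probs).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
```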
I think it's a really interesting research question how we can build autonomous vehicles that take some prior information about the world as input, especially because humans have so much of that prior information built up from years and years of driving. But it's also a very interesting research question to ask how we can learn these control systems from scratch, without any prior knowledge. It's a huge challenge, and one we don't really have a great answer to right now, because driving is such a complicated task. Even though we know that lane markers are incredibly important, we don't yet know the underlying rules and foundational control systems that govern a self-driving car. And that's what we're trying to learn from scratch!
