Okay thank you everyone, this is a very important
occasion.
You're about to witness a defense of the first
student to come through the Robotics PhD program.
I want to give thanks to the faculty who got
together and wrote the documents that allowed
us to create a master's and PhD program.
That happened in May, 2013.
I was at MIT at the time and not here doing
the hard work.
Art Kuo locked Ella Atkins, Dawn Tilbury,
Ryan Eustice, Ed Olson and Ben Kuipers, I
believe, in a room with a set of Google Docs
and said get to work.
They were led through the process, produced
good drafts, and we started the hard work
of pushing things through Rackham and the
College of Engineering.
And, when you create a graduate program in
the state of Michigan, it has to go through
something called the College of University
Presidents at the state level to get approved.
That all got done in record time.
So by January, 2014, the graduate program
was approved.
We stole a few students from departments to
get the thing going, because we were way past
the time to be recruiting them in any other
manner.
And that group included Josh Mangelson, Ross
Hartley, Mia Stevens, and the young lady who
spoke at our groundbreaking, Katie Skinner.
With that, I'll let Ryan Eustice and Ram Vasudevan
introduce Josh.
So this is one of the great parts of the job
of being a faculty member.
You get to come up here and see one of your
students mature and really flourish and be
ready for that next stage of their career.
I can remember when Josh came in as a starry-eyed,
brand new grad student.
He didn't know anything about marine robotics
and everything that it entails.
I think what you'll find today in the thesis
talk is you'll see a really rigorous, very
strong, mathematically-rich thesis.
And his co-advisor Ram Vasudevan has been
really important, I think, in terms of
helping to co-mentor and shepherd Josh through
this.
The one thing that's probably not going to
come through, though, in this talk is that
putting electronics in the ocean is a highly
unnatural act.
[laughter]
So the experimental side of what goes into
the success that you see here, it comes down
to inspecting three-cent o-rings and looking
for nicks in them, knowing how to put the
right amount of lubrication on them and that
kind of stuff so that when you put these robots
in the ocean, really, that you have the opportunity
to do this rich experimentation and get the
data back.
Josh has spent many weeks bobbing up and down
in a boat off the coast of San Diego, looking
at the hulls of big ships and doing this kind
of autonomous ship hull inspection.
Josh got the bug for wanting to stay in marine
robotics and keep chugging along, and it's
really great to see that after he graduates
from Michigan, he's going to go off and do
a post-doc with Michael Kaess at the Carnegie
Mellon Robotics Institute for a year, also
working in this area of marine
robotics, and then he's also going to be accepting
a faculty position at BYU in electrical and
computer engineering, and wants to continue
to pursue working in marine robotics.
With that, Josh, come up and tell us what
you've been doing.
Sure, thanks!
I'd like to thank my committee as well, and
everyone here, actually, because I've had
a lot of people help over the last couple
of years and I've really, really enjoyed being
here.
Okay, so one of the things I'm interested
in is developing reliable autonomous systems
that are able to perform real work in real
world environments.
Within the manufacturing domain, robotic systems
have made a huge difference over the last
several years.
Here we have a video of about 20 robotic systems
that are all working in collaboration with
one another in order to perform a manufacturing
task.
Within this domain, robotic systems have led
to dramatic increases in efficiency, dramatic
increases in safety, decreases in cost.
But you can think of other situations where
robotic systems would be able to make a huge
difference, as well.
If you think about moving robotic teams such
as this out to real-world environments, they
could have an even greater impact on society.
If you think just about our civil infrastructure,
we have large man-made structures such as
dams, bridges, and piers, that are all in
regular need of inspection.
The American Society of Civil Engineers estimates
that we have 90,000 dams in the United States,
and 17% of those have a high hazard potential.
Same thing with bridges: we have 614,000 bridges,
and 9% of those are structurally deficient.
So in order to repair these deficiencies, we
need to be able to perform inspections of all
these structures.
The way inspection is currently carried out
is through the use of divers.
That can be time consuming, it can be expensive,
and it can be dangerous for the divers in
many cases.
Sometimes it can take days for them to cover
these large structures.
But you can think of a situation instead,
where you have a team of 10 robotic vehicles
that are all working in collaboration with
one another in order to automatically perform
inspection tasks such as this.
If you were able to do this, you could more
regularly perform these inspections, more
quickly, and at a cheaper cost for the inspection
task, and you could also provide quantitative
results.
It's difficult for divers to determine whether
they've actually covered the entire structure,
because they are just visually checking as
they swim around to look at it.
But you could build a 3D model of these underwater
environments if you have robotic systems.
But we haven't seen robotic systems like this
be nearly as widely adopted in these environments
as they have been in manufacturing.
Part of the reason is that within these
unstructured domains there is a set of problems
that are hard to solve in a reliable manner,
while within the manufacturing domain those
problems are heavily simplified.
I want to talk about what some of those unique
problems are that exist within these unstructured
environments that don't necessarily exist
with manufacturing.
One of those first problems is called localization
and mapping.
Localization and mapping corresponds to determining
where a robotic system is in its environment
and, if it's a multi-robot team, where the
other robotic systems are in relation to it.
Within the manufacturing domain, this is a
heavily simplified problem because the robotic
systems are actually bolted into the ground.
So we know exactly where they are in space.
They are able to measure with a high degree
of accuracy where their arm is in space.
If you move out into unstructured environments
such as the underwater domain, you need to
be able to estimate where you are in space
as you move and potentially build up a map
of the environment that you're moving through.
So here we have a video of one of the robotic
vehicles I've worked on for the past several
years as it performs an inspection task.
As it moves through the environment, it has
to build up an estimate of where it's been.
This blue line represents its trajectory,
and in order to estimate its trajectory, it's
fusing measurements of its motion and then
also taking measurements of the environment
using cameras and sonar, and then correcting
its estimate of its trajectory as it moves
through its environment based on that.
So as you can see, when it passes an object
it's seen before, you get this red line and
you can see it corrects.
This ability to estimate trajectories as it
moves through space needs to be done really
reliably for a robotic system to be able to
navigate in an uncertain environment.
The reason for that is because if you have
a bad estimate of your trajectory, you can
end up taking a vehicle that looks like this
and make it look like this.
[laughter]
It's really important for us to be able to
estimate where we are in space well, and also
build up a map of our environment in order
to navigate.
A second problem that's really important is
perception.
Perception corresponds to taking in data of
your environment and then determining where
objects are in your environment that you can
then interact with.
In the manufacturing domain, that's also a
heavily simplified problem and they don't
actually have to perform perception, and that's
because the objects they interact with are
on a line that moves them to an exact location
with millimeter accuracy.
It's just preprogrammed into these arms where
to put the bolt, so they don't actually have
to perform perception to determine where the
systems they're working with are, or the things
in the environment that they need to interact
with.
But when you and I navigate in space, we use
our sense of sight, sense of hearing, potentially
our sense of touch in order to determine where
we are based on what objects are around us.
Robotic systems have to do the same thing.
They use sensors such as cameras, lidar, and
sonar when you're in the underwater environment,
in order to perceive where they are in space
and what's around them.
Here we have an example of two images that
were taken by this robotic vehicle as it was
performing an inspection task.
What it's trying to do is match what it's
seen in the past to an image that it's currently
seeing.
If it's able to make a match between those,
then it can either use that information to
correct its estimate of its trajectory, or
identify an object in the environment that
it's trying to interact with.
But we run into complications and difficulties
in this problem because of the complexities
in the environment.
So there are multiple locations in an environment
that look very similar, especially if you're
interacting with manmade structures.
Here we have three images that were all taken
from different locations on the ship hull
that all look very similar.
If you were to match to an incorrect image,
you could end up either drastically affecting
the estimate of your trajectory or interacting
with an object that wasn't the target.
Being able to handle outlier measurements
is also important when we're trying to solve
these problems.
But, how are these problems solved currently?
They're usually solved by using probabilistic
graphical models in order to represent the
uncertainty in the data that we're observing
and we form an optimization problem on top
of that to be able to estimate certain parameters
of the trajectory of the underwater vehicle.
This works really well because it enables
us to model the noise and the uncertainty,
but there are specific cases where this may
tend to fail.
And, as you can tell, these tend to be quite
complex optimization problems; sometimes they
have multiple minima, and if we get caught
in a local minimum they might output something
that could cause a failure.
One of the things that I'm interested in is
reformulating these problems in a way where
we can guarantee that we find a globally optimal
solution or avoid the reliability or robustness
failures.
As Ryan mentioned, for the past couple of
years I've been working on a project where
we have two robotic vehicles that we use to
perform regular inspections of ship hulls.
A couple of times a year, we go out to San
Diego and we'll use these two different vehicles
in order to perform an inspection task on
these ship hulls.
What we're trying to do is we're trying to
split up the work between these agents so
we can more efficiently cover these and perform
these inspection tasks.
Where a diver could take a day or two to
perform an inspection task, these robotic
vehicles sometimes take several hours.
If we need to inspect a ship hull in a short
amount of time, we want to be able to split
that work between multiple vehicles in order
to increase the efficiency.
In order to do that, we have to build up a
map, have each vehicle build up a map of the
ship hull that they're inspecting, and then
merge that into a single consistent map in
real time.
There's been a lot of work that's looked at
how you perform mapping or navigation in this
way.
The existing methods tend to have a certain
set of failure cases.
So my research is mainly focused on how to
resolve some of those failure cases in a way
where we can more reliably perform these
inspections.
So to give you an outline of my presentation:
first I want to talk a little about SLAM,
what some of the state-of-the-art methods
for solving navigation are and their failure
cases, and give an overview of the contributions.
Then I'm going to dive into the details of
a couple of the methods; some of them will
be at a much higher level due to time.
Then I'm going to talk a little at the end
about future work and then also reliable robotic
autonomy.
The way these robotic systems navigate in
space, and the state-of-the-art method
currently used to solve this problem, is
called pose-graph simultaneous localization
and mapping.
What we do is we take a robotic vehicle, and
we take measurements of its environment as
well as measurements of its motion in space,
and use that to estimate our trajectory as
well as build up a map of the environment.
Here we're going to have a robotic vehicle
shown by this blue triangle here, and then
each of these stars represent locations or
landmarks in the environment.
And this blue ellipse represents the uncertainty
of the robotic vehicle, so it represents how
certain the vehicle is of its location; it's
99% sure it falls within this space.
The first thing it does is observe objects
in the environment, and as it moves it's able
to measure its motion. So here the robotic
vehicle has taken a step forward and measured
its motion, shown by this green line here,
and you can tell that the uncertainty has
increased.
The reason for that is because its measurement
is uncertain, it knows it approximately moved
this much.
But now if we're able to observe something
that we've seen in the past, then we're able
to shrink that uncertainty, because we can
use the information of what we've seen before.
As the robotic vehicle continues moving through
space, it adds additional locations that it
tries to estimate and observes more objects,
but if it's able to reobserve something, it's
able to dramatically shrink the estimate.
This is the general idea but the way that
we formulate these problems is using factor
graphs.
I want to step through this one more time
and talk about the probabilistic graphical
model that we use in order to represent problems.
So in a factor graph there are two different
types of nodes. The first is a variable node:
this X1 here corresponds to the position and
orientation of the robotic vehicle at time
step 1. Then, since we have a pretty good
estimate of where we are in space, this factor
node represents a measurement telling us
where we are.
As we step forward in space, we now define
the position of the robotic vehicle at two
different time steps, and we have another
factor, derived from a motion measurement,
that tells us how these two variables should
be related to one another.
If we reobserve something we've seen before,
that becomes another factor, or measurement,
within this graph. Then we continue moving
forward, and now we have a representation of
the problem that we want to solve.
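To make that structure concrete, here is a minimal pose-graph sketch written against the GTSAM library (this is not the speaker's code; the specific poses, keys, and noise values are made up for illustration):

```python
# Minimal pose-graph sketch in GTSAM: a prior factor on X1, motion
# factors between consecutive poses, and one loop closure back to X1.
import numpy as np
import gtsam

graph = gtsam.NonlinearFactorGraph()
prior_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.1, 0.1, 0.05]))
odom_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.2, 0.2, 0.1]))

# Factor telling us where we are at time step 1
graph.add(gtsam.PriorFactorPose2(1, gtsam.Pose2(0.0, 0.0, 0.0), prior_noise))
# Motion measurements relating consecutive poses
graph.add(gtsam.BetweenFactorPose2(1, 2, gtsam.Pose2(1.0, 0.0, 0.0), odom_noise))
graph.add(gtsam.BetweenFactorPose2(2, 3, gtsam.Pose2(1.0, 0.0, 0.0), odom_noise))
# Loop closure: re-observing something seen at time step 1
graph.add(gtsam.BetweenFactorPose2(3, 1, gtsam.Pose2(-2.0, 0.0, 0.0), odom_noise))

# A deliberately noisy initial guess for each pose
initial = gtsam.Values()
initial.insert(1, gtsam.Pose2(0.0, 0.0, 0.0))
initial.insert(2, gtsam.Pose2(1.1, 0.1, 0.0))
initial.insert(3, gtsam.Pose2(2.2, -0.1, 0.0))

result = gtsam.LevenbergMarquardtOptimizer(graph, initial).optimize()
```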
We have a set of positions of the robotic
vehicle, denoted by X, and then a set of measurements
to tell us how they're related to each other.
If we model the probability of each of these
measurements, we can formulate this as a maximum
likelihood estimation problem.
So we want to find the set of positions that
maximizes the likelihood of the measurements.
And if we make certain assumptions, such as
additive Gaussian noise and independent
measurements, then we can formulate this as
a weighted least-squares problem.
So what we have here is a function F that
predicts what our measurements should look
like. For each individual measurement, we
predict what it should look like based on
our current estimate of the poses, and then
we try to minimize the error between what
we predict and what we actually observe.
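Written out, that formulation looks roughly like this (standard SLAM notation; the per-measurement covariances Σ_i are my notation, consistent with the Mahalanobis weighting mentioned later in the talk):

```latex
X^{*} \;=\; \arg\max_{X} \prod_{i} p(z_i \mid X)
      \;=\; \arg\min_{X} \sum_{i} \big\| f_i(X) - z_i \big\|^{2}_{\Sigma_i},
\qquad \text{where } \|e\|^{2}_{\Sigma} = e^{\top} \Sigma^{-1} e .
```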
There's been a lot of work that looks at how
to solve this efficiently and quickly for
large graphs, and at how these methods scale.
But there are some failure cases of this method,
and I want to talk about some of those failure
cases.
The first failure case is when we have outlier
measurements.
If we were to observe an object in space and
it turns out to be a different object than
what we thought it was (so we see this star
here but actually think we see this star),
then we're going to end up with an incorrect
edge in this graph. That corresponds to an
incorrect term in our cost, and it will
drastically affect the estimate of our trajectory.
So these methods tend to fail when you have
outliers.
They also tend to fail if you have a bad initial
guess.
The optimization algorithms used to solve
these are usually non-linear optimization
algorithms, so they take the initial estimate
and find the closest local minimum; if you
get a bad initialization, then there will
be a problem.
They also tend to fail if we have an inaccurate
estimate of the uncertainty after we solve.
After we solve, we want to be able to use
the information we've estimated, and to know
how likely our estimates are to be true we
need to be able to accurately characterize
their uncertainty.
To address these three failure cases, we've
made four contributions.
The first one is a method, published at ICRA,
that shows how we can take two robot maps
that were generated in overlapping areas,
along with a set of potential loop closures,
and merge them into a single consistent map
while handling the fact that we may have a
large number of outliers.
The second one shows how we can align trajectories
with one another when we don't have a good
initial estimate and we don't have high-dimensional
data [inaudible].
That was published at OCEANS.
The third one is a way of framing the general
simultaneous localization and mapping (SLAM)
problem so as to guarantee we can find a
globally optimal solution even if we don't
have a good initialization.
That will appear in ICRA this year.
Then we also have a paper that's under review
for TRO that talks about how we can accurately
characterize the uncertainty of poses after
you solve.
Now I'm going to jump into the first one here,
then we'll continue.
As I mentioned before, especially for man-made
structures, there are multiple locations in
an environment that look similar.
As you can see here, each of these images
was taken at a different location on the ship
hull. But if you're actually over here and
think that you've seen this object over here,
then when you're trying to estimate your
trajectory it's going to try to force those
two locations to be near one another, which
is going to affect your estimate of the trajectory.
To show an example of this, this is the trajectory
of a robotic vehicle that moved through a
building; just adding two outliers into this
graph drastically affects our trajectory.
Things get even more complicated when you
have multiple vehicles that are simultaneously
trying to navigate together.
Each vehicle estimates its own trajectory
as it moves through space, and then we have
a set of potential loop closures, or factors,
that tell us how the trajectories should be
related. But 90% of these may be outliers,
and we want to be able to determine which
of these measurements we should use.
There has been quite a bit of work that's
looked at how we can determine and handle
these outliers when performing simultaneous
localization and mapping.
They tend to fall into two groups.
The first group looks at the size of the
residual error; if it's large, then they say
that measurement must not be likely, so they
throw it out. But that's dependent on your
current initial estimate.
There are also methods that look at consistency:
they look at pairs or sets of measurements
and try to measure their consistency, but
they don't necessarily enforce consistency
across the entire group of measurements that
they select.
So in this chapter, what we do is formulate
this as a combinatorial optimization problem
that, one, doesn't require any initial guess,
and two, enables us to select a set of
measurements while guaranteeing the selected
measurements are pairwise consistent.
Then we talk about how we can transform that
into an equivalent problem to increase
efficiency.
To formalize this a little more: we have a
set of measurements Z that represent potential
measurements relating the two graphs we're
trying to merge, and we want to find a subset
of those measurements, Z~, that we can trust.
The way that we're going to do that is by
looking at pairs of measurements, and checking
out how consistent they are with one another,
and then trying to find the largest set of
measurements that are all consistent.
So we have a set of measurements Z, and we
specifically want to find the largest subset
Z~ where we're constraining every pair of
measurements within Z~ to be consistent.
This consistency metric really depends on
the type of measurements you have, the type
of measurements you're observing. But what
we're going to use is the Mahalanobis distance,
which enables us to take uncertainty into
account when determining how likely these
measurements are.
Josh, how big is this set typically that we're
talking about here?
Yes, for us, we're able to handle 200 or so,
but it depends; when you're in a single-robot
case, you may have more than that. It really
depends on the problem you're trying to solve.
But for when we're trying to merge two robot
maps, a couple hundred at least is about right.
As Jessy alluded to a little bit, this is
a combinatorial optimization problem.
So it's hard to solve efficiently.
What we're going to do is transform it into
an instance of a maximum clique problem, so
we can leverage existing literature in order
to apply existing algorithms to solve it.
The way we do this is we look at the pairwise
consistency of every possible pair of
measurements and build up a consistency matrix
that encodes how likely each pair of measurements
is to be consistent. We then convert that to
a consistency graph by thresholding the matrix;
in this consistency graph, each node represents
a measurement and each edge denotes consistency.
Now finding the largest possible pairwise-consistent
set corresponds to finding the maximum clique
of this graph, so we can apply any of the
existing algorithms to this graph in order
to find the maximum clique.
And once we have this maximum clique, we merge
the two graphs with one another using just
those measurements.
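To make the pipeline concrete, here is a rough sketch (not the released implementation; the consistency callback, the degrees of freedom, and the confidence level are placeholder assumptions standing in for the paper's Mahalanobis check):

```python
# Sketch of pairwise consistency maximization: build a consistency
# graph over candidate loop closures, then find its maximum clique.
import itertools
import networkx as nx
from scipy.stats import chi2

def pcm(measurements, consistency, dof=3, confidence=0.95):
    """measurements: candidate inter-map loop closures.
    consistency(zi, zj): squared Mahalanobis distance for the pair.
    Returns indices of the largest pairwise-consistent subset."""
    gamma = chi2.ppf(confidence, df=dof)   # threshold known a priori
    G = nx.Graph()
    G.add_nodes_from(range(len(measurements)))
    for i, j in itertools.combinations(range(len(measurements)), 2):
        if consistency(measurements[i], measurements[j]) <= gamma:
            G.add_edge(i, j)               # edge = this pair is consistent
    clique, _ = nx.max_weight_clique(G, weight=None)  # max-cardinality clique
    return clique
```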
From an evaluation standpoint, we've evaluated
this using both simulated and real-world data.
In the simulated environment, we took an existing
data set, split it in half, generated inlier
and outlier loop closures between the two
graphs, and created 200 synthetic examples
in that way.
Then we also took an existing data set that
was collected here on North Campus using this
Segway robot as it moved through the environment.
We took two runs of that existing data and
generated inlier and outlier loop closures
using lidar data.
To give you an idea of the color scheme here,
two trajectories for two robotic vehicles
are in blue and green, then we're going to
show false positives in red, false negatives
in pink, and true negatives in grey.
So you want to have grey, you don't want to
have any red or pink.
If you use the existing methods that rely
on residual error to determine which measurements
should be trusted, and they're given a bad
initialization, they disable all of those
measurements.
Methods that look at consistency, such as
single-cluster graph partitioning or RANSAC,
don't necessarily enforce pairwise consistency,
so they can lead to inconsistent graphs,
while our method guarantees that the selected
set of measurements is pairwise consistent.
In addition, both RANSAC and pairwise consistency
maximization have a threshold parameter that
needs to be set, but if we're using the
Mahalanobis distance, then it follows a
chi-squared distribution, so we can set the
threshold a priori.
And our method is much more robust to the
tuning of that threshold, because we're looking
at pairwise, or group-wise, consistency as
opposed to individual measurements. If we
vary that threshold, our method is much less
affected.
[laughter]
So that one leads to the big chunk out of
the...
Yes, that one leads to the sunk robot.
From a quantitative perspective, we also outperform
each of these other methods in terms of translation
and rotation mean square error when trying
to align these graphs.
To conclude this chapter: we've developed
a method that doesn't require an initialization
and guarantees the selected measurements are
pairwise consistent, we've talked about how
you can increase its efficiency, and we've
released the code at this link.
So now we'll move on to this next one, this
is going to be a little bit higher level just
because of time.
The idea here is that in some cases you don't
have enough data; you either don't have the
sensors or you don't have the communication
to directly match an object seen in the
environment by one vehicle to another vehicle.
So you may want to align the vehicles' trajectories
based just on some underlying feature.
It could be the curvature of a nearby object,
such as a ship hull, or the bathymetry of
the sea floor.
What we want to do is align this query trajectory
with a reference trajectory. We're assuming
that two robotic vehicles moved through this
space and measured this feature as they moved
through the environment, and we want to align
the one trajectory with the other.
But we want to do it in a way where we don't
have any initial guess, so we don't want to
assume that we have any initial estimate.
So in order to do this, the problem kind of
gets broken into two steps.
One, we need to develop a convex cost function
so that we can determine how to align the
trajectories without any initial guess.
The way we do that is we take points within
the query trajectory and determine their
distance in feature space to every point
within the reference trajectory. We use that
to build up a point cloud that tells us how
dissimilar points in the query trajectory
are to points in the reference trajectory.
Then, by taking the lower convex hull of that
point cloud, we get a piecewise-linear cost
function which tells us where points in the
query trajectory should align within the
reference. And by summing those costs we get
a convex cost that tells us how the trajectories
should align.
The second part is, once we have this cost
function, we need to enforce that the
transformation we're estimating is an actual
rigid-body transformation.
The special orthogonal set of matrices is
not convex, so we end up with problems.
The way we get around that is to use a linear
approximation of the L2 norm in order to
enforce that the rotation we're estimating
is approximately orthogonal.
So now we have a way of using a linear program
to align these trajectories.
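My rough reading of the resulting problem, with the exact constraint set elided (the piecewise-linear costs c_i come from the lower convex hulls just described; the linear constraints approximating the rotation set are in the OCEANS paper and not reproduced here):

```latex
\min_{R \in \mathbb{R}^{2\times 2},\; t \in \mathbb{R}^{2}}
\;\sum_{i} c_i\!\left( R\,x_i + t \right)
\quad \text{s.t. linear constraints keeping } R \text{ approximately in } SO(2).
```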
Just to conclude this section: we've developed
a method that enables us to align trajectories
with one another based solely on a low-dimensional
feature set, without requiring any data
association or any initialization.
We've also released the code for this.
So now, I'm going to move on to this section
-- does anyone have questions?
What I want to deal with in this section is
finding a way to formulate the SLAM problem
such that we can guarantee we find a globally
optimal solution every single time, without
any prerequisites or conditions on the initial
estimate for the optimization.
So as we talked about before, this is the
cost function of the optimization problem
that's often used to solve the simultaneous
localization and mapping problem.
And this F function is often very non-linear.
So this results in a cost function that has
multiple minima.
To find the maximum likelihood estimate, the
true estimate of our trajectory, we need to
find the globally optimal solution.
But if we're given an initialization that
isn't in the basin of attraction of the true
global optimum, the way these algorithms work
is they descend to the closest local minimum,
so we're actually going to end up finding
a different solution than the true estimate
of our trajectory.
This is important again because we're using
this for navigation, so the globally optimal
solution should be our estimate of our trajectory
in space; if we get caught at a local minimum,
we're either modifying our estimate of the
world or ending up with a trajectory that
doesn't actually represent where we are.
So from a safety perspective, we run into
problems.
There's been recent work over the last couple
of years that looks at how we can resolve
this problem.
The first couple of methods looked at relaxing
the optimization problem to a similar problem
that isn't the actual original problem but
is convex, so we can guarantee we find the
relaxation's globally optimal solution, and
then they initialize the nonlinear optimization
with that solution.
But there's no guarantee that you're finding
the globally optimal solution of the original
problem.
Other work has looked at using Lagrangian
duality to verify whether you have the true
solution once you've already found a candidate,
and most recently there was work showing that,
under certain noise assumptions, we can
guarantee that we've found the globally optimal
solution.
That was SE-Sync, published by David Rosen.
But none of these methods actually formulate
the optimization as a convex optimization
every single time.
So what I want to do is guarantee we can find
a globally optimal solution regardless of
noise assumptions, by truly formulating it
as a convex optimization.
We reformulate the planar SLAM problem as
a polynomial optimization, and then we guarantee
we can find the globally optimal solution
to this polynomial problem regardless of initialization.
In order to do that we have to talk about
how these poses are represented.
The positions of the robotic vehicle are
typically represented as a vector of parameters;
this is the robotic vehicle's position and
orientation at timestep i.
That's often represented using translation
parameters plus angles; in 3D, Euler angles
describe the orientation.
Every single pose is represented in that way.
In this case we're going to assume that our
measurements are also rigid-body transformations,
so they follow the same structure.
But we run into problems because this angle
parameter is not actually a real-valued
parameter; it falls on a manifold.
So things get a little more complicated, and
when we subtract our predicted measurement
from what we actually observed, if the angles
are both near 0 or 2 pi or multiples thereof,
then we run into problems doing the subtraction.
So what we want to do here is actually represent
our orientation using SO2.
So the traditional way of representing pose
is shown here, but what we want to do is actually
use rotation matrices and translation vectors
in our cost.
More specifically, we want to represent our
parameters as real numbers in that space,
so we have Ci, Si, Xi, and Yi.
But in order to ensure this is a valid rotation
matrix, we have to enforce that Ci and Si
lie on the unit circle; enforcing this constraint
makes this a valid rotation matrix.
But then, as we perform the optimization,
we have to enforce this non-convex constraint.
The nice thing, though, is that since we're
using real numbers here, when we measure the
distance between two different angles it
doesn't matter whether we wrap around or not,
because we're measuring that distance in the
equivalent of X and Y.
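In symbols, the parameterization just described is:

```latex
R_i = \begin{bmatrix} C_i & -S_i \\ S_i & C_i \end{bmatrix},
\qquad C_i^2 + S_i^2 = 1 .
```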
What we do is take this new representation
and reformulate the optimization problem as
a maximum likelihood estimation problem using
this representation for pose.
Now the poses we're trying to estimate are
R and t, the rotation matrices and translation
vectors, where we're enforcing that each matrix,
which lies in the real two-by-two matrices,
follows this constraint.
We also represent our measurements in that
way, so while this cost looks a little more
complicated, it's doing the exact same thing
as the previous formulation: this term is
measuring difference in rotation and this
one is measuring distance in position.
So we're taking our existing measurements
and predicting what our poses have to be in
order to minimize the error.
Once we have this formulation, we can plug
in those rotation matrices and translation
vectors and we get a polynomial cost function.
So now we have a cost function over the
parameters C, S, X, and Y, which are real
numbers, with a polynomial cost and quadratic
constraints.
So what we were able to do in this paper is
prove that this cost follows a very specific
structure.
And so we can guarantee we can find its globally
optimal solution by solving a convex semidefinite
program.
Now, by solving the semidefinite program,
we can always find the globally optimal solution
to this polynomial-cost maximum likelihood
estimation problem without any initial guess
of our estimate.
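The generic flavor of that construction is the standard semidefinite lifting of a quadratically constrained problem (a sketch only; the thesis proves the relaxation is tight for its specific structure, which this does not show):

```latex
\min_{y} \; y^{\top} Q\, y
\;\;\text{s.t.}\;\; y^{\top} A_k\, y = 1
\quad \longrightarrow \quad
\min_{Y \succeq 0} \; \operatorname{tr}(Q Y)
\;\;\text{s.t.}\;\; \operatorname{tr}(A_k Y) = 1 .
```

Here y stacks the C_i, S_i, X_i, Y_i variables, each A_k encodes one unit-circle constraint, and the non-convex rank-one condition Y = y y^T is dropped.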
To experimentally validate this, we took
existing data sets and split them into parts,
because our current method doesn't scale,
though we have ideas for how to address that.
With a random initialization, the non-linear
optimization methods get caught at local
minima, while our method is able to find the
globally optimal solution, and it does so
without any initial guess.
From a quantitative perspective we also took
those same datasets, split them up into chunks,
and we outperformed those non-linear solvers.
To conclude this section: we reformulated
the planar pose-graph SLAM and landmark SLAM
problems as polynomial optimization problems,
and then we were able to prove that we can
find the globally optimal solution every
single time.
Any questions on this?
Could you talk about the implications of the
fact that it doesn't scale, and yet you can
down-sample the data set and still have the
optimal solution to the down-sampled problem?
There are two answers to that. You can take
existing large data sets and break them down
into equivalent smaller data sets; that is
one way of trying to solve that problem.
But we also have some ideas of how you can
scale this algorithm.
The structure of the SLAM problem is sparse,
so there's sparsity of the problem that you
can leverage; in addition, that sparsity means
you can break the problem up into smaller
pieces and then guarantee finding the globally
optimal solution later on.
It's something that we're currently working
on.
That will be in your future work.
Yeah.
Has any prior work done the same sort of
formulation, using the S and C and enforcing
that constraint, and just not solved it in
a guaranteed optimal way?
Yeah, there have been some methods that
formulate it as a quadratic program, and they
do solve it, but they relax the problem in
a slightly different way.
Some of the initial methods I mentioned were
solving a relaxed version of the problem to
a global minimum and using that as an
initialization for another problem, for a
non-linear solver, [inaudible] where they're
enforcing something similar.
In the cost function you showed, you had an
unlabeled norm on the translation and rotation
terms, but does it not matter because it's
convex?
I was mainly showing that just to give the
main overview. The norm depends on the
measurement model: usually you have some
estimate of the noise parameters for your
measurement model, and that ties into what
norm you use.
So often we use the Mahalanobis distance,
which is a norm that scales everything by
the inverse covariance.
For the rotation matrices I was showing there,
I think it's a weighted version; usually you
choose that weighting depending on how certain
you are that your measurements are correct.
So that does drop.
When you say scaling, how big a pose graph
are you typically solving?
So these experiments here, I can't remember
exactly, I think there were only 20 to 50
nodes or so, but this problem here, like 400,
but this takes about a minute or so.
There are ways of increasing the scalability
that we're currently working on.
When you say it doesn't scale, just the amount
of time it takes to solve?
Yes, the amount of time it takes to solve,
and then memory can tie into it as well.
Does it scale at all in 3d?
Yes, actually a paper was put on arXiv three
days ago that extends our method.
[laughter]
In some portion, it's only doing rotation,
and it doesn't really solve the entire problem
and they can only handle 10 nodes, but there
are methods that are trying and we're also
interested in that.
Any other questions?
Now I want to move on to the last contribution
here.
As we mentioned before, after we solve the
simultaneous localization and mapping problem,
we extract our current solution, but we also
need a good estimate of how well we know where
the underwater vehicle is.
So if you think about a self driving car that's
driving down the road, it needs to have an
estimate of where it is in the lane, but it
also needs to know how well it knows this.
If it's really certain of its estimate, then
it can use that in order to navigate; if it's
uncertain, or incorrect but thinks it's certain,
it may end up crossing into the other lane.
So it's important to be able to determine
a good estimate of your uncertainty, as opposed
to just finding a solution.
Smith, Self, and Cheeseman published a paper
in 1990 that talked about how you can represent
pose, and specifically how you can represent
the uncertainty of your pose.
They used a vector of parameters and modeled
it as a multivariate Gaussian vector.
By modeling it in this way, they could use
the covariance matrix to track the uncertainty
of their estimate.
In addition, they showed that if you have
multiple poses, you can stack them and model
that as a larger, jointly Gaussian estimate.
And they talked about how you can perform
pose composition, pose inverse, and relative
pose operations on them: how to calculate
uncertainty when composing poses, inverting
poses, or taking the relative pose between
two poses, while modeling that uncertainty.
Being able to model the jointly correlated
nature of these poses is important because
after you solve SLAM, all of these pairs of
poses are correlated: your estimates of them
are derived from the same measurements.
Here we have an existing SLAM solution where
we've extracted poses that are offset from
each other by 50 nodes, colored by correlation:
black is a correlation of one, white is a
correlation of zero.
The majority of these poses are correlated
in some way, and a lot of the correlations
are very high.
This representation is really nice because
it enables us to model that uncertainty, but
it's not able to accurately characterize the
uncertainty of real poses in the real world.
The reason is that if you have a rotation
error early in your trajectory, you're more
likely to end up out here than in the middle,
in here.
So when you're enforcing a multivariate
Gaussian over these parameters in an XY space,
you're not able to use this Gaussian
representation to accurately represent your
estimate of your pose, or to accurately model
your uncertainty.
Tim Barfoot, in 2014, published a paper that
looks at how to represent this pose uncertainty
in a more accurate manner.
He uses the group structure and Lie algebra
of the special Euclidean group to represent
the uncertainty.
He represents a pose using a mean which is
actually in the group, together with a
multivariate Gaussian random variable defined
in the Lie algebra of the group; the mean
value in the group is then perturbed by mapping
that random variable through the exponential
function.
The nice thing about this exponential function
is that it handles the non-linearity of the
optimization.
So it's R6 [inaudible]...
Yes, it's R6, that's the dimension.
If you think about it in 3D, going back to
this representation, it would actually be
X, Y, Z, and then R, P, H or some other
combination [inaudible].
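In symbols, using Barfoot's left-perturbation convention (the convention choice is mine; a right perturbation is also common):

```latex
T = \exp\!\left(\xi^{\wedge}\right)\bar{T},
\qquad \xi \sim \mathcal{N}(0, \Sigma),
\qquad \xi \in \mathbb{R}^{6} \text{ for } SE(3).
```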
The nice thing about this is it enables you
to directly characterize that uncertainty;
shown in green here is this parameterization
of uncertainty accurately matching the sampled
poses.
In addition, Tim Barfoot also showed that
if you have two independent poses modeled
in the same way, you can propagate uncertainty
through the pose composition operation using
the equations here.
But he does assume those poses are independent.
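As a toy numerical sketch of that independent composition rule for SE(2) (my own code, not from the thesis; it uses the left-perturbation convention and keeps only the first-order term):

```python
# First-order covariance propagation for composing two *independent*
# SE(2) poses T1, T2, with left perturbations T = exp(xi^) T_bar.
import numpy as np

def adj_se2(T):
    """Adjoint of T = [[R, t], [0, 1]] for twist ordering (vx, vy, w)."""
    R, t = T[:2, :2], T[:2, 2]
    Ad = np.eye(3)
    Ad[:2, :2] = R
    Ad[:2, 2] = np.array([t[1], -t[0]])
    return Ad

def compose(T1, Sigma1, T2, Sigma2):
    """Mean and first-order covariance of T1 @ T2 (independence assumed,
    so the cross-covariance terms are dropped)."""
    Ad1 = adj_se2(T1)
    return T1 @ T2, Sigma1 + Ad1 @ Sigma2 @ Ad1.T
```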
To summarize: Smith, Self, and Cheeseman
developed a method that lets you represent
pose uncertainty using a set of coordinate
vectors, but that doesn't necessarily enable
you to accurately characterize pose uncertainty.
They also showed how you can handle jointly
correlated poses, and they derived the pose
composition, pose inverse, and relative pose
operations. Tim Barfoot showed how you can
use the Lie algebra, which enables you to
better characterize that uncertainty, but
he assumes the poses are independent and only
looks at pose composition.
In this chapter, what we're going to do is
develop a framework for handling jointly
correlated poses in the Lie algebra, derive
the first-order uncertainty propagation methods
for pose composition, pose inverse, and relative
pose, and release a C++ library implementation.
To characterize the jointly correlated poses,
we're going to use the same model as Tim
Barfoot: we have a mean matrix that's in the
group, and then a multivariate Gaussian
perturbation parameter that's in the Lie
algebra, and we transform that through the
exponential function to perturb our mean.
The difference between what Tim Barfoot does
and what we do is that he assumes these are
independent, while we stack these perturbation
vectors and model them as jointly Gaussian.
Then, by using the BCH, or Baker–Campbell–Hausdorff,
formula, we can derive the pose composition,
pose inverse, and relative pose operations
based on this model, and we're able to take
the correlation into account.
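To first order, my reading of the correlated composition case is the independent formula plus cross-covariance terms (the thesis derives this via BCH and also covers pose inverse and relative pose, which I'm omitting; Sigma_12 denotes the cross-covariance between the two perturbations):

```latex
\xi_{c} \approx \xi_1 + \operatorname{Ad}(\bar{T}_1)\,\xi_2
\;\;\Rightarrow\;\;
\Sigma_{c} \approx \Sigma_{11}
+ \operatorname{Ad}(\bar{T}_1)\,\Sigma_{22}\,\operatorname{Ad}(\bar{T}_1)^{\!\top}
+ \Sigma_{12}\,\operatorname{Ad}(\bar{T}_1)^{\!\top}
+ \operatorname{Ad}(\bar{T}_1)\,\Sigma_{21} .
```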
In experiments we can show that ignoring that
correlation leads to over- or under-confidence:
if the correlation is positive and you ignore
it, then when you do pose composition you
under-approximate the uncertainty, and when
you do relative pose you over-approximate it.
Ours is shown in green, and blue is what you
get if you ignore correlation.
We also performed two different experiments
to evaluate this. In the first, we generated
a trajectory of transformations, sampling
the uncertainty of each consecutive pose to
be correlated with the one before it; we then
generated 10,000 samples of those trajectories
and compared our method against SSC and the
independent Lie algebra method to determine
how many samples fell within each method's
uncertainty bounds.
All the methods drop off as you increase the
number of steps in our trajectory, but our
method is much more accurate.
And the same thing when we increase the noise
level.
So if we increase rotation noise, our method
also outperforms.
In this case, when you have translation noise,
SSC does get to about the same accuracy as
us, but the main reason is that the nonlinearities
are driven by the rotation error. As you
increase translation error, our method doesn't
necessarily do any better than SSC, because
you're just adding X, Y, Z into the coordinates.
For a second experiment, we took an existing
SLAM data set, extracted two pairs of relative
poses from it, and used Monte Carlo sampling
to derive what the true distribution of the
relative pose operation should look like,
and then compared our method to SSC; here
zero error is white, dark orange is higher.
We outperform the Smith, Self, and Cheeseman
method by an order of magnitude.
To summarize this section: we've developed
a method that enables us to use the Lie algebra
to jointly characterize pose uncertainty,
and we derived the first-order uncertainty
propagation formulas for pose composition,
pose inverse, and relative pose.
And we've also released an open-source C++
implementation.
Any questions?
So you're still dropping off with the number of steps.
Is that because you're using a local linearization?
No, we're not using a local linearization
formula; you could term it a linearization,
but we're using what's called the BCH formula.
The BCH formula describes, when you multiply
elements of the group, how that maps to
addition of terms in the Lie algebra.
We're representing uncertainty in the Lie
algebra, and we're dropping the higher-order
terms there, so as you add more and more
nodes, the error increases.
So it's a higher-order-terms approximation.
It's actually kind of surprising, this slide
here, to think about composing 30 poses with
the SSC method [inaudible]; it's something
we've used for a long, long time.
Any other questions on this section?
Now I want to move on to the conclusion.
As I mentioned before, there are four main
contributions.
One looks at how we handle outlier measurements
when we're trying to merge maps from multiple
vehicles.
The second looks at how we align trajectories
with one another when we don't have an
initialization and don't have high-dimensional
data, so we can't uniquely identify positions.
The third shows how we can formulate the planar
SLAM problem so that we can guarantee we find
a globally optimal solution.
And the last shows how we can accurately
represent pose uncertainty after solving SLAM
so we can increase safety.
From a future-work perspective, both the
globally optimal SLAM formulation and the
first chapter, where we determine which
measurements we should trust, currently assume
full degree-of-freedom measurements.
Both could be extended to more general types
of measurements, such as range measurements,
which you often have in the underwater
environment, or monocular image registration.
We can also increase the reliability of these
systems, or their real-time capability;
specifically, for globally optimal SLAM there
are ways we can extend that.
In addition, we've talked about reliable
autonomy and about some of the problems in
this space that are hard to solve reliably
and consistently.
But being able to robustly perform those tasks
is only part of reliable autonomy.
When I think of reliable autonomy, I think
of four main elements: algorithmic robustness,
which is what I've been talking about so far,
there's also constraint management, system
cognizance, and system/operator cooperability.
I think each of these tie into developing
systems that are reliable.
In order to talk about those a little bit
more I'm going to use the same motivation
here, because one of the things that I'm interested
in is developing a system of five or six robotic
vehicles to automatically perform inspection
tasks.
When talking about algorithmic robustness
again, it covers navigation and planning:
how do you develop these algorithms in a way
where they reliably give a true solution.
As for constraint management, the system that
you've developed and are using, as well as
the domain you're operating in, often enforce
constraints on the operation of your system.
When you're talking about the underwater environment,
we often have heavily limited communications,
and so being able to have a team of robots
work with one another well is dependent on
being able to operate under these conditions.
In addition, the algorithms that we're developing
here also have to run in real-time.
So there's interesting things where you look
at can you develop the algorithms in a way
where they'll be able to run in real-time.
When you talk about system cognizance, you
can increase the reliability of a system if
it's able to be aware of itself and of the
task it's trying to perform, and if it's able
to replan based on that awareness.
So if you think about a task that you're giving
to a robotic team, where each individual member
has its own capabilities: if the system is
aware of the capabilities of the individuals
as well as the task it's trying to perform,
it should be able to dynamically plan which
vehicles do what in order to increase the
reliability of the system.
From a similar perspective, on an individual
system level, you want each system to be reliable
or be able to handle failure cases within
its own system.
Think about a robotic system that has multiple
sensors: one sensor being used for target
tracking, one for obstacle detection, and
one for path planning. If one of those sensors
were to fail, you would want the system to
be able to determine that the failure has
occurred and start using other data in order
to continue operating, especially if your
vehicle is down underwater.
An interesting problem to look at is how can
the vehicle be aware of itself as well as
the task that it's trying to perform and continue
operating to increase reliability.
And then also the communication between individual
vehicles with each other, as well as between
the system and operator is an important part.
When you're talking about reliable autonomy,
if you think about how much an operator is
going to trust a system to actually perform
a task, it's important that the communication
between the operator and the system can handle
potentially vague or high-level instructions.
And I've been using this underwater perspective
in order to motivate things, but all these
same topics apply to each of these other applications.
In conclusion, I just want to thank my collaborators
as well as the members of my committee for
all their time and help, and I'd be happy
to take any questions.
