Dear Fellow Scholars, this is Two Minute Papers
with Károly Zsolnai-Fehér.
Today, we’re going to talk about OpenAI’s
robot hand that dexterously manipulates and
solves a Rubik’s cube.
Here you can marvel at this majestic result.
Now, why did I use the term "dexterously
manipulates" a Rubik's cube?
In this project, there are two problems to
solve.
One, finding out what kind of rotation we
need to get closer to a solved cube, and two,
adjusting the finger positions to be able
to execute these prescribed rotations.
And this paper is about the latter, which
means the rotation sequences are given by
a previously existing algorithm, and OpenAI’s
method manipulates the hand to be able to
follow this algorithm.
To rephrase it, the robot hand doesn’t really
know how to solve the cube and is told what
to do, and the contribution lies in the robot
figuring out how to execute these rotations.
If you take only one thing from this video,
let it be this thought.
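The division of labor described above can be sketched in a few lines. Note that this is an illustrative sketch, not OpenAI's code: `classical_solver`, `learned_policy`, and the `hand` object are hypothetical placeholders standing in for the pre-existing solving algorithm, the learned manipulation policy, and the robot hardware.

```python
# Hedged sketch of the two-problem split: a classical solver decides
# WHICH face rotations to make, and the learned policy only figures
# out HOW to move the fingers to perform each one.
# All names here are illustrative placeholders.

def solve_cube(cube_state, classical_solver, learned_policy, hand):
    # 1) A previously existing algorithm produces the rotation sequence.
    rotations = classical_solver(cube_state)   # e.g. ["R", "U'", "F2", ...]

    # 2) The paper's contribution: executing each prescribed rotation
    #    with a dexterous robot hand.
    for rotation in rotations:
        finger_commands = learned_policy(rotation, hand.observe())
        hand.execute(finger_commands)
```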
Now, to perform all this, we have to first
solve a problem in a computer simulation where
we can learn and iterate quickly, and then,
transfer everything the agent learned there
to the real world, and hope that it obtained
general knowledge that indeed can be applied
there.
This task is one of my favorites.
However, no simulation is as detailed as the
real world, and as every experienced student
knows very well, things that are written in
the textbook might not always work exactly
the same in practice.
So the problem formulation naturally emerges
- our job is to prepare this AI in this simulation
so it becomes good enough to perform well
in the real world.
Well, good news. First, let’s think about
the fact that in a simulation, we can train
much faster, as we are not bound by the physical
limits of the robot hand - in a simulation,
we are bound only by our processing power, which
is much, much more vast and is growing every
day.
So, this means that the simulated environments
can be as grueling as we can make them.
What’s more, we can do something that
OpenAI refers to as Automatic Domain Randomization.
This is one of the key contributions of this
paper.
The domain randomization part means that it
creates a large number of random environments,
each of which are a little different, and
the AI is meant to learn how to account for
these differences and hopefully, as a result,
obtain general knowledge about our world.
The automatic part is responsible for detecting
how much randomization the neural network
can shoulder, and hence, the difficulty of
these random environments is increased over
time.
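The core loop just described can be sketched as follows. This is a minimal illustration of the idea, not OpenAI's actual implementation: the class name, the per-parameter widening step, and the success-rate threshold are all assumptions made for the sake of the example.

```python
import random

# Hedged sketch of Automatic Domain Randomization (ADR): each simulated
# environment samples its physical parameters from a range, and that
# range is widened automatically once the agent performs well enough
# at its current difficulty. All names and thresholds are illustrative.

class ADRParameter:
    def __init__(self, default, step=0.05, widen_threshold=0.8):
        self.default = default                  # nominal value, e.g. cube friction
        self.width = 0.0                        # current randomization half-width
        self.step = step                        # how much to widen at a time
        self.widen_threshold = widen_threshold  # success rate needed to widen

    def sample(self):
        # Domain randomization: each training environment gets a
        # slightly different value of this parameter.
        return random.uniform(self.default - self.width,
                              self.default + self.width)

    def update(self, boundary_success_rate):
        # The "automatic" part: if the agent handles the edge of the
        # current range well, increase the difficulty over time.
        if boundary_success_rate >= self.widen_threshold:
            self.width += self.step

# Example: training starts with no randomization, then widens
# as the neural network proves it can shoulder more.
friction = ADRParameter(default=1.0)
friction.update(boundary_success_rate=0.9)  # agent does well -> range widens
```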
So, how good are the results?
Well, spectacular.
In fact, hold on to your papers, because it
can not only dexterously manipulate and solve
the cube, but we can even hamstring the hand
in many different ways and it will still be
able to do well.
And I am telling you, scientists at OpenAI
got very creative in tormenting this little
hand.
They added a rubber glove, tied multiple fingers
together, threw a blanket on it, and pushed
it around with a plush giraffe and a pen.
It still worked.
This is a testament to the usefulness of the
mentioned automatic domain randomization technique.
What’s more, if you have a look at the paper,
you will even see how well it was able to
recover from a randomly breaking joint.
What a time to be alive!
As always, some limitations apply.
The hand is only able to solve the cube about
60% of the time for simpler cases, and the
success rate drops to 20% for the most difficult
ones.
If it gets stuck, it typically happens within
the first few rotations.
But before this work, we were able to do this
0% of the time, and given that the first steps
towards cracking a problem are almost always
the hardest, I have no doubt that two more
papers down the line, this will become significantly
more reliable.
But you know what, we are talking about OpenAI,
make it one paper.
This episode has been supported by Weights
& Biases.
Weights & Biases provides tools to track your
experiments in your deep learning projects.
It can save you a ton of time and money in
these projects and is being used by OpenAI,
Toyota Research, Stanford and Berkeley.
Here you see a write-up of theirs where they
explain how to visualize the gradients running
through your models, and illustrate it through
the example of predicting protein structure.
They also have a live example that you can
try!
Make sure to visit them through wandb.com/papers
or just click the link in the video description
and you can get a free demo today.
Our thanks to Weights & Biases for helping
us make better videos for you.
Thanks for watching and for your generous
support, and I'll see you next time!
