Dear Fellow Scholars, this is Two Minute Papers
with Károly Zsolnai-Fehér.
Finally, I have been waiting for quite a while
to cover this amazing paper, which is about
AlphaZero.
We have talked about AlphaZero before, this
is an AI that is able to play Chess, Go, and
Shogi, or in other words, Japanese chess, at
a remarkably high level.
I will immediately start out by uttering the
main point of this work: the point of AlphaZero
is not to solve Chess, or any of these games.
Its main point is to show that a general AI
can be created that can perform on a superhuman
level on not one, but several different tasks
at the same time.
Let’s have a look at this image, where you
see a small part of the evaluation of AlphaZero
versus Stockfish, an amazing open source chess
engine which has been consistently at or near
the top among computer chess players for many years
now.
Stockfish has an Elo rating of over 3200, which
means that it has a win rate of over 90% against
the best human players in the world.
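To make the rating claim concrete, here is a minimal sketch of the standard Elo expected-score formula. The 2800 rating used for a top human player is an illustrative assumption, not a number from the video. Note also that the Elo expected score counts a draw as half a point, so it is not strictly a win rate.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the Elo model.

    A win counts as 1, a draw as 0.5, and a loss as 0.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


engine_rating = 3200  # the "over 3200" figure mentioned in the transcript
human_rating = 2800   # hypothetical top-human rating, chosen for illustration

score = elo_expected_score(engine_rating, human_rating)
print(f"Expected score for the engine: {score:.3f}")
```

Under this assumption, a 400-point gap gives the engine an expected score of roughly 0.91, which is in line with the "over 90%" figure quoted above.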
Now, interestingly, comparing these algorithms
is nowhere near as easy as it sounds.
This sounds curious, so why is that?
For instance, it is not enough to pit the
two algorithms against each other and see
who ends up winning.
It matters what version of Stockfish is used,
how many positions the machines are allowed
to evaluate, how much thinking time they are
allowed, the size of hash tables, the hardware
being used, the number of threads being used,
and so on.
From the side of the chess community, these
are the details that matter.
However, from the side of an AI researcher,
what matters most is to create a general algorithm
that can play several different games on a
superhuman level.
With this constraint, it would really be a
miracle if AlphaZero were able to even put
up a good fight against Stockfish.
So, what happened?
AlphaZero played a lot of games that ended
up as draws against Stockfish and not only
that, but whenever there was a winner, it
was almost always AlphaZero.
Insanity.
And what is quite remarkable is that AlphaZero
trained for only 4 to 7 hours, purely through
self-play.
Comparatively, the development of the current
version of Stockfish took more than 10 years.
You can see how reliably this AI can be trained:
the blue lines show the results of several
training runs, and they all converge to the
same result with only a tiny bit of deviation.
AlphaZero is also not a brute-force algorithm
as it evaluates fewer positions per second
than Stockfish.
Kasparov put it really well in his article
where he said that AlphaZero works smarter,
not harder than previous techniques.
Even Magnus Carlsen, chess grandmaster extraordinaire,
said in an interview that during his games,
he often thinks: “what would AlphaZero do
in this case?”, which I found to be quite
remarkable.
Kasparov also had many good things to say
about the new AlphaZero in a, let’s say,
very Kasparov-esque manner.
And also note that the key point is not whether
the current version of Stockfish or the one
from two months ago was used.
The key point is that Stockfish is a brilliant
Chess engine, but it is not able to play Go
or any game other than Chess.
This is the main contribution that DeepMind
was looking for with this work.
This AI can master three games at once, and
a few more papers down the line, it may be
able to master any perfect information game.
Oh my goodness.
What a time to be alive!
We have really only scratched the surface
in this video.
This was only a taste of the paper.
The evaluation section in the paper is out
of this world, so make sure to have a look
in the video description and I am convinced
that nearly any question one can possibly
think of is addressed there.
I also linked Kasparov’s editorial on this
topic.
It is short and very readable, so give it a
go!
I hope this little taste of AlphaZero inspires
you to go out and explore for yourself.
This is the main message of this series.
Let me know in the comments what you think
or if you found some other cool things related
to AlphaZero!
Thanks for watching and for your generous
support, and I'll see you next time.
