Dear Fellow Scholars, this is Two Minute Papers
with Károly Zsolnai-Fehér.
It is time for some minds to be blown.
DOTA 2 is a multiplayer online battle arena
game with a huge cult following and world
championship events with a prize pool of over
20 million dollars.
In this game, players form two teams and control
a hero each and use their strategy and special
abilities to defeat the other team.
OpenAI recently created an AI for this game
that is so good that they challenged the best
players in the world.
Now note that this program is not playing
the full feature set of the game, but a version
that is limited to one versus one encounters
with several other elements of the game disabled.
Since lots of strategy is involved, we always
discuss in these episodes that long-term planning
is the Achilles' heel of these learning algorithms.
A small blunder in the early game can often
snowball out of control by the end of the
match, and it is hard for the AI, and sometimes
even for humans, to identify these cases.
And this game is a huge challenge because
unlike chess and Go, it has lots of incomplete
information, and even the simplified one versus
one mode involves a reasonable amount of long-term
planning.
It also involves attacks, trickery, and deceiving
an opponent, and can be imagined as a strategy
game that also requires significant technical
prowess to pull off the most spectacular moves.
This game is also designed in a way that new
and unfamiliar situations come up all the
time which require lots of experience and
split-second decision-making to master.
This is a true test for any kind of AI.
And note that this AI wasn't told anything
about the game, not even the rules, and was
just instructed to try to find a way to win.
The algorithm was trained in 24 hours, and
during this time, it not only learned the
rules and objectives of the game, but it also
learned to pull off remarkable tactics.
For instance, other players were very surprised
that the bot didn't take the bait. Baiting typically
means giving up a smaller battle in order to
win a bigger objective.
The AI has a ton of experience playing the
game and typically sees through these shenanigans.
In this game, there are also neutral units
that we call creep.
When killed, they grant precious gold and
experience to our opponent, so we typically
try to deny that.
If these units encounter an obstacle, they
go around it, so players developed a technique
by the name of creep blocking, which is the art
of holding them up with the hero character to
minimize the distance they travel in
a unit of time.
And the AI has not only learned about the
existence of this technique by itself, but
it also executes it with stunning precision,
which is quite remarkable.
And again, during the training phase, it had
never seen any human play the game and do
something like this.
The other remarkable thing is that when a
player disappears in the darkness, the AI
predicts what he could be doing, plans around
it, and strikes where the player is expected
to show up.
If you remember, DeepMind's initial Go algorithm
contained a bootstrapping step where it was
fed a large amount of games by players to
grasp the basics.
The truly remarkable thing is that none of
that happened here.
This algorithm was trained for only 24 hours
and it only played against itself.
When it finally played against Dendi, the
reigning world champion, the first match was
such a treat, and I was shocked to see that
the AI outplayed him.
In the second game, the player tried to create
a situation that he thought the AI hadn't
encountered before by giving up some creep
to it.
The program ruthlessly took advantage of this
mistake and defeated him almost immediately.
OpenAI's bot not only won, but apparently
also broke the will of Dendi, who tapped out
after two matches.
I feel like someone who has been hit by a sledgehammer.
I didn't even know this was being worked on!
This is such a remarkable achievement.
Usually, the first argument I hear is that
of course, the AI can play non-stop without
bathroom breaks or sleep.
While, admittedly, this is also true for some
players, the algorithm was only trained for
24 hours.
Note that this still means a stupendous amount
of games played, but in terms of training
time, given that these algorithms typically
take from weeks to months to train properly,
24 hours is nothing.
The second argument that I often hear is that
the AI should of course win every time, because
it has close to zero reaction time and can perform
thousands of actions every second.
For instance, if we played a game where
the goal is to perform the most
actions per minute, clearly, humans with biological
limitations would stand no chance against
a computer program.
However, in this case, the number of actions
that this algorithm performs in a minute is
comparable to that of a human player.
This means that these results stem from superior
technical abilities and planning, and not
from the fact that we're talking about a computer.
We can look at this result from two different
directions.
One, we could say, well, no big deal, because
this is only a highly limited and hamstrung
version of the game, which is way less complex
than a fully-fledged 5 versus 5 team match.
Or, two, we could say that the algorithm has
shown a remarkable aptitude for learning highly
sophisticated technical maneuvers and longer-term
strategy in a difficult game.
And the rest is only a matter of time.
In fact, in 5 versus 5, there is even more
room for a highly intelligent program to shine
and create new tactics that we've never thought
of.
I would bet that if anything, we're going
to be even more surprised by the 5 versus
5 results later.
We are still a bit short on details, but
I have contacted the OpenAI team, who noted
that there will be more information available
in the next few days.
Whenever something new appears, I'll be here
to cover it for you, Fellow Scholars.
If you are new to the series and enjoyed this
episode, make sure to subscribe and click
the bell icon for two super fun science videos
a week.
And if you find yourself interested in DOTA
2, and admittedly, it's hard not to, and would
like to catch up a bit on the basics, make
sure to visit Day9's channel, which has a really
nice playlist about the fundamentals of the
game.
There is a link in the description for it,
check it out.
If you go to his channel, make sure to leave
him a kind scholarly comment.
Let the world see how courteous the Two Minute
Papers listeners are.
Thanks for watching and for your generous
support, and I'll see you next time!
