Vsauce!
Kevin here, and I’ve built a computer capable
of explaining how you get smarter.
Out of these matchboxes, some colorful beads
and...
Shrek.
Real quick, before we get into the game we’re
about to play, I wanna tell you about the
game that I’ve been playing.
I partnered with Raid: Shadow Legends on this
video and if you follow me on Instagram you
know that I’m a huge fan of RPGs.
Well, Raid is the most immersive champion-collecting
experience you’ll get on a smartphone.
It has a deep story, detailed graphics, giant
boss fights, and hundreds of champions to
collect and customize.
And you can play it free.
So to check it out, download Raid using only
my link in the description to get 50,000 silver
immediately and a free epic champion courtesy
of the dev team.
Thanks again to them for supporting Vsauce2.
Go check out their game. It's amazing how
far mobile gaming has come. Now let’s
get back to the inner workings of our game.
Okay.
24 matchboxes, all filled with beads, and
covered in potential moves for the game we’re
about to play and… this is our computer.
Now, Shrek comes along and...wait.
How is THIS a computer?
Aren’t computers, just like, electronic
machines that run software?
What IS a computer?
Well, the earliest computer was YOU.
Or… your ancestors.
They used calculating machines like the abacus
to input information, which output a result,
but we were the ones computing.
The human operators of early calculating machines
were literally called “computers.”
Okay, back to our matchboxes.
Once we introduce a game board, Shrek and
the gang, this matchbox and bead setup processes
our input, gives us an output and not only
that… it also learns.
This is not just a computer, this is an artificial
intelligence machine capable of matching wits
with the brightest minds humanity has to offer.
At a game called Hexapawn.
Here’s how.
Hexapawn is based on chess -- each player
has 3 pawns on a board with just 9 squares.
The pieces move like chess pawns, too.
They can go forward one space if that space
is unoccupied. If it is occupied by the opponent,
then they can’t go forward.
Sorry, Donkey.
You can, however, move diagonally, but only
to take an opponent’s piece.
There are three ways to win:
Get a pawn to the other side of the board.
Take all of your opponent’s pieces.
Or leave your opponent without a possible
move, like a checkmate in chess.
Our setup works like this: I’ve got 24 matchboxes
here, and each one corresponds to the position
of pieces on the board during that round.
I’ve got my Team Kevin pawns vs. the computer’s
Team Shrek.
And do you know what that means?
That means that we’ve officially turned
Hexapawn into: Shreksapawn.
Alright.
Let’s play.
The human, that's me, always goes first.
Wait.
Why?
Because recreational mathematician Martin
Gardner said so.
He actually created Hexapawn and its rules
as a simplified version of a 304-matchbox
computer called MENACE.
15 years after helping the British break the
Nazis' codes in World War II, Donald Michie
invented MENACE to learn how to master Tic-Tac-Toe.
And now 59 years later, I’m on YouTube playing
Shreksapawn.
Since I go first, my moves occur in only the
odd-numbered rounds.
1, 3, 5 and 7.
Therefore, the matchboxes are grouped by possible
Team Shrek moves in rounds 2, 4, and 6.
One of us is guaranteed to win before Round
8.
So, Team Shrek has no Round 8 moves.
Each box contains one colored bead for each
potential move on that board position.
So like this first box has a green, a blue,
and a purple.
And I've cut a hole at the bottom of the box
that will only allow one bead at a time to
fall out.
So I’ll just shake the box and let one bead
out.
And it's purple.
That means if my pawn was here and it was
Team Shrek's move, Team Shrek would make the
purple arrow move.
Like this.
If a blue bead had fallen out, then Team Shrek
would've made the blue arrow move.
And if it was a green bead then Team Shrek
would've made the green arrow move.
And taken my pawn.
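Shaking a box until one bead falls out is a random draw weighted by how many beads of each color are inside. Here is a minimal sketch, assuming each matchbox is stored as a dictionary of bead counts; the names `first_box` and `draw_bead` are made up for this example.

```python
import random

# Hypothetical contents of the first matchbox: bead color -> number of beads.
first_box = {"green": 1, "blue": 1, "purple": 1}

def draw_bead(box, rng=random):
    """Shake the box: pick one color at random, weighted by bead count."""
    colors = [color for color, count in box.items() if count > 0]
    weights = [box[color] for color in colors]
    return rng.choices(colors, weights=weights, k=1)[0]

# Team Shrek then plays the arrow whose color matches the drawn bead.
color = draw_bead(first_box)
```

With one bead of each color, every move is equally likely. Once bead counts change, the same function makes some moves more likely than others without any other change to the code.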
Okay so that's how Team Shrek will move.
Team Kevin will move however I want Team Kevin
to move because I'm Kevin and then we’ll
play back and forth until there’s a winner.
Alright, Round 1: Fight! I decide to move
Lord Farquaad forward.
For Round 2 I now use
this box to determine Team Shrek's move.
So we'll give it a shake.
Woah!
Let's try that again.
And it's the green move.
So Donkey moves forward.
Now it's my turn and I decide that, look,
I can just take Princess Fiona when I move
diagonally and win the game.
That's it.
Now here’s the important part.
When Team Shrek makes a losing move, I remove
that bead from the box.
That way the computer can’t make the same
bad move the next time that this situation
comes up.
By removing its losing beads, the computer
learns to play better.
When Team Shrek does win, then instead of
removing the bead I'll just put the bead back
in the box.
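The bookkeeping after each game can be sketched like this. It assumes the boxes are stored as bead counts keyed by board position and that we keep a `history` list of the (position, color) pairs Team Shrek drew during the game; the position labels and the name `settle_game` are hypothetical.

```python
# Hypothetical state: one dictionary of bead counts per board position.
boxes = {
    "round2_position": {"green": 1, "blue": 1, "purple": 1},
    "round4_position": {"red": 1, "yellow": 1},
}

def settle_game(boxes, history, shrek_won):
    """Update the boxes after one game.

    history is the list of (position, color) beads Team Shrek drew, in order.
    On a win, every bead just goes back in its box, so nothing changes.
    On a loss, the bead behind Team Shrek's last move is removed.
    """
    if shrek_won or not history:
        return
    position, color = history[-1]
    boxes[position][color] = max(0, boxes[position][color] - 1)

# Example: Team Shrek drew green in Round 2, red in Round 4, and then lost.
history = [("round2_position", "green"), ("round4_position", "red")]
settle_game(boxes, history, shrek_won=False)
```

After this call the red bead is gone from the Round 4 box, so that losing move can never be drawn from that position again, while the green bead from Round 2 stays where it was.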
Okay, I’m gonna play a bunch of rounds now
and I’ll keep track of wins over here, with
a K when I win and I'll write an S for a Shrek
win.
Here we go.
Okay, I’ve played 14 games.
I started off winning a lot more than I was
losing… and then things changed.
Out of the last 7 games, Team Shrek has won
6 of them.
The computer is clearly getting better at
the game… but is it really learning?
I mean, I’m just taking beads out of matchboxes.
How is that learning?
What is learning?
At the most basic level, learning is acquiring
new knowledge or a new skill, or modifying
an existing behavior.
Every time I take a bead out of a matchbox,
the computer loses a behavior that leads it
to an outcome of failure.
That increases the probability that the computer’s
move each round leads it to success -- which
in our case, is winning Shreksapawn.
After a sufficient number of games, the computer
will evolve to play perfectly.
My Team Shrek computer may not be thinking
on its own, but it is learning.
And it can also learn in a different way.
Removing beads is basically a form of learning
by punishment.
When Team Shrek makes a bad move, I’m punishing
the computer for being wrong.
I don’t have to worry about the computer
feeling bad about losing, these matchboxes
aren’t gonna get frustrated and quit playing
and run away crying and slam the door in my
face and tell me I’m not their real dad.
But what happens if instead of punishing my
computer, I reward it?
Instead of just putting the good play bead
back in the box when the computer wins, I
could add another bead of the same color that
made the winning move.
That would reduce the probability of a losing
bead appearing by increasing the probability
of a matchbox generating a winning bead.
The computer would still eventually reach
perfect play because I'll still remove the
losing beads, but it will take longer because
it's winning more often.
If it could feel, it would probably feel better
about winning more often along its longer
journey toward perfection.
So the fastest way to perfect play is by punishing
the computer’s mistakes.
But the way to win as many games as possible
along the way is to reward its victories.
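To see how reward and punishment each change the odds, here is a small worked example with exact fractions. The box contents and the assumption that purple is the losing bead are invented for illustration, as are the function names.

```python
from fractions import Fraction

def bead_probability(box, color):
    """Chance of drawing the given color from a box of bead counts."""
    total = sum(box.values())
    return Fraction(box[color], total)

def reward(box, color):
    """Reward a winning move: add one more bead of that color."""
    box[color] += 1

def punish(box, color):
    """Punish a losing move: remove one bead of that color."""
    box[color] = max(0, box[color] - 1)

# Hypothetical box where purple happens to be the losing move.
box = {"green": 1, "blue": 1, "purple": 1}
before = bead_probability(box, "purple")        # 1 out of 3 beads

reward(box, "green")                            # Team Shrek won with green
after_reward = bead_probability(box, "purple")  # 1 out of 4 beads

punish(box, "purple")                           # Team Shrek lost with purple
after_punish = bead_probability(box, "purple")  # 0 out of 3 beads
```

Rewarding only shrinks the chance of the losing bead, here from 1/3 to 1/4, while punishing removes it outright. That is the tradeoff described above: rewards make bad draws rarer, but only removal takes a bad move out of the box for good.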
To improve at Hexapawn, our matchbox computer
actually uses a type of genetic algorithm.
It’s a way to solve problems and learn based
on natural selection.
Based on the process that drives biological
evolution.
The beads of learning in your life may be
refined by punishment.
Put your hand on a hot stove once, and learn
that, “Ow!
That’s painful.”
So you remove the touch-hot-stove-bead from
your brain.
They may also be augmented by rewards.
“My parents bought me ice cream for getting
an A on my exam.”
Add another get-good-grades-bead to your matchbox
head computer.
Hexapawn is an obscure, academic game from
over 50 years ago, and you can make a matchbox
computer that learns to win every time.
But by allowing this matchbox computer full
of colored beads to learn, the player who’s
learning a bit more about learning is… you.
And as always, thanks for watching.
If you wanna make your own matchbox, oh I
lost a bead, matchbox computer, download my
template for free over at Twitter.com/VsauceTwo.
That's at Vsauce T, W, O.
If you wanna watch more Vsauce2 videos, just
uh click over here, and if you aren't subscribed
to Vsauce2 then maybe you should uh, put a,
"subscribe to Vsauce2" bead in your brain.
Wow.
That was weirdly creepy.
