Hi folks, Ivan here, and this is a video about a project I developed using Artificial Intelligence.
It was my first project using A.I., and it's a good one for showing people how A.I. works, because of its simplicity.
It is simple to explain and simple to understand,
and it lets you see its potential in the fields of science.
In this video, I will show the "thing" working, but will also show "how" the thing works
Just showing the thing working is cool, but I like to add more value by explaining things
I will explain in my own words, so don't expect purely technical information from this video.
If you want something technical, go to Google or Wikipedia.
If I say something that you disagree with, please let me know in the comments!
For those who don't know the game: that's because you don't use Google Chrome,
or you have a really good internet connection (something not true in Brazil).
For those who don't know it, the dinosaur I'm talking about is a game from Google Chrome, an Easter egg.
When you are disconnected and try to load a page, it shows you a dino. You can press space to start playing.
Cactuses come from the right, and you have to jump over them. You also have to duck under pterodactyls.
The idea of the game is to score points and kill time until your internet comes back to life.
So, my idea was really simple: make a program, without any previous information about what is "good" or "bad",
that could "learn" to play the game and score as many points as possible.
My objective with this project was to apply what I have learned from the Internet, from Google, and in my "life".
I started learning about this about a year ago, and I've been interested in doing something with it ever since.
In a general overview, we will see two main topics:
Neural Networks, and Genetic Algorithms.
I will only introduce the theory behind those topics, but feel free to search and dig deeper.
So: how does this "Dinosaur" learn to play like a ninja?
Imagine the game: The dino, the floor, and the cactus coming
How did I abstract this, in order to apply it to an artificial intelligence (neural network)?
I thought about Sensors.
Imagine that when you are playing the game, you have your "vision", which interprets what is going on,
and you "get" information, the Distance, for instance:
the distance to the first cactus.
This is really important information for playing, because it mostly determines "when" you jump.
Beyond that sensor, we also use the "Size" of the obstacle.
If it is a large or small obstacle, you might need to jump earlier or later so you are close to it at the end of the jump.
We also observe the size with our "vision", right?
Another thing that we observe is the Speed. We need that because if we are going really fast,
we might need to jump a little earlier.
So, thinking that way, I simplified it down to 3 sensors.
(Distance, Size and Speed)
And it also has "Outputs", which we can call "actuators" as well, like a motor, but in this case, keys.
We can use both UP and DOWN keys in the keyboard.
Those are the only "actions" that our Dinosaur can do
Now, let's try to abstract it a bit more...
In a more general way to think.
I guess that you are more comfortable working with CODE (if, else, while...); "logical" things; Discrete.
We could place a program in the middle of it.
(This code is only for illustration.)
And what a program would "do" in this case, is to Read the inputs, interpret those inputs,
and find out what is the best possible output.
In this case, the output for the keys of the keyboard.
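As a sketch, a "discrete" program like the one described could look like this in JavaScript (the project runs on Node.js). The function name and thresholds here are my own illustration, not the actual project code:

```javascript
// A hypothetical hand-written controller: fixed rules mapping the
// three sensors (distance, size, speed) straight to a key action.
// The numbers are made up for illustration only.
function discreteController(distance, size, speed) {
  // Jump earlier when the game is faster or the obstacle is bigger.
  const jumpDistance = 100 + speed * 10 + size * 5;
  if (distance < jumpDistance) return "UP"; // press jump
  return "NONE";                            // release all keys
}
```

This works, but every rule and threshold has to be written by hand, which is exactly what the rest of the video tries to avoid.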
But now imagine: a program/code is really tied to "how" you write it, its syntax (and we have many languages).
Imagine a machine interpreting those lines of code, and "writing" those lines of code. It's MUCH more difficult...
An alien, for instance, might not understand it, because it wouldn't know how to read English,
and English is not "universal".
The way I see it, you only understand a program because you can read it.
What if we tried to think in another way, a way more "general", or Analog (yeah, from "analog" circuits)?
Imagine that we do not have "if" and "else" anymore, but math Functions, mapping Inputs to Outputs.
And now we can learn about Neural Networks, which are a really nice way to connect/map inputs to outputs,
as if we were creating a math function that generalizes what is happening in the middle of it.
And it is analog because the inputs can take any value, a number or a sequence of values, and the outputs will be values too.
In order to study it a little better, let's separate, and study only one input (red) and one output (orange).
The way we usually do is to put a program in the middle of it.
this "thing" in the middle, if we zoom in, could be for example a simple program.
"If the input is greater than X, then let the output be 0.0... otherwise Y..." and so on.
And as I said, this is a Discrete way to think about it, and it is not good for a machine to learn and interpret.
The best is to have no connection with syntax, as we said earlier.
The best is to be analog/continuous.
If we think about the circuits inside a machine, we come back to 0/1.
Anything running in a machine is 0 or 1, and we might tend to think that all those inputs/outputs go from 0 to 1, for instance: either they are 0, or they are 1 (Binary).
But that is still Discrete, since you can list all the possible states (0 and 1).
Now let's try to "remove" that from our mind. 0 OR 1.
Think that it can be ANYTHING between 0 and 1, for instance. Anything in between is valid: 0.1, 0.55575789...
Anything can be valid for an input or output.
And it doesn't have to be 0-1; it can go from -infinity to infinity, but let's simplify and limit it to 0-1.
We can even make it non-linear, like a sigmoid, tangent... We can put any function in it, like this curve function.
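The sigmoid mentioned here is one of the standard choices for that curve; it squashes any real number into the open interval (0, 1), giving exactly the smooth, "analog" behavior described above:

```javascript
// Sigmoid: maps any real number x to a value strictly between 0 and 1.
// Large positive x approaches 1, large negative x approaches 0, and
// sigmoid(0) is exactly 0.5.
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}
```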
And if we abstract it even further, we will see that in the end, after all the connections are made,
those inputs and outputs, and everything in the middle, are basically a function. A math function that we can write down on paper, like g(A + B*x).
That becomes another output, and you apply g(A + B*x) again in the next node, and so on, recursively.
Now we can think that each "small program" is actually a way to map the inputs to a single output; it's a function like "f(x) = something".
What you put in A and B changes the way the function outputs values, given its inputs.
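A single "neuron" in this picture is just that formula, g(A + B*x). Here is a minimal sketch, using a sigmoid as g (one possible choice, as mentioned above):

```javascript
// One neuron as a math function: output = g(A + B * x),
// where B is the connection weight, A is the bias, and g squashes
// the result into (0, 1).
const g = (x) => 1 / (1 + Math.exp(-x)); // sigmoid, one possible choice

function neuron(x, A, B) {
  return g(A + B * x);
}
```

Changing A and B changes the behavior of this little function, which is exactly what "learning" will adjust later.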
This is a simple way, right? We can map any "line" to any other "line". But what if we wanted something more "complex", maybe a polynomial?
We could have, in this case, more and more "Neurons".
It is like having more ability to map inputs to an output.
We can abstract it even more, and say that each connection, has a "weight" (RED), and a "Bias" (YELLOW)
Each neuron in this example, has only one input, but imagine how it would be in a more complex "Network".
It looks like this, as if we had "layers". "Inputs layer", "some Layer", "some other Layer", and the "output layer".
Those circles in red, yellow and orange are "values" (like our A and B). We can state what is in each one of them for a given network.
And if you know each of those "values", you can repeat that behavior in another Neural Network with the same shape. Like copying and pasting...
If we know all those weights and values, we can reproduce our "code" with identical behavior.
If I change anything in any of those "values", I will have a different "behavior". I will be mapping inputs to outputs in a different way...
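The layered picture above can be sketched as a small feed-forward pass. This is my own illustration of the idea, not the project's actual code; it assumes fully connected layers and a sigmoid at every neuron:

```javascript
// Minimal feed-forward pass over layers.
// weights[l][j][i] is the weight from input i to neuron j of layer l;
// biases[l][j] is that neuron's bias.
const sigmoid = (x) => 1 / (1 + Math.exp(-x));

function feedForward(inputs, weights, biases) {
  let values = inputs;
  for (let l = 0; l < weights.length; l++) {
    values = weights[l].map((row, j) =>
      sigmoid(biases[l][j] + row.reduce((sum, w, i) => sum + w * values[i], 0))
    );
  }
  return values; // the output layer's values
}
```

Two networks with the same shape and the same weights/biases produce the same outputs for the same inputs, which is the "copy and paste" idea above.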
Now let's forget about the connections, and think only about those weights...
We can map the output to anything, like motor power, the intensity of a light, or in our case, which key it will press (UP/NONE/DOWN).
I defined that output values greater than 0.55 mean "press UP", values below 0.45 mean "press DOWN", and anything in between means "release all keys".
That's how I defined it, but of course I could have set one threshold to 0.50 and the other to 0.70. Or even used two independent outputs (one for each key). But I chose to go like this.
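That threshold mapping is tiny in code. Using the 0.55 and 0.45 values stated above (the function name is my own):

```javascript
// Map the network's single output (a value between 0 and 1) to a key action.
function decideAction(activation) {
  if (activation > 0.55) return "UP";   // jump
  if (activation < 0.45) return "DOWN"; // duck
  return "NONE";                        // release all keys
}
```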
Now, let's imagine that the connections don't exist.
We will have many "weights" and "biases": "values" that configure our Network.
And now, think that we are taking all those values and putting them in sequence.
We took that "crazy" neural network and transformed it into a linear thing, like a DNA.
DNA is a sequence of information, made of many small parts that carry information as well.
For instance, we might have one big piece of information, like a text file, where each line is a sequence of characters that carries information as well.
That information is really important for defining how the living being will behave, multiply its cells, metabolize. It's the DNA that holds the "instructions".
The way we put that into practice with a Neural Network is with a Genetic Sequence (many values in sequence).
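Flattening a network into a genome, and rebuilding it, could be sketched like this (my own illustration; the layer-size notation is an assumption):

```javascript
// Flatten all weights and biases into one linear "genome".
function toGenome(weights, biases) {
  return [...weights.flat(2), ...biases.flat()];
}

// Rebuild weights and biases from a genome, given the network shape,
// e.g. layerSizes = [3, 4, 1]: 3 inputs, 4 hidden neurons, 1 output.
function fromGenome(genome, layerSizes) {
  const weights = [], biases = [];
  let k = 0;
  for (let l = 1; l < layerSizes.length; l++) {
    weights.push(Array.from({ length: layerSizes[l] }, () =>
      Array.from({ length: layerSizes[l - 1] }, () => genome[k++])));
  }
  for (let l = 1; l < layerSizes.length; l++) {
    biases.push(Array.from({ length: layerSizes[l] }, () => genome[k++]));
  }
  return { weights, biases };
}
```

Because the shape is fixed, the flat list of numbers carries everything needed to reproduce the network's behavior, just like the DNA analogy above.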
Now let's imagine how we could make it "evolve". How do we make a program "learn" to solve a problem?
In this case, our problem is jumping cactuses efficiently.
Imagine that we have a single "DNA", and we start all its values randomly. Then we create "N" DNAs randomly... They are all different.
I initialize all those circles/values with random values because we don't know which "right" values would make the Neural Network jump before the cactus and score lots of points...
I don't know, and I don't want to know either, because I want it to "learn" on its own. That's why we start with random values...
And what do we do with all those "Genomes"?
For each one of them, I test it with the Game, by constantly feeding the input from the sensors to the Neural Network, and mapping the output to the keys.
Basically, we put our Genome into practice, and we see how "well" it performed.
Maybe how many points it scored, how many cactuses it jumped, how long it stayed alive...
I can use anything.
In my case, I used "jumped cactuses", because a good genome is the one that JUMPS more cactuses.
Sometimes the game takes a bit longer to show a cactus, so staying alive longer doesn't necessarily mean one genome is better than another that hit a cactus earlier.
In the end, we test all genomes, and we see which ones are the "BEST" genomes of that "Generation".
Then we throw out the worst ones, doing "Natural Selection", or rather, "Artificial" Selection (lol).
We are selecting the best ones, and removing the ones that are not "that" good compared to the rest of that Generation.
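The selection step described here is just a sort by fitness plus a cut. A minimal sketch (names are my own):

```javascript
// "Artificial selection": keep only the top `keep` genomes of a
// generation, ranked by fitness (here, cactuses jumped).
// population: [{ genome: [...], fitness: number }, ...]
function selectBest(population, keep) {
  return [...population]
    .sort((a, b) => b.fitness - a.fitness) // highest fitness first
    .slice(0, keep);
}
```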
What do we do next?
If we just keep those two selected genomes, it's the same as saying "They are THE best. They are winners."
Not necessarily... We have to keep propagating them. We have to improve on what we already have.
The way we do that is by taking the best ones and creating others that CAN be better, but can also be worse.
In this part, we do Cross-Over: taking random parts from each of two genomes and combining them, creating a NEW genome.
A genome that looks like its "father" and its "mother", but behaves differently.
After that, in order to allow even more changes, and not be limited to the values generated in the first generation,
we apply randomness: we randomly choose some values and add some random amount to them. We do that to some of those "values"...
That's what we call "Mutation".
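Cross-over and mutation as described could be sketched like this (my own illustration; the mutation rate and amount are made-up parameters):

```javascript
// Cross-over: for each position, take the gene from one parent or the
// other at random, producing a new genome.
function crossover(dad, mom) {
  return dad.map((gene, i) => (Math.random() < 0.5 ? gene : mom[i]));
}

// Mutation: with probability `rate`, nudge a gene by a random amount
// in [-amount, +amount], so values not present in the first generation
// can appear over time.
function mutate(genome, rate = 0.1, amount = 0.5) {
  return genome.map((gene) =>
    Math.random() < rate ? gene + (Math.random() * 2 - 1) * amount : gene
  );
}
```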
We did Natural Selection, Cross-Over, and now Mutation. Now we have a different genome, one that is not equal to its "parents", but it can be BETTER or WORSE.
We then put it into the generation, and repeat that process, creating new genomes, until we complete the generation:
create 8, 12, 100... as many as we want in our Generation.
Now we have ANOTHER generation, and we test each one of them again. We redo everything: selecting the best ones, crossing over, and mutating...
This is the basic process of a Genetic Algorithm.
You create a way of measuring how good a genome is, and a way to combine genomes and create new ones.
Then you repeat it over and over.
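Putting selection, cross-over and mutation together, one pass of the loop could look like this. The `evaluate` function stands in for actually playing the game and is an assumption here, as are the rates:

```javascript
// One generation step of the Genetic Algorithm, in outline:
// score every genome, keep the best, refill with mutated children.
// `evaluate(genome)` -> fitness number; here it is a placeholder for
// playing the game with that genome.
function nextGeneration(population, evaluate, keep = 2) {
  const scored = population.map((genome) => ({ genome, fitness: evaluate(genome) }));
  const best = scored.sort((a, b) => b.fitness - a.fitness).slice(0, keep);
  const children = [];
  while (best.length + children.length < population.length) {
    const [dad, mom] = [best[0].genome, best[1].genome];
    const child = dad
      .map((g, i) => (Math.random() < 0.5 ? g : mom[i]))          // cross-over
      .map((g) => (Math.random() < 0.1 ? g + Math.random() - 0.5 : g)); // mutation
    children.push(child);
  }
  return [...best.map((b) => b.genome), ...children];
}
```

Running `nextGeneration` repeatedly, with the game itself as the fitness function, is the whole training process.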
The number of "over and over", depends on the complexity of your problem.
I needed about 120 Generations to create a "ninja".
(I will show it at the end of the video.)
Well, that's how a Genetic Algorithm works using Neural Networks.
And that's what I did to make a dinosaur jump over cactuses.
Now, let's go see it in practice...
This is my development and testing environment.
I place Chrome on the Left, and Terminal on the Right.
Let's run our script in Node.js.
Right at the beginning, it finds the game's location by scanning the pixels on the screen, then moves the mouse there.
In this part of the User Interface, we have a few "bars". While I play the game, pay attention to them.
The first one is the "Distance" sensor.
Remember the sensors? This is the first one.
The second is the "Size" of the obstacle. Large obstacles have larger values...
The third one, is the Speed of the cactus.  As the game gets faster and faster, it increases over time.
"Activation" is the output of our Neural Network; the default is 0.5 (half of 1.0). Values on the bar go from 0 to 100; at the bottom, they go from 0 to 1 (the real value).
To sum up: the first 3 are INPUTS, and the 4th is the OUTPUT of the Neural Network.
Here at the bottom, we have the activation as well.
Action is "what" it will do: basically, the value mapped to a key (JUMP/NORM/DOWN).
On the left side, we have the current learning status. "Stop" means it is not learning yet.
Fitness starts at 0 in every new game, and each time it jumps over a cactus, it increases by one.
(It might add one even when hitting something, depending on the size of the obstacle.)
1, 2.. 3, 4, 5, 6.. and so on.
Game Status is the current state read by the program. OVER means "Game Over". It reads the "Game Over" text from the game to know its state.
Generation and those two other values, are the current Generation, current genome being tested, and total genomes being tested.
Like: "1/12" means "Executing genome 1 out of 12".
A really serious problem I had during a live-coding session:
when I capture my screen, it consumes a lot of CPU/GPU on my computer, and slows down the readings from the sensors a lot, causing an enormous delay in the response.
I couldn't train the Dino because of that: it was skipping cactus readings and taking too long to respond to events (about 200 ms of delay).
The way I solved that, in order to record a time-lapse, was by recording with my external camera.
It might not be the perfect image, but it will still be good.
I will only start here to show how it works, then I'll switch to the camera.
As you can see, it generated 12 dumb genomes, and it is currently testing the first genome of generation one, activating the keys from the output of the neural network.
You can see the raw output value from the Network varying according to the inputs.
It is now testing genome 4 out of 12.
Now 5 out of 12...
While saving each one's fitness.
After an entire generation is run, it selects the best and does all that we talked about.
Now, let's see it in Time-lapse, and then, the ninja generation.
Many generations later...
