Welcome to the artificial intelligence course: a complete introduction.
Today's lesson will introduce you to one of the most recognized techniques in AI: the artificial neural network.
Honestly, I think many of you had heard about neural networks before joining this class.
But did you know that neural networks have been studied since the mid 20th century, and were not really successful until the 2000s?
In fact, thanks to the availability of data and the support of powerful computers, neural networks have come back to become the state of the art in machine learning.
And nowadays, they have been receiving more and more attention in studies, research, and many real-life applications.
Let's start with the first part of the lesson.
First of all, to give you some basic ideas about artificial neural networks,
let's look at their original version: the biological brain.
As a matter of fact, our brain may be the most important mechanism in our body.
It allows us to analyze what we sense and to make decisions.
For example, we can see everything visually because we have eyes, but actually the eyes are just a kind of sensor which receives optical information and transfers this information to the brain.
Then the brain, as a processor, analyzes and extracts useful information from this input to make appropriate decisions.
For example, when you are driving, you have to read traffic signs in order to control your vehicle correctly.
And sometimes the information needs to be processed in milliseconds.
So how can our brain do this?
Here is the answer.
Our brain is composed of billions of nerve cells called neurons.
Neurons process information by forming an extremely large connected network,
and this kind of network is called a biological neural network.
To give you an idea of how large this network is, imagine that on average a human brain has
around 86 billion neurons, and each of them may connect to up to 10,000 other neurons.
This makes approximately 1,000 trillion synaptic connections.
Can you believe that?
Now let's
look at the details of a neuron.
You can imagine that each neuron is like a processing unit with inputs, an output, and a processor. In particular,
information comes in through the neuron's dendrites, and the nucleus processes this information in terms of
an activation.
Finally, the axon is responsible for transmitting signals out of the neuron.
Precisely,
if you look at the image, here are the dendrites of the neuron, which receive signals from other
neurons or from other cells (from the eyes, for example).
And here is the nucleus, which is like the controller.
Next,
the body of the cell is called the soma;
it contains the nucleus and keeps it functional.
And the axon stays here.
It is considered the output of the neuron: signals are transmitted through the axon
this way to reach other neurons or cells, for example, this one.
So the two are connected together.
And this neuron, again, may be connected to other neurons, and so on throughout the complete nervous
system.
For example, this neuron may output signals to cells controlling the arm's actions, or this one may
control the legs.
So that is the idea of how biological brain works to process information.
Indeed, this is not a biology class,
so I can't give you much more detail about a neuron. You just need to remember that a biological
neural network is composed of very many connected neurons, and each neuron has some inputs, an activation
or controlling unit, and one output.
That's all.
Inspired by the biological neuron,
we would like to design the artificial neuron as follows.
First, we have the neuron in the middle. On one side,
we have the inputs, called x1, x2, ..., xN.
So we have N inputs for the neuron.
However, we want to add another input, x0, which is always equal to one. As you already learned from
the previous lessons about linear regression and logistic regression,
you know that this is called the bias unit.
OK.
Next,
We'd like to put a weight on each of these inputs: phi 0, phi 1, phi 2, ..., phi N.
So vector x, consisting of x0, x1, x2, ..., xN, is the input vector.
Similarly, vector phi is the weight vector consisting of weights or parameters to learn.
Then on the other side, we have the output of the neuron, y equals f(x).
It means that the output of the neuron is a function of x and parametrized by phi.
Finally, at the center of the neuron, we need an activation function which computes the output y based
on the input x and the parameters phi.
To illustrate the idea, let's recall the biological neuron.
You can imagine that the inputs x and the weights phi are similar to the dendrites of the neuron, whereas
the activation function is like the nucleus, and the output
y is like the axon of the neuron, OK?
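To make this design concrete, here is a minimal Python sketch of a single artificial neuron: it takes the input vector x (with the bias unit x0 = 1 already prepended), the weight vector phi, and an activation function g, and returns y = g(phi dot x). The names `neuron` and `activation` are just illustrative choices of mine, not something defined in the lesson.

```python
def neuron(x, phi, activation):
    """Output of a single artificial neuron: y = g(phi . x).

    x          -- input vector [x0, x1, ..., xN], with the bias unit x0 = 1
    phi        -- weight vector [phi0, phi1, ..., phiN]
    activation -- the activation function g, e.g. the logistic function
    """
    z = sum(p * xi for p, xi in zip(phi, x))  # dot product phi . x
    return activation(z)
```

For instance, with the identity function as the activation, `neuron([1, 2], [3, 4], lambda z: z)` simply returns the dot product 1*3 + 2*4 = 11.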
It's simple.
Usually, the activation function is the logistic function of phi dot x.
So we've got g(phi dot x) equals one over one plus e to the minus phi dot x,
where e is Euler's constant.
You have already seen this function in the lessons about logistic regression.
Indeed, the logistic function is often proposed because of its beautiful properties,
but there are also alternative choices, such as the rectifier function, which I really encourage you
to spend time reading about. In this course,
we will use the logistic function,
as it is very common.
Just to remind you what the logistic function is,
let's look at the graph of the function g(z) equals one over one plus e to the minus z.
It has a sigmoid curve, is bounded between zero and one, and reaches its midpoint of zero point five
when z equals zero.
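As a quick sketch (using only Python's standard `math` module), these properties can be checked numerically: g(0) is exactly 0.5, and g(z) approaches one and zero for large positive and negative z, respectively.

```python
import math

def g(z):
    """Logistic (sigmoid) function: g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

print(g(0.0))    # midpoint: 0.5
print(g(10.0))   # close to 1 for large positive z
print(g(-10.0))  # close to 0 for large negative z
```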
OK.
Now, I want to give you a simple example.
Imagine that we have a neuron with a logistic activation function g.
And this neuron has two actual inputs, x1 and x2; for example, x1 equals two and x2 equals five.
I say actual inputs because we also want to add a bias unit x0
to the input vector, like this.
And remember that x0 is always equal to one.
Next, we suppose that the input weights are some values like this: phi 0 = 2,
phi 1 = -1, phi 2 = 0.3.
So we have the input vector x consisting of one, two and five.
And the weight vector phi consisting of two minus one and zero point three, respectively.
Then the output of the neuron is f(x), which equals g(phi dot x).
OK.
Now let's try to compute this output f(x).
First, let z equal phi dot x, which is equal to phi 0 x0 plus phi 1 x1 plus phi 2 x2.
According to the example, this equals two times one plus minus one times two plus
zero point three times five, which equals one point five.
So if we plug this value of z into the equation of g,
we have g(phi dot x) equals one over one plus e to the minus z, which is approximately
0.82.
Therefore, we say that the output of this neuron, given such inputs, is 0.82.
OK.
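The whole calculation above can be reproduced in a few lines of Python. This is just a sketch with the example's values; the variable names are my own choices.

```python
import math

def g(z):
    """Logistic activation: g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

x   = [1, 2, 5]     # x0 = 1 (bias unit), x1 = 2, x2 = 5
phi = [2, -1, 0.3]  # phi0 = 2, phi1 = -1, phi2 = 0.3

z = sum(p * xi for p, xi in zip(phi, x))  # phi . x = 2*1 + (-1)*2 + 0.3*5
y = g(z)
print(round(z, 2))  # 1.5
print(round(y, 2))  # 0.82
```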
I hope that this very simple example gives you the idea of designing an artificial neuron.
Indeed, if we use the logistic function as the activation of the neuron, then the neuron is no different
from what we have learned in logistic regression,
right?
But the beauty of a neural network is that neurons are interconnected to make a powerful processing network.
In the next lesson,
I will show you different neural network architectures from simple to complex.
Thank you for listening.
