 
 
 
Hello everyone! Welcome back to Neural Network Lectures. In the previous
lecture we learned about Hebbian Learning, and I also
introduced you to Competitive Learning, right?
In this lecture, we'll be continuing the discussion on Competitive Learning,
and we will also learn about Boltzmann Learning.
So, for those who haven't seen the introduction to Competitive
Learning in the previous video, I strongly recommend
watching that before coming here, so that it'll be easy for you
to grasp the ideas being discussed here.
Having said that, let's start our lecture.
In Competitive Learning, as the name implies,
the output neurons compete among themselves to be
active, i.e., at any point of time
only one of these output neurons will be active.
This is in contrast to the case of Hebbian learning, where multiple
output neurons may be active at the same time. So
this is a very distinguishing feature of Competitive Learning.
Always remember that. I discussed this
in detail in the last lecture, so if you have any doubts,
please refer to that, okay?
Now let's move on to the basic elements of
Competitive Learning. There are mainly three
basic elements to Competitive Learning. The first one is:
A set of neurons that are all the same except for some randomly
distributed synaptic weights, and which therefore respond differently
to a given set of input patterns.
What it means is that all the
neurons are structurally similar. The only
difference among them is that
the synaptic weights at the beginning, i.e., the initial
synaptic weights of these neurons, will be different. They will be
randomly distributed. Now the second point is:
A limit imposed on the strength of each neuron.
What it means is that some constraint
is put on the synaptic weights of each neuron.
For example, it can be like this:
Σj wkj = 1 for all k
or it can be like this:
Σj wkj^2 = 1 for all k
These constraints put on the synaptic weights
depend on the environment as well as on the
designer of the neural network.
Now coming to the third point:
A mechanism that permits the neurons to compete for the right to respond
to a given subset of inputs, such that only one output neuron is active at a time.
It simply means that only one of the output neurons
will be active at a time, and
to be the active one, the neurons
compete among themselves.
These are the three simple criteria of Competitive Learning.
Now we will talk about the network architecture of
Competitive Learning. In Competitive Learning,
all the input neurons are connected to all the output neurons,
and there are also connections between the
output neurons. One important thing to note is that
all the connections between the inputs and the outputs, that is,
these connections,
are excitatory in nature, while
the connections between the output neurons, that is,
these connections, are inhibitory
in nature.
Why?
Because the output neurons are competing among themselves, right? For that,
each output neuron should try to suppress the outputs
of the other neurons. That is the reason why the connections between the
output neurons are inhibitory in nature.
These connections are called lateral connections.
Now how will we choose the
winning neuron?
The winning neuron is the output neuron whose induced local field
vk = Σj wkj xj
is the largest for the given input pattern x.
I hope that you remember this.
 
 
 
 
 
Mathematically we can write it as:
yk = 1 if vk > vj for all j, j ≠ k
(where vk is the induced local field of output neuron k)
yk = 0 otherwise
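As a quick sketch of this winner-take-all selection, not from the lecture itself but just an illustration, it might look like this in NumPy:

```python
import numpy as np

def pick_winner(W, x):
    """Return the index of the winning output neuron and the output vector.

    W : (num_outputs, num_inputs) synaptic weight matrix
    x : (num_inputs,) input pattern

    The neuron with the largest induced local field v_k = sum_j w_kj * x_j
    wins; its output y_k is set to 1 and all other outputs to 0.
    """
    v = W @ x                  # induced local fields, one per output neuron
    k = int(np.argmax(v))      # winner-take-all competition
    y = np.zeros_like(v)
    y[k] = 1.0
    return k, y
```

The function name and shapes here are my own choices, not the lecture's notation.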
Now suppose that k is the winning neuron,
i.e., yk = 1. All the
synaptic weights connected to neuron k will be updated,
subject to some constraint like I told you before,
i.e.:
Σj wkj = 1 for all k
However, the important thing to note is that
the synaptic weights connected to the other neurons,
i.e., the loser neurons, or the neurons
that didn't win the competition, are not updated.
Let me emphasize that, i.e.,
the synaptic weights of the other
output neurons are not updated.
This is how the neural network learns,
and the change in the synaptic weights
for the winning neuron is given by:
Δwkj = η (xj - wkj)   if neuron k wins the competition

where η is the learning-rate parameter.
In all other cases it is zero, i.e.,
when neuron k loses the
competition:

Δwkj = 0   if neuron k loses the competition
and this rule for updating the synaptic weights is called the
Competitive Learning Rule.
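A minimal sketch of one step of this update rule, assuming NumPy and a made-up learning rate eta:

```python
import numpy as np

def competitive_update(W, x, eta=0.1):
    """Apply one step of the competitive learning rule to weight matrix W.

    Only the winning neuron k (the one with the largest induced local
    field) moves its weights:
        delta_w_kj = eta * (x_j - w_kj)
    The losing neurons' weights stay exactly as they were.
    """
    W = W.copy()
    k = int(np.argmax(W @ x))      # competition: winner has largest W @ x
    W[k] += eta * (x - W[k])       # move the winner's weights toward x
    return W, k
```

Returning a copy rather than mutating in place is just a design choice to keep the example easy to test.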
This rule has the overall
effect of moving the synaptic weight vector Wk
of the winning neuron k towards the input pattern x.
What I mean is,
suppose we have an input vector X
which is given by:
X = [x1, x2, ..., xn]^T
and we have a synaptic weight vector Wk
given by:
Wk = [wk1, wk2, ..., wkn]^T
Over time, as the learning progresses, this
weight vector Wk will approach
the input vector X.
That is how the output neurons become feature
detectors. From this point, that the synaptic
weight vector approaches the input vector,
we can infer that the input vector is also
under some constraint. For example, let's say that
the weight vector Wk is
under the constraint that its components sum to 1.
Then there is a possibility that the input vector
is also under the same constraint.
To make this point clearer, let me take a two-dimensional vector,
i.e.:
X = [x1, x2]^T

Wk = [wk1, wk2]^T
Now, let the synaptic weights be under the constraint:
wk1^2 + wk2^2 = 1
So the synaptic weight vector will move around a unit circle
in the 2D plane. Let me draw that.
Now let these dots represent
the natural grouping of input patterns.
 
Let the initial state of the synaptic
weight vector be like this.
Let these crosses represent the
synaptic weight vectors before the training.
As the training progresses, the synaptic
weight vectors will move towards
their natural groupings.
So the final state of this
cross will be over here,
this will be over here and this will be over
here. So it can be said that
each output neuron
has discovered a cluster of input patterns
by moving its synaptic weight vector to the
center of gravity of the discovered cluster.
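The whole process above, competition plus updates that pull each weight vector to the center of gravity of a cluster, can be sketched end to end. The two clusters, the learning rate, and the seeding of the initial weights at sample patterns (a common practical trick to avoid neurons that never win, not something covered in the lecture) are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two artificial clusters of 2-D input patterns (made-up data)
cluster_a = rng.normal([1.0, 0.0], 0.05, size=(100, 2))
cluster_b = rng.normal([0.0, 1.0], 0.05, size=(100, 2))
patterns = np.vstack([cluster_a, cluster_b])

# Two output neurons; seed each weight vector at one sample pattern
W = np.vstack([patterns[0], patterns[100]]).copy()

eta = 0.05
for _ in range(50):                        # training epochs
    for x in rng.permutation(patterns):
        k = int(np.argmax(W @ x))          # competition
        W[k] += eta * (x - W[k])           # winner moves toward x

# Each weight vector now sits near the center of gravity of one cluster
```

After training, W[0] ends up near [1, 0] and W[1] near [0, 1], the centers of the two artificial clusters.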
Now if we try to visualize this
in three dimensions,
let the clusterings of inputs be like this,
like this,
and like this. The initial states of
the synaptic weight vectors may be like this, like
this, or like this, and over time, due to learning,
these synaptic weight vectors
will each approach
a clustering of inputs.
So the final state after learning will be that the synaptic
weight vectors sit at the centers of these
groupings of inputs. Thus, for
example, this particular output neuron has specialized
in detecting this
group of inputs. Similarly, this one
has specialized in detecting this group of inputs,
and this one in detecting this kind of inputs.
Now consider the case where there are two
groupings of inputs
that are very close to each other and overlap.
Now suppose that a synaptic weight vector
has approached this first cluster.
Let it be over here. However, due to this
overlapping of the inputs,
the synaptic weight vector won't be stable.
It will begin to oscillate between these two clusterings,
that is, these two input clusterings, making the system
unstable. This is a shortcoming of Competitive Learning.
So keep that in mind.
Now let me try to show you how a neural network
learns to play a game.
Let's take the classic example of Mario.
So, the input vector to Mario may be
the pixels in front of Mario. For example,
let the input vector be the pixels here,
and Mario has only three
outputs, i.e.,
moving forward, moving backward,
and jumping up. I hope that you all remember the
moves in Mario, i.e., it can only move forward, move backward, or
jump up. Therefore, three output neurons are given, one for
each of these functions. Now, the neural network
may not be provided with the rules of Mario. However,
for each game played, the neural network will be provided
with a score based on its performance.
So the game starts with random weights, and
in the initial few games, the neural network may not even move
Mario; Mario will simply stand there. But
over time, the neural network will learn that
its performance score will increase if
Mario moves forward.
So the synaptic weights will be adjusted in such a way that
if there is an input vector consisting of this kind of pixels,
i.e., a set of input pixels with
no obstacles in front of Mario,
the neural network will move Mario forward.
By the way, keep in mind that, as Mario moves forward,
its input vectors keep changing. For
example, in this place these are the input pixels,
these are the input pixels over here, this is the input
pixel over here,
and this is the input pixel over here. Note that
as Mario reaches over here, the input pixels,
or the input vector, will show an obstacle.
Initially, the neural network may not know that it should jump over
the obstacle, or jump over the mushroom.
However, with many training iterations,
it will learn that if Mario jumps over the
mushroom, the performance score of the neural network will increase.
So,
the neural network will learn that if this particular input
pattern comes, it should jump.
Similarly, there might come an instance when Mario should move
backward.
Likewise, over many generations of training,
the number of times the neural network plays to learn the game will be
on the order of tens of thousands, or even millions.
Eventually, it will learn what to do
when it confronts a particular input pattern.
This is how it will learn,
and thus, finally, the neural network will learn to
win the game.
This is a very crude example, and I hope that you all understood it.
If not, there are many videos on YouTube which show
how neural networks learn to play a game. Please try to watch those,
so that you can see how Competitive Learning is applied in the
real world.
Having said that much, let's move on to the next learning rule, that is,
Boltzmann Learning. The Boltzmann Learning rule is a stochastic
learning algorithm derived from statistical
mechanics. Let me write that down first.
Now, it is stochastic in the sense that it is
non-deterministic.
The second point to note is that
the neurons in Boltzmann Learning constitute
a recurrent structure.
What I mean by this statement is that there are feedbacks
between the interconnected neurons.
Now the third point to note is that the neurons
in a Boltzmann Learning network operate in binary mode.
That means they take either the +1 or the
-1 state.
Now we can characterize the Boltzmann Machine
by an energy function given by:
E = -(1/2) Σ Σ wkj xk xj
and it is iterated over k and
j. I hope that you all remember the
definitions, i.e., xj is the state of
neuron j, xk is the state of neuron k,
and wkj is the synaptic weight from
neuron j to neuron k.
Also, this equation is
restricted by the constraint that j is not equal to k.
What it means is that
there are no
self-feedbacks.
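As a small sketch, this energy function with the j ≠ k constraint can be computed like this, assuming a symmetric weight matrix with a zero diagonal:

```python
import numpy as np

def energy(W, x):
    """Energy of the network in state x:
        E = -1/2 * sum over all j, k with j != k of w_kj * x_k * x_j

    W is assumed symmetric with a zero diagonal (no self-feedback),
    so the quadratic form x @ W @ x covers exactly the j != k terms.
    """
    return -0.5 * float(x @ W @ x)
```

For example, with two neurons coupled by a positive weight, the aligned state [+1, +1] has lower energy than the opposed state [+1, -1].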
Now the question arises: in what way
is Boltzmann Learning stochastic?
For example, let's take a Boltzmann network.
These are the output neurons, and all of
these neurons are interconnected,
like this.
 
Now we can take the outputs
from any of these neurons;
or rather, we can say that
the output pattern is defined as a
vector of the states of these neurons.
But in a more generalized Boltzmann network,
there may be some other neurons also
in the network.
These neurons don't take part in producing the output; however,
they may be interconnected with the other
neurons.
Since
these neurons don't take part in the output process,
we can say that they form a hidden layer, right?
In the sense that we don't specify their states in the output.
But all these neurons still take part in the
network's computation.
Basically, we can divide the set of neurons into two:
the ones that take part in the output
and the ones that don't.
The ones that take part in the output, let me
mark them,
are these, right?
 
We can call these neurons
visible neurons,
and the ones that don't take part in the output,
i.e., these ones, can be called
hidden neurons.
So what the Boltzmann Learning rule basically says is that
the energy function for a particular state of the
network is given by
this equation.
This is an important equation.
Now suppose that some
of these neurons are at +1, i.e., the states of
these neurons are like this.
 
Now suppose
I take this one particular neuron and flip its
output state, i.e., I change its output state
(say from -1 to +1). What does that mean? It means that
the energy state of the network changes, i.e., this
energy function changes. Now,
what it means in the stochastic sense is this:
what is the probability that the state of the picked
neuron will be flipped?
I.e., we find the probability P that
a neuron k
changes its state from
xk to -xk,
and this is given by the expression:
P(xk → -xk) = 1 / (1 + exp(-ΔEk / T))

where ΔEk
is the change in the energy function as the neuron
flips from +xk to
-xk,
and T is the
pseudo-temperature.
It basically represents the noise in the system.
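A tiny sketch of this flip probability, following the sign convention of the expression above (the function name and arguments are my own):

```python
import math

def flip_probability(delta_e, temperature):
    """Probability that neuron k flips its state from x_k to -x_k:
        P = 1 / (1 + exp(-delta_e / T))

    delta_e     : change in the energy function caused by the flip
    temperature : pseudo-temperature T, representing the noise level
    """
    return 1.0 / (1.0 + math.exp(-delta_e / temperature))
```

Note how the pseudo-temperature acts as noise: at a very high T, the probability approaches 1/2 regardless of the energy change, i.e., the neuron flips almost at random.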
So in Boltzmann Learning, what the network tries to
achieve is that, over time, the network will
reach thermal equilibrium, i.e., the network will reach a
steady state.
Another thing about Boltzmann Learning: we have already
classified the neurons into
visible and
hidden neurons. Now a question
that might arise is: are we able to change,
or flip, the states of all the
visible neurons? Perhaps
we can, perhaps we cannot. The case where we
can't change, or can't flip, the states is
when we put some constraints on the states of the visible neurons,
i.e., when we clamp them.
Based on that, we can again classify the operation of the network
into two conditions, that is,
clamped
and
free running.
So in the clamped condition,
the visible neurons are clamped into specific
states determined by the environment,
while in the free-running condition, all the
neurons are allowed to operate freely.
Now when it comes to the
hidden neurons, the question of clamping doesn't arise.
Why? Because the hidden neurons don't take part in the
output, right?
Therefore they are always in the free-running condition.
Now we need to specify
how the weights are updated in Boltzmann Learning.
Before going into that, I need all of you to
get familiar with these notations, i.e.:
ρ+kj : the correlation between the states of neurons j and k,
with the network in its clamped condition

ρ-kj : the correlation between the states of neurons j and k,
with the network in its free-running condition

Both ρ+kj and ρ-kj
can take values ranging between -1 and
+1, since they are correlations.
According to Boltzmann Learning rule:
Δwkj = η (ρ+kj - ρ-kj),   j ≠ k

where η is the learning-rate parameter,
and this equation
is called the
Boltzmann Learning Rule.
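A minimal sketch of this weight update, with the ρ+ and ρ- correlations supplied as precomputed matrices (how they are estimated is beyond this introductory lecture):

```python
import numpy as np

def boltzmann_update(rho_plus, rho_minus, eta=0.1):
    """Boltzmann learning rule:
        delta_w_kj = eta * (rho_plus_kj - rho_minus_kj), for j != k

    rho_plus  : correlations measured in the clamped condition
    rho_minus : correlations measured in the free-running condition
    The diagonal is zeroed because there are no self-feedbacks (j != k).
    """
    dW = eta * (rho_plus - rho_minus)
    np.fill_diagonal(dW, 0.0)
    return dW
```

So a weight grows where the clamped correlation exceeds the free-running one, and shrinks in the opposite case.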
Since this is an introductory course,
I am not going much into the details of Boltzmann Learning.
Now, in the next lecture we'll be discussing
how to implement logic gates using the
McCulloch-Pitts model and the Hebbian Learning mechanism.
So if you understood this lecture, please like the video,
and also please subscribe to the channel.
If you do have any doubts, please ask in the comment section.
Thanks for watching, and have a nice day. :)
