Hello everyone, welcome to The Semicolon. In this tutorial we are going to learn about neural networks and the backpropagation algorithm. In the last tutorial we learned about the perceptron. What you are seeing in front of you is a sigmoid perceptron, and the sigmoid equation is sigmoid(x) = 1 / (1 + e^(-x)). What it does is squash the weighted sum into the range 0 to 1, and that is how a sigmoid perceptron works. The backpropagation algorithm is what updates a neural network's weights, and in this tutorial we're going to apply a neural network to our MNIST dataset using the scikit-learn library.
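To make that concrete, here's a minimal sketch of the sigmoid in Python (the function name and test values are just for illustration):

```python
import numpy as np

def sigmoid(x):
    # squashes any real-valued input into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(0.0))   # 0.5, right in the middle
print(sigmoid(2.6))   # about 0.93 -- large sums get pushed toward 1, never past it
```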
Let's get started. We've seen the sigmoid perceptron; when we connect many sigmoid perceptrons together, we get an artificial neural network. Here you can see a lot of perceptrons wired together. Ignoring the rest for a moment, the unit fed by 1, x1, x2, x3, ..., xn is a perceptron in itself, and the unit fed by 1, a1, a2, ..., ak that produces f(x) is another perceptron. Similarly, a2 is a perceptron with all these inputs, and so is ak; every node is a perceptron. So neural networks are combinations of lots of perceptrons. You can also see that a neural network has layers: the first layer is called the input layer, whatever is in the middle is called a hidden layer, and the last one is called the output layer. Each node in a layer is connected to every node of the next layer, except the bias of course: x1 is connected to all the nodes a1, a2, and so on up to ak, and the same is the case for everything else.
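As a rough sketch of that wiring (all sizes and names here are made up for illustration), the connections from one layer to the next form a full weight matrix:

```python
import numpy as np

n_inputs, n_hidden = 4, 3                  # illustrative sizes: x1..x4 feeding a1..a3

# "connected to every node" means the input-to-hidden weights form a full
# (n_hidden x n_inputs) matrix, plus one bias weight per hidden node
W1 = np.random.randn(n_hidden, n_inputs)
b1 = np.random.randn(n_hidden)

x = np.array([0.2, 0.7, 0.1, 0.9])         # one input vector
a = 1.0 / (1.0 + np.exp(-(W1 @ x + b1)))   # each a_j is itself a sigmoid perceptron
print(a.shape)                             # (3,) -- one activation per hidden node
```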
So let's see how neural networks learn. In the perceptron we saw that whenever we get the output, we calculate the error from it and update the weights. The way this carries over to a neural network is: we find the error at the output, that error leads us to the errors at the hidden layer, and those errors are responsible for updating the weights on the connections into that layer. If there were a previous layer, we would do the same and calculate its error as well. Since the error is propagating backwards, this is called the backpropagation algorithm, and this is how neural networks learn.
Let's take an example to understand it a bit better. Say our inputs are like this. Each input gets multiplied by its weight, every single one, and the sum passes through the sigmoid function to give the value of a1. Say the raw sum is 2.6 or anything like that: sigmoid squashes it between 0 and 1, so we can't get 2.6 out; say we get 0.1 here, 0.3 for this one, and 0.5 for this one. After applying the same summation and sigmoid again at the output, say we get 0.5, but the actual result was supposed to be 0.4, so our error is 0.1. We take this 0.1 and update the weights here, then use those weights to work out the errors in the values behind them, and those errors let us update the earlier weights. That is how we train our neural network so that we get the least error possible.
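Here's that walk-through as a tiny numeric sketch (the weights are made-up numbers, chosen so the prediction lands near 0.5 against a target of 0.4):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x  = np.array([1.0, 0.5])            # made-up inputs
W1 = np.array([[ 0.4, -0.6],         # made-up input-to-hidden weights
               [ 0.3,  0.8],
               [ 0.9, -0.1]])
W2 = np.array([0.5, -0.3, -0.1])     # made-up hidden-to-output weights

a = sigmoid(W1 @ x)                  # hidden activations, each squashed into (0, 1)
output = sigmoid(W2 @ a)             # comes out near 0.5 with these numbers
error = 0.4 - output                 # target was 0.4, so the error is about 0.1 in size
print(a, output, error)
```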
So let's look at the gradient descent update rules behind backpropagation. For each output unit k we calculate delta_k = o_k * (1 - o_k) * (t_k - o_k): here (t_k - o_k) is the error and o_k * (1 - o_k) is the derivative of the sigmoid at the output. For hidden layers we cannot measure an error directly, because hidden layers have no target output; instead, if you recall that a hidden unit's output feeds the units after it, each hidden unit takes the deltas of the units it feeds, weighted by the connecting weights: delta_h = o_h * (1 - o_h) * sum_k(w_kh * delta_k). Every weight is then updated as w <- w + eta * delta * input. This is how we update the weights.
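Continuing the made-up numbers from the sketch above, those two delta rules and the weight update look like this (eta, the learning rate, is another made-up value):

```python
eta = 0.5                                             # made-up learning rate

# output unit: delta_k = o_k * (1 - o_k) * (t_k - o_k)
delta_out = output * (1.0 - output) * (0.4 - output)

# hidden unit h: delta_h = o_h * (1 - o_h) * sum_k(w_kh * delta_k)
delta_hidden = a * (1.0 - a) * (W2 * delta_out)

# every weight moves by eta * delta * its input
W2 += eta * delta_out * a
W1 += eta * np.outer(delta_hidden, x)
```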
Now let's see how we apply the neural network using the scikit-learn library. This is the library you have to import, and you may have to run conda update scikit-learn first, because MLPClassifier is only present in recent versions. I don't remember the exact version, but 0.18 or later should work, and it definitely works with the latest version. Also, when you bring in the latest version, the old cross-validation module gets deprecated, so you'll need sklearn.model_selection to do the train/test split. Otherwise this is the same code we used in the last tutorials, the regular train/test splitting.
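That setup looks roughly like this (a sketch: I'm using scikit-learn's small built-in digits set as a stand-in, since the tutorial's own MNIST loading code isn't shown here):

```python
from sklearn.datasets import load_digits               # stand-in for MNIST; swap in your own loading code
from sklearn.model_selection import train_test_split   # replaces the deprecated sklearn.cross_validation

digits = load_digits()
X, y = digits.data, digits.target                      # image features and digit labels

# the regular train/test split from the earlier tutorials
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)
```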
Instead of the perceptron, we would like to change this to the code for our neural network, so let's see how that looks. Let's rename it first: let nn be our neural network. What we write here is MLPClassifier, and we have to set its activation. The activation can be... I was about to say stochastic gradient descent, but I'm sorry, that's the solver; the activation can be sigmoid, which is called 'logistic' here, as well as 'relu' and 'tanh'. The best performance was given by logistic when I tested it, so I'll be using logistic. The solver can be gradient descent, 'sgd'. Then there's the value of alpha, which I leave at its default. Then the sizes of the hidden layers, hidden_layer_sizes: training time will vary with the sizes you give, and something huge like 45 and 90 would take a long time to train, so what I do is take 10 and 15 as the sizes. I'd guess that as the size increases the results get better, because there are more variables. And random_state is 1. The default initial learning rate is 0.001, and I think it's better to keep it there because we have such a small network. So let's execute this... there was an error here: it complains about 'hidden layer size'. Let's check; I guess this should be 'sizes', plural. And we're done, so let's fit it.
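Put together, the classifier looks like this (the parameter values are the ones from the walk-through; everything else stays at its default):

```python
from sklearn.neural_network import MLPClassifier

nn = MLPClassifier(activation='logistic',        # sigmoid; 'relu' and 'tanh' are the other options tried
                   solver='sgd',                 # stochastic gradient descent
                   hidden_layer_sizes=(10, 15),  # two small hidden layers -- note the plural 'sizes'
                   random_state=1)               # alpha and learning_rate_init stay at their defaults
nn.fit(X_train, y_train)
```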
Now, this fitting takes a lot of time, and how much depends on your processor; scikit-learn doesn't support GPUs, so it won't depend on your GPU. It depends on your RAM as well, since training takes a lot of memory, and on the amount of training data. So it takes quite a while to train.
Let's wait for it... and now our neural network is trained. As you can see in the printed parameters, the activation is logistic, alpha is 0.0001, the hidden layer sizes are what we set, and the learning rate schedule is 'constant', which can be changed. The initial learning rate is 0.001 by default, and that can be changed as well; depending on whether you increase or decrease it, you'll get results faster or slower. Now let's predict the values: pred equals the prediction on X... I'm sorry, X_test. And these are our actual values; it's the same thing we have been doing in the old tutorials. You can see the accuracy drops quite a bit compared to the perceptron, but the thing is, we are using very small hidden layers, which affects the accuracy a lot.
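The prediction and scoring step, same as in the older tutorials (accuracy_score from sklearn.metrics is assumed to be the scoring used):

```python
from sklearn.metrics import accuracy_score

pred = nn.predict(X_test)               # predicted digit labels for the held-out set
print(accuracy_score(y_test, pred))     # about 0.81 in the video with this small network
```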
Now, the accuracy is 81%, but that doesn't mean our neural network is bad, because as you increase the size, say to 100 or 150, things improve: the previous layer feeds in 180 values, and squeezing that down to 10 nodes is bad. If you do this it will take a bit more time to train, but then you get better accuracy, and at some point you can even surpass the accuracy of the random forest, which we got at somewhere around ninety-six percent.
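If you have the time (and the RAM), widening the network is a one-line change (a sketch, not a tuned configuration):

```python
bigger = MLPClassifier(activation='logistic', solver='sgd',
                       hidden_layer_sizes=(100, 150),  # much wider than (10, 15)
                       random_state=1)
bigger.fit(X_train, y_train)   # noticeably slower to train, but typically more accurate
```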
So that's it for our neural network, guys. Hit the like button if you found this helpful, subscribe, and share it with your friends. Thank you!
