Welcome back. Let's continue our brief
example of backpropagation, and how a
neural network can learn. In the last
video, we had the same perceptron we've
been using. It has two neurons in the
input layer, for the features has
whiskers and is a good boy; cat has
the values 1 0 for these features, and dog has
the values 0 1. You
take the features, and you multiply them
by the weight of the connection between
the input layer and the output layer. So
the weight that connects the has
whiskers neuron to the output is 0.523,
and the weight that connects is a good
boy to the output is 0.342. You add up
the results of these multiplications, so
for cat it's 1 multiplied by 0.523, plus
0 multiplied by 0.342, and you
get 0.523. You
pass it through an activation function
that rounds to the nearest integer,
you get a 1, and then this is the output
that you wanted. You have a structure
that converts an input 1 0, to the output
1, and it also converts the input 0 1 for dog
to the output 0, but it took us two
epochs to get there: first we made a mistake, and
then we used backpropagation for the
network to learn what the correct
weights should be. So in this example,
we're going to go through multiple
epochs of training. We are going to feed
the values of the input forward to do
forward propagation, we're gonna measure
the deltas, or the amount of error that
we get, we back propagate that error to
recalculate our weights, and then we're
gonna do this several times
until we get weights that actually work
for what we want. Your Canvas website has
a PDF that you'll need for this exercise.
It's called backpropexample. Please pause the video,
download that PDF, and keep it open,
because you're going to need it for your
part of the exercise. I'll give you a few
seconds. Okay so
at this point you should be familiar
with how we would go from the input layer to the
output layer. So what is the input cat
giving to the output layer, and also what
is the input dog giving to the output
layer? As a final reminder, it's the value
of each neuron multiplied by the weight
and then the summation of those. So give
it a try and try to calculate what's the
value that's gonna go from the input
layer to the output layer for cat and
for dog using those randomly generated
weights. Pause the video.
These are the values for cat. It will
be 1 multiplied by 0.36, plus the feature
is a good boy which is 0, multiplied by
0.63 equals 0.36.
For dog, the value of the feature has
whiskers is 0, and the feature is a good
boy is 1, so it's 0 multiplied by 0.36
plus 1 multiplied by 0.63 equals 0.63.
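If you'd like to check that arithmetic with code, here is a minimal Python sketch of the forward pass. The function name forward and the list layout are my own; the weights 0.36 and 0.63 and the feature encodings come from the example.

```python
# Forward pass: each feature times its connecting weight, summed.
# Starting weights from the example: 0.36 (has whiskers -> output)
# and 0.63 (is a good boy -> output).
def forward(features, weights):
    return sum(f * w for f, w in zip(features, weights))

weights = [0.36, 0.63]
cat = [1, 0]  # has whiskers = 1, is a good boy = 0
dog = [0, 1]  # has whiskers = 0, is a good boy = 1

print(forward(cat, weights))  # cat sends 0.36 to the output layer
print(forward(dog, weights))  # dog sends 0.63 to the output layer
```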
So this is what gets transferred from
one layer to the next. That's the value
that you got, and then you have our
activation function, which rounds to the
nearest integer. What is going to be the
activation for cat, and for dog, given
the values that we have? Please pause the
video and perform the operation.
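If you want to check your answer in code, here is a small sketch of the rounding activation. I've written it as a 0.5 threshold rather than using Python's built-in round, because round uses round-half-to-even (an input of exactly 0.5 would round to 0, not 1); for the values in this example, both behave the same.

```python
def activation(x):
    # "Round to the nearest integer", written as an explicit 0.5
    # threshold so the behavior at exactly 0.5 is unambiguous.
    return 1 if x >= 0.5 else 0

print(activation(0.36))  # 0: cat's value rounds down
print(activation(0.63))  # 1: dog's value rounds up
```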
In fact, cat is gonna give you an
activation of zero, because 0.36 is less
than 0.5, so 0 is the nearest integer, and
dog is gonna give you an activation of
1, because the value that goes
from the input layer to the output layer
is 0.63, and when you pass 0.63 to the activation
function, you get a 1. So as you can see,
this is not the output that we want. Both
of them failed. When we have cat, the
value that goes into the output layer is
0.36, and then when we pass it through
the activation function it becomes a 0,
but we wanted a 1 for cat.
Likewise, for dog, the value that we get
when we add these two is 0.63, and
when we pass 0.63 through the activation
function, we get a 1, when what we wanted was a 0. So neither of them worked. So
what are we going to do? We're going to
calculate the error, calculate the deltas,
and then back propagate so that we can
readjust the weights. So please take a
moment to write these down: the
output for cat and dog is 0.36 and 0.63.
Write them on a piece of paper, and then
with that please calculate the deltas.
We're going to use a very simple
function, it's the result that we got
before the activation function, minus the output
that we actually wanted. Please pause
the video and calculate the deltas on a
piece of paper, and we'll verify it when
you come back.
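The delta rule just described, the pre-activation value minus the desired output, can be sketched in Python like this (the helper name delta is mine):

```python
def delta(pre_activation, target):
    # Error signal: what the output layer received, minus the
    # output we actually wanted.
    return pre_activation - target

print(delta(0.36, 1))  # cat: about -0.64
print(delta(0.63, 0))  # dog: 0.63
```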
Welcome back. The delta for cat is minus 0.64,
because what the output layer received is
0.36 (again, that's 1 multiplied by this
weight, plus 0 multiplied by this one),
and the output that we wanted was a 1, so
0.36 minus 1 is minus 0.64. That'll be the
delta for cat. For dog, it will be what the
output layer receives, 0.63 (again, 0
multiplied by 0.36, plus 1 multiplied by
0.63), minus 0, which is the output that we
want, so it's obviously 0.63. So these are
the two deltas. Now please make sure you
have these on the piece of paper that
you're working on: the weights, the values
that resulted from multiplying the
features by the weights (that is, what
arrives at the output layer), and the deltas.
And with that, please try to calculate
the new weights. The new weights come from
the old weights; for example, for the
weight that connects the first neuron to
the output layer, the new weight will be
the old weight, the one that we had, minus
a learning rate (here 0.1) multiplied by
the delta for cat, multiplied by the input
for cat in the first neuron, minus 0.1,
the learning rate again, multiplied by the
delta for dog, multiplied by the
input for dog in the first neuron. So
please take a moment to go through that
arithmetic and pause the video.
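That update rule can be sketched like this; update_weight is a hypothetical helper name of mine, but the learning rate 0.1 and the deltas come from the example.

```python
LEARNING_RATE = 0.1  # the learning rate used in the example

def update_weight(old_weight, terms):
    # terms: one (delta, input_value) pair per training example.
    # new weight = old weight - lr * delta_cat * input_cat
    #                         - lr * delta_dog * input_dog
    for d, x in terms:
        old_weight -= LEARNING_RATE * d * x
    return old_weight

# First weight (has whiskers -> output): cat's input is 1, dog's is 0.
new_w1 = update_weight(0.36, [(-0.64, 1), (0.63, 0)])
print(new_w1)  # about 0.424
```

The cat term adds 0.064 to the weight; the dog term contributes nothing because dog's input on this connection is zero.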
It should be something like this: the
old weight was 0.36, the learning rate 0.1,
the delta for cat minus 0.64, and the
input for cat in the first neuron 1; then
the learning rate 0.1 again, the delta for
dog 0.63, and the input for dog in the
first neuron zero. So the first term
increases the weight a little bit, and the
second has no participation. That increase,
0.36 plus 0.064, is 0.424, which is
going to be our new weight for the
connection between this neuron in the
input layer and the output layer. Now it's
your turn: we haven't calculated the weight
for the connection of the second neuron.
This is the weight that connects
this neuron in the input layer, the one
that receives is a good boy, to the
output layer.
So what we're updating is this weight,
and we're gonna need some values here.
Please take a minute to do the
arithmetic.
So we have the old weight, 0.63, minus the
learning rate 0.1 multiplied by the delta
for cat, minus 0.64, multiplied by the
input for cat in the second neuron, zero,
minus the learning rate 0.1 multiplied by
the delta for dog, 0.63, multiplied by the
input for dog in the second neuron, one.
When we do this, the weight for this
connection is going to be 0.567. So
our new weights are 0.424 and 0.567. We
have performed one epoch of training, and
we are now going to enter epoch two. Let's
continue. Let's see if the network works
or not. So again, take a minute to
calculate, with the input for cat, the
value that's going to arrive here at
the threshold of the output layer. Please
refer to the PDF; the formulas are there
too, but take this moment to familiarize
yourself with them. Please calculate the
input for cat and for dog.
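To check your epoch-two arithmetic, here is the same weighted-sum sketch as before, now with the weights we got after epoch one:

```python
def forward(features, weights):
    # Same forward pass as before: feature times weight, summed.
    return sum(f * w for f, w in zip(features, weights))

weights = [0.424, 0.567]  # updated weights after epoch one
print(forward([1, 0], weights))  # cat: 0.424
print(forward([0, 1], weights))  # dog: 0.567
```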
As you can see, it's something like this.
For cat, the value of the feature has
whiskers, one, multiplied by this weight,
0.424, plus the feature is a good boy,
zero, multiplied by 0.567, is equal to
0.424. For dog, 0 multiplied by 0.424,
plus 1 multiplied by 0.567, is equal to
0.567. As you can see, when we pass that
through the activation function, this is
still being rounded to zero, and this is
still being rounded to one. And our neural
network is still off, because this is
what we got: at the threshold of the
output layer, 0.424 and 0.567, which
round to zero and one, but the output
that we wanted was one for cat and zero
for dog.
So both are still failing. We've moved
forward one epoch, and we are getting
closer,
but it's still not there. So let's back
propagate again. Please calculate the
deltas. Here you have the weights and the
outputs. Please pause the video.
So you can see the delta for cat is minus
0.576 and the delta for dog is 0.567.
Right now, please write down on a piece of
paper, or read in the PDF, the values for
the results at the threshold of the
output layer, the values of the output we
want, and the deltas, so that you can
calculate this.
Please calculate the updated value for
the first weight. Let's update the weight
that connects the first neuron with the
output layer. This is the equation. Please
go ahead. Pause the video.
The value should be something like this:
0.4816, as you can see here. How about
the weight for the second neuron? Please
take a moment to calculate it. Pause the
video.
This is the value for the connection of
the second neuron, this weight right here.
As you can see, the weight gets updated
to 0.5103, so we're getting close. We
have finished epoch two, and now we have
these updated weights: the weight that
connects the first neuron to the output
is 0.4816, and the weight that connects
the second neuron to the output is 0.5103.
You have all these results in the PDF.
Now I want you to try the whole
operation yourself before you move ahead
in the video. You need to do the forward
propagation, run those results through
the activation function, compare the
output that you got with the output that
you wanted, calculate the error or the
deltas, back propagate to recalculate the
weights, and at this point you will have
reached the end of one epoch,
and then you need to forward propagate
again to see if now your results are
correct. Please pause the video and come
back to see the results.
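The whole exercise can also be bundled into one small training loop. This is my own sketch, not code from the course: it assumes the same starting weights (0.36 and 0.63), the 0.1 learning rate, and the round-to-nearest activation, and it repeats forward propagation, delta calculation, and weight updates until both outputs come out correct.

```python
def activation(x):
    return 1 if x >= 0.5 else 0  # round to the nearest integer

def forward(features, weights):
    return sum(f * w for f, w in zip(features, weights))

# (features, target): cat = [has whiskers, is a good boy] -> 1, dog -> 0
examples = [([1, 0], 1), ([0, 1], 0)]
weights = [0.36, 0.63]
lr = 0.1

for epoch in range(1, 11):
    # Forward propagate both examples.
    pre = [forward(x, weights) for x, _ in examples]
    # Stop once the rounded outputs match the targets.
    if all(activation(p) == t for p, (_, t) in zip(pre, examples)):
        print(f"correct outputs at epoch {epoch}")
        break
    # Deltas: pre-activation value minus the desired output.
    deltas = [p - t for p, (_, t) in zip(pre, examples)]
    # Update each weight using every example's delta and input.
    for i in range(len(weights)):
        for (x, _), d in zip(examples, deltas):
            weights[i] -= lr * d * x[i]

print(weights)  # roughly [0.533, 0.459] after three rounds of updates
```

Matching the video, three rounds of weight updates happen before the fourth forward pass finally gives 1 for cat and 0 for dog.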
Let's see. So in epoch three, we have for
cat the value of the feature has
whiskers, one, multiplied by the weight
that links that neuron to the next layer,
0.4816, plus the value of the feature is
a good boy, zero for cat, multiplied by
the weight that links that neuron to the
output layer, 0.5103, which is 0.4816.
For dog, it's zero multiplied by the
first weight, plus 1 multiplied by the
second weight, which equals 0.5103. When
we run these through the activation
function, the first rounds down to zero
and the second rounds up to one, which
is still not what we want, but we're
getting closer. We're getting so close.
Let's calculate the deltas. 0.4816 was
what we got at the threshold of the
output layer; what we wanted was a 1, so
the difference between them is minus
0.5184. With dog, what we got was 0.5103;
we wanted a zero, so the difference is
0.5103. The update of the first weight
would be the value of the old weight,
0.4816, minus the learning rate, 0.1,
multiplied by the delta for cat, minus
0.5184, multiplied by the input for cat
in the first neuron, one, minus 0.1 for
the learning rate, multiplied by the
delta for dog, 0.5103, multiplied by the
input for dog in the first neuron, zero.
So the old weight, plus a little bit from
the cat term and nothing from the dog
term, is 0.53344. This is the new value
of the first weight, the weight that
connects the has whiskers neuron to the
output layer. Likewise,
when you do the arithmetic for the
second weight, the one that connects the
second neuron to the output, you get 0.45927.
So the values have finally
flipped. Hopefully this will make it so
that things work. And they do! When you go
to the fourth epoch of forward propagation,
for cat you have 1 multiplied by 0.53344,
plus 0 multiplied by 0.45927, which is
0.53344. When you round this, it goes up
to one, and that gets us the output that
we wanted. For dog, zero multiplied by
the first weight, plus 1 multiplied by
the second weight, 0.45927, rounds down
to zero. And this is
what we wanted. So after three epochs of
training, the network has converged to
the output that we wanted. It started by
giving us results with errors, but we
have finally made the weights converge
so that the structure is giving us the
output that we want: it's transforming
1 0 into 1, and 0 1 into 0.
In summary, look at what we did. We had
some objects from the world, a cat and a
dog, and we transformed those into a
system of features. We used two silly
features, has whiskers and is a good boy,
and then we determined the value of each
feature for each of our items: for cats,
has whiskers is 1 and is a good boy is 0
(cats are good boys, of course, but it's
for the example!); for dogs, has whiskers
is 0 and is a good boy is 1. We also have
labels: 1 represents a cat, and 0
represents a dog.
We have this system of features as our
input and then we run it through this
perceptron that has an input layer, it
has weights connecting the layers, it has
an output layer where you also run the
results through an activation function
that rounds them to 1 and 0, precisely
because the values that we want are
either 1 or 0. We went through several
epochs of training, so we started with
the weights 0.36 and 0.63, and after one,
two, and three epochs of training, our
weights changed so that our output was
correct. This is a simple, single-layer
perceptron with backpropagation.
Single-layer perceptrons were invented in
the 1940s and 50s, backpropagation was
popularized in the 1980s, and together
they form the backbone of neural
networks. This is
the most basic type of neural network
that we can have. It takes some set of
features as input, performs some
multiplications, and then gives you an
output to classify something, or to
perform other operations, as we'll see.
They're incredibly useful but also really
strange. What does the structure really
know about cats and dogs? It's just a
bunch of numbers and multiplications, if
you think about it. It's a strange trick:
sometimes it works, but when it doesn't,
it's really difficult to figure out
what's going on. All of these are going
to be the matter of this week's topic:
neural networks, deep learning, and
natural language processing.
