The following content is
provided under a Creative
Commons license.
Your support will help MIT
OpenCourseWare continue to offer
high quality educational
resources for free.
To make a donation or to view
additional materials from
hundreds of MIT courses,
visit MIT OpenCourseWare at
ocw.mit.edu.
OK, so remember last
time,
on Tuesday we learned about the
chain rule,
and so for example we saw that
if we have a function that
depends,
sorry, on three variables,
x,y,z,
that x,y,z themselves depend on
some variable,
t,
then you can find a formula for
dw/dt by writing down w_x dx/dt +
w_y dy/dt + w_z dz/dt.
And, the meaning of that
formula is that, well, the change
in w is caused by changes in x,
y, and z, and x,
y, and z change at rates dx/dt,
dy/dt, dz/dt.
And, this causes the function to
change accordingly using,
well, the partial derivatives
tell you how sensitive w is to
changes in each variable.
OK, so, we are going to just
rewrite this in a new notation.
So, I'm going to rewrite this
in a more concise form as
gradient of w dot product with
velocity vector dr/dt.
So, the gradient of w is a
vector formed by putting
together all of the partial
derivatives.
OK, so it's the vector whose
components are the partials.
And, of course,
it's a vector that depends on
x, y, and z, right?
These guys depend on x, y, z.
So, it's actually one vector
for each point,
x, y, z.
You can talk about the gradient
of w at some point,
x, y, z.
So, at each point,
it gives you a vector.
That actually is what we will
call later a vector field.
We'll get back to that later.
And, dr/dt is just the velocity
vector dx/dt,
dy/dt, dz/dt.
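As a quick numerical sanity check of this rewriting, here is a minimal Python sketch. The function w = x^2 y + z and the path x = cos t, y = sin t, z = t are made-up examples, not taken from the lecture. The sketch only compares gradient dot velocity with a finite-difference estimate of dw/dt.

```python
import math

# Hypothetical example (not from the lecture): w = x^2 * y + z
def w(x, y, z):
    return x * x * y + z

def grad_w(x, y, z):
    # Partial derivatives (w_x, w_y, w_z), computed by hand
    return (2 * x * y, x * x, 1.0)

# Hypothetical path r(t) = (cos t, sin t, t) and its velocity dr/dt
def r(t):
    return (math.cos(t), math.sin(t), t)

def velocity(t):
    return (-math.sin(t), math.cos(t), 1.0)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

t0 = 0.7
chain_rule = dot(grad_w(*r(t0)), velocity(t0))

# Central finite difference of w(r(t)), an independent estimate of dw/dt
h = 1e-6
finite_diff = (w(*r(t0 + h)) - w(*r(t0 - h))) / (2 * h)

print(chain_rule, finite_diff)  # the two values should agree closely
```

The gradient is written out by hand here so that the check really compares two independent computations.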
OK, so the new definition for
today is the definition of the
gradient vector.
And, our goal will be to
understand a bit better,
what does this vector mean?
What does it measure?
And, what can we do with it?
But, you see that in terms of
information content,
it's really the same
information that's already in
the partial derivatives,
or in the differential.
So, yes, and I should say,
of course you can also use the
gradient in other things like
approximation formulas and so
on.
And so far, it's just notation.
It's a way to rewrite things.
But, so here's the first cool
property of the gradient.
So, I claim that the gradient
vector is perpendicular to the
level surface corresponding to
setting the function,
w, equal to a constant.
OK, so if I draw a contour plot
of my function,
so, actually forget about z
because I want to draw a two
variable contour plot.
So, say I have a function of
two variables,
x and y, then maybe it has some
contour plot.
And, I'm saying if I take the
gradient of a function at this
point, (x,y).
So, I will have a vector.
Well, if I draw that vector on
top of a contour plot,
it's going to end up being
perpendicular to the level
curve.
Same thing if I have a function
of three variables.
Then, I can try to draw its
contour plot.
Of course, I can't really do it
because the contour plot would
be living in space with x,
y, and z.
But, it would be a bunch of
level surfaces, and the gradient
vector would be a vector in
space.
That vector is perpendicular to
the level surfaces.
So, let's try to see that on a
couple of examples.
So, let's do a first example.
What's the easiest case?
Let's take a linear function of
x, y, and z.
So, I will take w equals a1
times x plus a2 times y plus a3
times z.
Well, so, what's the gradient
of this function?
Well, the first component will
be a1.
That's partial w partial x.
Then, a2, that's partial w
partial y, and a3,
partial w partial z.
Now, what are the level surfaces
of this?
Well, if I set w equal to some
constant, c, that means I look
at the points where a1x + a2y +
a3z equals c.
What kind of surface is that?
It's a plane.
And, we know how to find a
normal vector to this plane just
by looking at the coefficients.
So, it's a plane with a normal
vector exactly this gradient.
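To make the linear case concrete, here is a small Python sketch. The coefficients 2, -1, 3, the constant 6, and the two points are arbitrary choices for illustration, not values from the lecture. It takes two points on the plane w = c and checks that the vector joining them has zero dot product with the coefficient vector.

```python
# Hypothetical coefficients for w = a1*x + a2*y + a3*z
a = (2.0, -1.0, 3.0)
c = 6.0

def w(p):
    return sum(ai * pi for ai, pi in zip(a, p))

# Two points chosen by hand on the level surface w = c
p = (3.0, 0.0, 0.0)   # 2*3 = 6
q = (0.0, 0.0, 2.0)   # 3*2 = 6

# The vector from p to q lies in the plane
in_plane = tuple(qi - pi for pi, qi in zip(p, q))

# The gradient of a linear function is just its coefficient vector
gradient = a
dot = sum(gi * vi for gi, vi in zip(gradient, in_plane))
print(w(p), w(q), dot)  # 6.0 6.0 0.0
```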
And, in fact,
in a way, this is the only case
you need to check because of
linear approximations.
If you replace a function by
its linear approximation,
that means you will replace the
level surfaces by their tangent
planes.
And then, you'll actually end
up in this situation.
But maybe that's not very
convincing.
So, let's do another example.
So, let's do a second example.
Let's say we look at the
function x^2 + y^2.
OK, so now it's a function of
just two variables because that
way we'll be able to actually
draw a picture for you.
OK, so what are the level sets
of this function?
Well, they're going to be
circles, right?
w equals c is a circle,
x^2 + y^2 = c.
So, I should say,
maybe, sorry,
the level curve is a circle.
So, the contour plot looks
something like that.
Now, what's the gradient vector?
Well, the gradient of this
function, so,
partial w partial x is 2x.
And partial w partial y is 2y.
So, let's say I take a point,
x comma y, and I try to draw my
gradient vector.
So, here at x,
y, so, I have to draw the
vector, <2x,
2y>.
What does it look like?
Well, it's going in that
direction.
It's parallel to the position
vector for this point.
It's actually twice the
position vector.
So, I guess it goes more or
less like this.
What's interesting,
too, is it is perpendicular to
this circle.
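If you want to check this on a computer rather than on a picture, here is a minimal Python sketch for w = x^2 + y^2. The radius 1.5 and the twelve sample points are arbitrary choices. At each sample point it takes the dot product of the gradient <2x, 2y> with a tangent vector to the circle and keeps the largest value it sees.

```python
import math

def grad_w(x, y):
    # Gradient of w = x^2 + y^2
    return (2 * x, 2 * y)

radius = 1.5  # arbitrary choice of level curve x^2 + y^2 = radius^2
max_abs_dot = 0.0
for k in range(12):
    theta = 2 * math.pi * k / 12
    x, y = radius * math.cos(theta), radius * math.sin(theta)
    # Tangent direction to the circle at angle theta
    tangent = (-math.sin(theta), math.cos(theta))
    g = grad_w(x, y)
    d = g[0] * tangent[0] + g[1] * tangent[1]
    max_abs_dot = max(max_abs_dot, abs(d))

print(max_abs_dot)  # zero up to rounding error
```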
OK, so it's a general feature.
Actually, let me show you more
examples, oops,
not the one I want.
So, I don't know if you can see
it so well.
Well, hopefully you can.
So, here I have a contour plot
of a function,
and I have a blue vector.
That's the gradient vector at
the pink point on the plot.
So, you can see,
I can move the pink point,
and the gradient vector,
of course, changes because the
gradient depends on x and y.
But, what doesn't change is
that it's always perpendicular
to the level curves.
Anywhere I am,
my gradient stays perpendicular
to the level curve.
OK, is that convincing?
Is that visible for people who
can't see blue?
OK, so, OK, so we have a lot of
evidence, but let's try to prove
the theorem because it will be
interesting.
So, first of all,
sorry, any questions about the
statement, the example,
anything, yes?
Ah, very good question.
Does the gradient vector,
why is the gradient vector
perpendicular in one direction
rather than the other?
So, we'll see the answer to
that in a few minutes.
But let me just tell you
immediately, to the side,
which side it's pointing to,
it's always pointing towards
higher values of a function.
OK, and we'll see that in maybe
about half an hour.
So, well, let me say actually
points towards higher values of
w.
OK, any other questions?
I don't see any questions.
OK, so let's try to prove this
theorem, at least this part of
the theorem.
We're not going to prove that
just yet.
That will come in a while.
So, well, maybe we want to
understand first what happens if
we move inside the level curve,
OK?
So, let's imagine that we are
taking a moving point that stays
on the level curve or on the
level surface.
And then, we know,
well, what happens is that the
function stays constant.
But, we can also know how
quickly the function changes
using the chain rule up there.
So, maybe the chain rule will
actually be the key to
understanding how the gradient
vector and the motion on the
level surface relate.
So, let's take a curve,
r equals r of t,
that stays inside,
well, maybe I should say on the
level surface,
w equals c.
So, let's think about what that
means.
So, just to get you used to
this idea, I'm going to draw a
level surface of a function of
three variables.
OK, so it's a surface given by
the equation w of x,
y, z equals some constant,
c.
And, so now I'm going to have a
point on that,
and it's going to move on that
surface.
So, I will have some parametric
curve that lives on this
surface.
So, the question is,
what's going to happen at any
given time?
Well, the first observation is
that the velocity vector,
what can I say about the
velocity vector of this motion?
It's going to be tangent to the
level surface,
right?
If I move on a surface,
then at any point,
my velocity is tangent to the
curve.
But, if it's tangent to the
curve, then it's also tangent to
the surface because the curve is
inside the surface.
So, OK, it's getting a bit
cluttered.
Maybe I should draw a bigger
picture.
Let me do that right away here.
So, I have my level surface,
w equals c.
I have a curve on that,
and at some point,
I'm going to have a certain
velocity.
So, the claim is that the
velocity, v,
equals dr/dt, is tangent
to the level surface,
w equals c, because it's tangent
to the curve,
and the curve is inside the
level surface,
OK?
Now, what else can we say?
Well, we have,
the chain rule will tell us how
the value of w changes.
So, by the chain rule,
we have dw/dt.
So, the rate of change of the
value of w as I move along this
curve is given by the dot
product between the gradient and
the velocity vector.
And, so, well,
maybe I can rewrite it as
gradient w dot
v, and that should be,
well, what should it be?
What happens to the value of w
as t changes?
Well, it stays constant because
we are moving on a curve.
That curve might be
complicated, but it stays always
on the level,
w equals c.
So, it's zero because w of t
equals c, which is a constant.
OK, is that convincing?
OK, so now if we have a dot
product that's zero,
that tells us that these two
guys are perpendicular.
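Here is a minimal Python sketch of that argument, under made-up assumptions: w = x^2 + y^2 + z^2, and a curve r(t) = (sin t cos 3t, sin t sin 3t, cos t) that winds around the level surface w = 1. Neither the function nor the curve comes from the lecture. It checks that w stays equal to 1 along the curve and that gradient w dot v vanishes at several times t.

```python
import math

def w(x, y, z):
    # Hypothetical function; its level surface w = 1 is the unit sphere
    return x * x + y * y + z * z

def grad_w(x, y, z):
    return (2 * x, 2 * y, 2 * z)

def r(t):
    # Hypothetical curve that stays on the level surface w = 1
    return (math.sin(t) * math.cos(3 * t),
            math.sin(t) * math.sin(3 * t),
            math.cos(t))

def v(t):
    # Velocity dr/dt, differentiated by hand (product rule)
    return (math.cos(t) * math.cos(3 * t) - 3 * math.sin(t) * math.sin(3 * t),
            math.cos(t) * math.sin(3 * t) + 3 * math.sin(t) * math.cos(3 * t),
            -math.sin(t))

times = [0.3 * k for k in range(10)]
values = [w(*r(t)) for t in times]
dots = [sum(g * vi for g, vi in zip(grad_w(*r(t)), v(t))) for t in times]

print(values)  # all equal to 1 up to rounding
print(dots)    # all equal to 0 up to rounding
```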
So, the gradient vector
is perpendicular to v.
OK, that's a good start.
We know that the gradient is
perpendicular to this vector
that's tangent to the
level surface.
What about other vectors
tangent to the level surface?
Well, in fact,
I could use any curve drawn on
the level of w equals c.
So, I could move,
really, any way I wanted on
that surface.
In particular,
I claim that I could have
chosen my velocity vector to be
any vector tangent to the
surface.
OK, so let's write this.
So this is true for any curve,
or, I'll say for any motion on
the level surface,
w equals c.
So that means v can be any
vector tangent to the level
surface, w equals c.
For example,
OK, let me draw one more
picture.
OK, so I have my level surface.
So, I'm drawing more and more
levels, and they never quite
look the same.
But I have a point.
And, at this point,
I have the tangent plane to the
level surface.
OK, so this is tangent plane to
the level.
Then, if I choose any vector in
that tangent plane.
Let's say I choose the one that
goes in that direction.
Then, I can actually find a
curve that goes in that
direction, and stays on the
level.
So, here, that would be a curve
that somehow goes from the right
to the left, and of course it
has to end up going up or
something like that.
OK, so given any vector tangent
to the level, let's call that
vector v,
we get that the gradient is
perpendicular to v.
So, the gradient is
perpendicular to this vector
tangent to this curve,
but also to any vector
that I can draw tangent to my
surface.
So, what does that mean?
Well, that means the gradient
is actually perpendicular to the
tangent plane or to the surface
at this point.
So, the gradient is
perpendicular.
And, well, here,
I've illustrated things with a
three-dimensional example,
but really it works the same if
you have only two variables.
Then you have a level curve
that has a tangent line,
and the gradient is
perpendicular to that line.
OK, any questions?
No?
OK, so, let's see.
That's actually pretty neat
because there is a nice
application of this,
which is to try to figure out,
now we know,
actually, how to find the
tangent plane to anything,
pretty much.
OK, so let's see.
So, let's say that,
for example,
I want to find the
tangent plane to the
surface with equation,
let's say, x^2 + y^2 - z^2 = 4 at
the point (2, 1, 1).
Let me write that.
So, how do we do that?
Well, one way that we already
know,
if we solve this for z,
so we can write z equals a
function of x and y,
then we know tangent plane
approximation for the graph of a
function,
z equals some function of x and
y.
But, that doesn't look like
it's the best way to do it.
OK, the best way to do it,
now that we have the gradient
vector, is actually to directly
say, oh, we know the normal
vector to this plane.
The normal vector will just be
the gradient.
Oh, I think I have a cool
picture to show.
OK, so that's what it looks
like.
OK, so here you have the
surface x^2 + y^2 - z^2 equals four.
That's called a hyperboloid
because it looks like what you
get when you spin a hyperbola
around an axis.
And, here's a tangent plane at
the given point.
So, it doesn't look very
tangent because it crosses the
surface.
But, it's really,
if you think about it,
you will see it's really the
plane that's approximating the
surface in the best way that you
can at this given point.
It is really the tangent plane.
So, how do we find this plane?
Well, you can plot it on a
computer.
That's not exactly how you
would look for it in the first
place.
So, the way to do it is that we
compute the gradient.
So, a gradient of what?
Well, a gradient of this
function.
OK, so I should say,
this is the level set,
w equals four,
where w equals x^2 + y^2 - z^2.
And so, we know that the
gradient of this,
well, what is it?
2x, then 2y,
and then negative 2z.
So, at this given point,
I guess we are at x equals two.
So, that's four.
And then, y and z are one.
So, two, negative two.
OK, and that's going to be the
normal vector to the surface or
to the tangent plane.
That's one way to define the
tangent plane.
All right, it has the same
normal vector as the surface.
That's one way to define the
normal vector to the surface,
if you prefer.
Being perpendicular to the
surface means that you are
perpendicular to its tangent
plane.
OK, so the equation is,
well, 4x + 2y - 2z equals
something, where something is,
well, we should just plug in
that point.
We'll get eight plus two minus
two looks like we'll get eight.
And, of course,
we could simplify dividing
everything by two,
but it's not very important
here.
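The same computation can be written as a short Python sketch. It uses w = x^2 + y^2 - z^2 and the point (2, 1, 1) from this example. The helper name grad_w is just for illustration. The normal vector is the gradient at the point, and the right-hand side is the normal vector dotted with the point.

```python
def w(x, y, z):
    return x * x + y * y - z * z

def grad_w(x, y, z):
    # Gradient of w = x^2 + y^2 - z^2
    return (2 * x, 2 * y, -2 * z)

point = (2, 1, 1)
assert w(*point) == 4  # the point really is on the level surface w = 4

normal = grad_w(*point)
# Tangent plane: normal . (x, y, z) = normal . point
rhs = sum(n * p for n, p in zip(normal, point))
print(normal, rhs)  # (4, 2, -2) 8
```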
OK, so now if you have a
surface given by an evil
equation,
and a point on the surface,
well, you know how to find the
tangent plane to the surface at
that point.
OK, any questions?
No.
OK, let me give just another
reason why, another way that we
could have seen this.
So, I claim,
in fact, we could have done
this without the gradient,
or using the gradient in a
somehow disguised way.
So, here's another way.
So, the other way to do it
would be to start with a
differential,
OK?
dw, well, it's pretty much the
same content,
but let me write it as a
differential,
dw is 2x dx + 2y dy - 2z dz.
So, at a given point,
at (2, 1, 1),
this is 4 dx + 2 dy - 2 dz.
Now, if we want to change this
into an approximation formula,
we can.
We know that the change in w is
approximately equal to 4 delta x
+ 2 delta y - 2 delta z.
OK, so when do we stay on the
level surface?
Well, we stay on the level
surface when w doesn't change,
so, when this becomes zero,
OK?
Now, what does this
approximation sign mean?
Well, it means for small
changes in x,
y, z, this guy will be close to
that guy.
It also means something else.
Remember, these approximation
formulas, they are linear
approximations.
They mean that we replace the
function, actually,
by some closest linear formula
that will be nearby.
And so, in particular,
if we set this equal to zero
instead of approximately zero,
it means we'll actually be
moving on the tangent plane to
the level set.
If you want, strict equality
in the approximation means that we
replace the function by its
tangent approximation.
So -- [APPLAUSE] OK,
so the level corresponds to
delta w equals zero,
and its tangent plane
corresponds to four delta x plus
two delta y minus two delta z
equals zero.
That's what I'm trying to say,
basically.
And, what's delta x?
Well, that means it's the
change in x.
So, what's the change in x here?
That means, well,
we started with x equals two,
and we moved to some other
value, x.
So, that's actually x - 2, right?
That's how much x has changed
compared to 2.
So, we get four times (x - 2), plus
two times (y - 1), minus
two times (z - 1) equals 0.
That's the equation of a
tangent plane.
It's the same equation as the
one over there.
These are just two different
methods to get it.
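To see what the approximation formula is doing, here is a minimal Python sketch for w = x^2 + y^2 - z^2 near (2, 1, 1). The direction of the step (1, -1, 2) and the three step sizes are arbitrary choices. It compares the actual change in w with 4 delta x + 2 delta y - 2 delta z and records the error, which should shrink much faster than the step itself.

```python
def w(x, y, z):
    return x * x + y * y - z * z

x0, y0, z0 = 2.0, 1.0, 1.0

def linear_change(dx, dy, dz):
    # 4 dx + 2 dy - 2 dz, the differential of w at (2, 1, 1)
    return 4 * dx + 2 * dy - 2 * dz

errors = []
for step in (1e-1, 1e-2, 1e-3):
    dx, dy, dz = step, -step, 2 * step  # arbitrary direction of motion
    actual = w(x0 + dx, y0 + dy, z0 + dz) - w(x0, y0, z0)
    errors.append(abs(actual - linear_change(dx, dy, dz)))

print(errors)  # each error is roughly 100 times smaller than the previous one
```

Dividing the step by 10 divides the error by about 100, which is what you expect when the linear formula captures everything except second-order terms.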
OK, so this one explains to you
what's going on in terms of
approximation formulas.
This one goes right away,
by using the gradient vector.
So, in a way,
with this one,
you don't have to think nearly
as much.
But, you can use either one.
OK, questions?
No?
OK, so let's move on to a new
topic, which is another
application of the gradient
vector, and that is directional
derivatives.
OK, so let's say that we have a
function of two variables,
x and y.
Well, we know how to compute
partial w over partial x or
partial w over partial y,
which measure how w changes if
I move in the direction of the x
axis or in the direction of the
y axis.
So, what about moving in other
directions?
Well, of course,
we've seen other approximation
formulas and so on.
But, we can still ask,
is there a derivative in every
direction?
And that's basically,
yes, that's the directional
derivative.
OK, so these are derivatives in
the direction of i hat or j hat,
the vectors that go along the x
or the y axis.
So, what if we move in another
direction, let's say,
the direction of some unit
vector, let's call it u .
OK, so if I give you a unit
vector, you can ask yourself,
if I move in that direction,
how quickly will my function
change?
So -- So, let's look at the
straight trajectory.
What this should mean is I
start at some value,
x, y, and there I have my
vector u.
And, I'm going to move in a
straight line in the direction
of u.
And, I have the graph of my
function -- -- and I'm asking
myself how quickly does the
value change when I move on the
graph in that direction?
OK, so let's look at a straight
line trajectory. So,
we have a position vector,
r, that will depend on some
parameter which I will call s.
You'll see why very soon,
in such a way that the
derivative is this given unit
vector u hat.
So, why do I use s for my
parameter rather than t?
Well, it's a convention.
I'm moving at unit speed along
this line.
So that means that actually,
I'm parameterizing things by
the distance that I've traveled
along a curve,
sorry, along this line.
So, here it's called s in the
sense of arc length.
Actually, it's not really an
arc because it's a straight
line, so it's the distance along
the line.
OK, so because we are
parameterizing by distance,
we are just using s as a
convention just to distinguish
it from other situations.
And, so, now,
the question will be,
what is dw/ds?
What's the rate of change of w
when I move like that?
Well, of course we know the
answer because that's a special
case of the chain rule.
So, that's how we will actually
compute it.
But, in terms of what it means,
it really means we are asking
ourselves,
we start at a point and we
change the variables in a
certain direction,
which is not necessarily the x
or the y direction,
but really any direction.
And then, what's the derivative
in that direction?
OK, does that make sense as a
concept?
Kind of?
I see some faces that are not
completely convinced.
So, maybe I should show more
pictures.
Well, let me first write down a
bit more and show you something.
So I just want to give you the
actual definition.
Sorry, first of all in case you
wonder what this is all about,
so let's say the components of
our unit vector are two numbers,
a and b.
Then, it means we'll move along
the line x of s equals some
initial value, x0,
the point where we are actually
taking the directional derivative,
plus a times
s.
And, y of s equals y0 + bs.
And then, we plug that into w.
And then we take the derivative.
So, we have a notation for that
which is going to be dw/ds with
a subscript in the direction of
u to indicate in which direction
we are actually going to move.
And, that's called the
directional derivative -- -- in
the direction of u.
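Here is a minimal Python sketch of this definition, with made-up ingredients: the function w = x^2 + 3xy, the starting point (1, 2), and the unit vector u = <0.6, 0.8> are all assumptions for illustration. It plugs the straight line x0 + as, y0 + bs into w and estimates the derivative with respect to s at s = 0 by a finite difference, without using the gradient at all.

```python
def w(x, y):
    # Hypothetical function of two variables
    return x * x + 3 * x * y

x0, y0 = 1.0, 2.0   # hypothetical starting point
a, b = 0.6, 0.8     # components of the unit vector u (0.36 + 0.64 = 1)

def w_along_line(s):
    # w restricted to the line x = x0 + a*s, y = y0 + b*s
    return w(x0 + a * s, y0 + b * s)

# Central finite difference in s at s = 0 estimates dw/ds in direction u
h = 1e-6
dw_ds = (w_along_line(h) - w_along_line(-h)) / (2 * h)
print(dw_ds)
```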
OK, so, let's see what it means
geometrically.
So, remember,
we've seen things about partial
derivatives,
and we see that the partial
derivatives are the slopes of
slices of the graph by vertical
planes that are parallel to the
x or the y directions.
OK, so, if I have a point,
at any point,
I can slice the graph of my
function by two planes,
one that's going along the x,
one along the y direction.
And then, I can look at the
slices of the graph.
Let me see if I can use that
thing.
So, we can look at the slices
of the graph that are drawn
here.
In fact, we look at the tangent
lines to the slices,
and we look at the slope and
that gives us the partial
derivatives in case you are on
that side and want to see also
the pointer that was here.
So, now, similarly,
the directional derivative
means, actually,
we'll be slicing our graph by
the vertical plane.
It's not really colorful,
something more colorful.
We'll be slicing things by a
plane that is now in the
direction of this vector,
u, and we'll be looking at the
slope of the slice of the graph.
So, what that looks like here,
so that's the same applet
that you've used on your
problem set in case you are
wondering.
So, now, I'm picking a point on
the contour plot.
And, at that point,
I slice the graph.
So, here I'm starting by
slicing in the direction of the
x axis.
So, in fact,
what I'm measuring here by the
slope of the slice is the
partial in the x direction.
It's really partial f partial
x, which is also the directional
derivative in the direction of
i.
And now, if I rotate the slice,
then I have all of these
planes.
So, you see at the bottom left,
I have the direction in which
I'm going.
There's this,
like, rotating line that tells
you in which direction I'm going
to be moving.
And for each direction,
I have a plane.
And, when I slice by that
plane, I will get,
so I have this direction here
going maybe to the southwest.
So, that gives me a slice of my
graph by a vertical plane,
and the slice has a certain
slope.
And, the slope is going to be
the directional derivative in
that direction.
OK, I think that's as graphic
as I can get.
OK, any questions about that?
No?
OK, so let's see how we compute
that guy.
So, let me just write again
just in case you want to,
in case you didn't hear me it's
the slope of the slice of the
graph by a vertical plane -- --
that contains the given
direction,
that's parallel to the
direction, u.
So, how do we compute it?
Well, we can use the chain rule.
The chain rule implies that
dw/ds is actually the gradient
of w dot product with the
velocity vector dr/ds.
But, remember we say that we
are going to be moving at unit
speed in the direction of u.
So, in fact,
that's just gradient w dot
product with the unit vector u.
OK, so the formula that we
remember is really dw/ds in the
direction of u is gradient w dot
product with u.
And, maybe I should also say in
words, this is the component of
the gradient in the direction of
u.
And, maybe that makes more
sense.
So, for example,
the directional derivative in
the direction of i hat is the
component along the x axis.
That's the same as,
indeed, the partial derivative
in the x direction.
Things make sense.
dw/ds in the direction of i hat
is, sorry, gradient w dot i hat,
which is w_x, or maybe I should
write, partial w over partial x.
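The formula gradient w dot u is easy to turn into code. The Python sketch below is one way to do it, assuming a hypothetical function w = x^2 + 3xy with its gradient written out by hand. The name directional_derivative is made up for this illustration. It checks that the directions i hat and j hat give back the two partial derivatives.

```python
import math

def grad_w(x, y):
    # Gradient of the hypothetical function w = x^2 + 3xy
    return (2 * x + 3 * y, 3 * x)

def directional_derivative(x, y, u):
    # Normalize u first, since the formula assumes a unit vector
    length = math.hypot(u[0], u[1])
    ux, uy = u[0] / length, u[1] / length
    gx, gy = grad_w(x, y)
    return gx * ux + gy * uy

print(directional_derivative(1.0, 2.0, (1, 0)))  # direction i hat: w_x = 8.0
print(directional_derivative(1.0, 2.0, (0, 1)))  # direction j hat: w_y = 3.0
print(directional_derivative(1.0, 2.0, (3, 4)))  # some other direction
```

Normalizing inside the function is a design choice. It lets you pass any nonzero vector as a direction, but the result is always the rate of change per unit distance.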
OK, now, so that's basically
what we need to know to compute
these guys.
So now, let's go back to the
gradient and see what this tells
us about the gradient.
[APPLAUSE]
I see you guys are having fun.
OK, OK, let's do a little bit
of geometry here.
That should calm you down.
So, we said dw/ds in the
direction of u is gradient w dot
u.
That's the same as the length
of gradient w times the length
of u,
well, that happens to be one
because we are taking a unit
vector, times the cosine of the
angle between the gradient and
the given unit vector,
u. So, I have this angle, theta.
OK, that's another way of
saying we are taking the
component of a gradient in the
direction of u.
But now, what does that tell us?
Well,
let's try to figure out in
which directions w changes the
fastest,
in which direction it increases
the most or decreases the most,
or doesn't actually change.
So, when is this going to be
the largest?
If I fix a point,
if I set a point,
then the gradient vector at
that point is given to me.
But, the question is,
in which direction does it
change the most quickly?
Well, what I can change is the
direction, and this will be the
largest when the cosine is one.
So, this is largest when the
cosine of the angle is one.
That means the angle is zero.
That means u is actually in the
direction of the gradient.
OK, so that's a new way to
think about the direction of a
gradient.
The gradient is the direction
in which the function increases
the most quickly at that point.
So, the direction of gradient w
is the direction of fastest
increase of w at the given
point.
And, what is the magnitude of
gradient w?
Well, it's actually the
directional derivative in that
direction.
OK, so if I go in that
direction, which gives me the
fastest increase,
then the corresponding slope
will be the length of the
gradient.
And, what's the direction of the
fastest decrease?
It's going in the opposite
direction, right?
I mean, if you are on a
mountain, and you know that you
are facing the mountain,
that's the direction of fastest
increase.
The direction of fastest
decrease is behind you straight
down.
OK, so, the minimal value of
dw/ds is achieved when cosine of
theta is minus one.
That means theta equals 180 degrees.
That means u is in the
direction of minus the gradient.
It points opposite to the
gradient.
And, finally,
when do we have dw/ds equals
zero?
So, in which direction does the
function not change?
Well, we have two answers to
that.
One is to just use the formula.
So, that's when cosine theta
equals zero.
That means theta equals 90 degrees.
That means that u is
perpendicular to the gradient.
The other way to think about
it, the direction in which the
value doesn't change is a
direction that's tangent to the
level surface.
If we are not changing w,
it means we are moving along
the level.
And, that's the same thing --
-- as being tangent to the
level.
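These three cases can be checked numerically. The Python sketch below assumes a gradient vector <3, 4> at some point, which is a made-up value with length 5. It evaluates gradient dot u for the direction along the gradient, opposite to it, and perpendicular to it, and then scans 360 directions to confirm that none does better than the gradient direction.

```python
import math

gradient = (3.0, 4.0)  # hypothetical gradient at some point, length 5
grad_len = math.hypot(*gradient)
grad_angle = math.atan2(gradient[1], gradient[0])

def dw_ds(angle):
    # Directional derivative in the unit direction u = <cos(angle), sin(angle)>
    u = (math.cos(angle), math.sin(angle))
    return gradient[0] * u[0] + gradient[1] * u[1]

along = dw_ds(grad_angle)                        # theta = 0
opposite = dw_ds(grad_angle + math.pi)           # theta = 180 degrees
perpendicular = dw_ds(grad_angle + math.pi / 2)  # theta = 90 degrees

# Scan many directions; none should beat the gradient direction
scan = [dw_ds(2 * math.pi * k / 360) for k in range(360)]
print(along, opposite, perpendicular, max(scan), min(scan))
```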
So, let me just show that on
the picture here.
So, if I actually show you the
gradient, you can't really see
it here.
I need to move it a bit.
So, the gradient here is
pointing straight up at the
point that I have chosen.
Now, if I choose a slice that's
perpendicular,
and a direction that's
perpendicular to the gradient,
so that's actually tangent to
the level curve,
then you see that my slice is
flat.
I don't actually have any slope.
The directional derivative in a
direction that's perpendicular
to the gradient is basically
zero.
Now, if I rotate,
then the slope sort of
increases, increases,
increases, and it becomes the
largest when I'm going in the
direction of a gradient.
So, here, I have,
actually, a pretty big slope.
And now, if I keep rotating,
then the slope will decrease
again.
Then it becomes zero when I'm
perpendicular,
and then it becomes negative.
It's the most negative when I'm
pointing away from the gradient
and then becomes zero again when
I'm back perpendicular.
OK, so for example,
if I give you a contour plot,
and I ask you to draw the
direction of the gradient
vector,
well, at this point,
for example,
you would look at the picture.
The gradient vector would be
going perpendicular to the
level.
And, it would be going towards
higher values of a function.
I don't know if you can see the
labels, but the thing in the
middle is a minimum.
So, it will actually be
pointing in this kind of
direction.
OK, so that's it for today.
