 

CALCULUS

Book 2

## by

## Peter J. Ponzo

## TABLE OF CONTENTS

EXAMPLE PROBLEMS

ASSORTED PROBLEMS

LECTURE 1

INTRODUCTION TO DIFFERENTIAL EQUATIONS

POPULATION GROWTH

Logistic Equation

The World's Simplest DE

velocity and position

SEPARABLE DIFFERENTIAL EQUATIONS

population of a species

a certain species of clam

LECTURE 2

MORE ON DIFFERENTIAL EQUATIONS

a Little Partial Fractions

A spherical mothball

population of clams

EXPONENTIAL DECAY

An Egyptian scroll

the half-life

Newton's Law of Cooling

LECTURE 3

MORE ON DIFFERENTIAL EQUATIONS

Direction of Solutions

following the direction

the DE PORTRAIT

LINEAR FIRST ORDER DEs

the temperature of the object

a nice trig identity!

LECTURE 4

SEQUENCES AND SERIES

Sequences

PARTIAL SUMS

SERIES

the HARMONIC SERIES

LECTURE 5

CONVERGENCE of SERIES

A Test for Convergence of an Infinite Series

series where every term is positive

the nth term test

LECTURE 6

ALTERNATING SERIES and ABSOLUTE CONVERGENCE

ALTERNATING SERIES

the terms must get smaller fast!

the Alternating Harmonic Series

decrease

Estimating the Sum of a Convergent Alternating Series

ABSOLUTE CONVERGENCE

LECTURE 7

TAYLOR POLYNOMIALS and TAYLOR SERIES

TAYLOR POLYNOMIALS

LECTURE 8

INFINITE POWER SERIES

TAYLOR SERIES

CONVERGENCE of TAYLOR SERIES

the RATIO TEST

RADIUS OF CONVERGENCE

POWER SERIES

LECTURE 9

MORE ON SERIES

Estimating the Sum of an Alternating Taylor Series

the error is less than the first neglected term

magic of brackets

Estimating the Sum of \

LECTURE 10

CURVES and PARAMETRIC EQUATIONS

Plotting Parametric Curves

Slope of the Tangent Line to a Parametric Curve

think of the parameter \

the velocity vector

LECTURE 11

MORE ON PARAMETRIC REPRESENTATION OF CURVES

the Tangent and Position Vectors

the derivative of a vector

Length of a Curve

LECTURE 12

SOME APPLICATIONS

Polar Curves, revisited

TRAPEZOIDAL RULE

the CYCLOID

the Straight Line

that terrible curve

an Astroid

LECTURE 13

FUNCTIONS OF TWO VARIABLES

LEVEL CURVES

an Orthogonal Trajectory

3 Dimensional Surfaces

a cylinder, one variable is missing

Revolving 2-D Curves to get 3-D Surfaces

LECTURE 14

DERIVATIVES OF FUNCTIONS OF TWO VARIABLES

the PARTIAL DERIVATIVE

pressure of a gas

A square box

The profit per hat

HIGHER PARTIAL DERIVATIVES

LECTURE 15

DIRECTIONAL DERIVATIVES

LEVEL CURVES again

The density of ants

In what direction

The temperature of a plate

LECTURE 16

the GRADIENT

VECTORS

unit vectors

Direction Cosines

the gradient vector

Is that an accident?

A little More About Vectors

the DOT Product

the angle between vectors

If P•Q = 0 then they must be perpendicular

LECTURE 17

more on the GRADIENT, and the CHAIN RULES

The distribution of a certain type of plant

The CHAIN RULES

this dimensional stuff

LECTURE 18

another CHAIN RULE

thermodynamics is cover-to-cover partial derivatives

a COLLECTION of CHAIN RULES

Directional Derivatives, revisited

Implicit Differentiation, revisited

the GRADIENT vector is normal to the level curve

LECTURE 19

the TANGENT PLANE

LECTURE 20

OPTIMIZATION

Least Squares Fit

SOLUTIONS TO \

EXAMPLE PROBLEMS

(done in the text)

• Assuming that the population of a species increases according to $\frac{dN}{dt} = kN$, determine $N(t)$.

• A spherical mothball evaporates at a rate proportional to its surface area. If it starts with a radius of 2 cm and is reduced to .6 cm after 10 hours, how long will it take to evaporate completely?

• An Egyptian scroll is discovered in which the ratio of $^{14}$C to $^{12}$C is .6 of the value it would have in similar material today. Estimate the age of the scroll.

• Sketch the DE PORTRAIT for $\frac{dy}{dx} = x^2 + y^2$.

• Water flows into Lake Ontario at the rate of A metres$^3$/day (from rivers and rain, etc. as well as from liquid industrial waste) and this water has an average pollution concentration of B kg/metres$^3$. Water (mixed with pollutants) is also withdrawn at the rate of C metres$^3$/day. Describe, via a differential equation, the amount of pollutants at time t days (after measurements begin).

• Show that the series converges.

• Determine the Taylor polynomials for f(x) = _ln_ x about x = 1.

• Prove that the series diverges for every value of x different from 0.

• Sketch $x = t^2 - 1$, $y = t^3$.

• For the polar curve $r = \sin(2\theta)$, determine the slope at θ = . Also, express as a definite integral the length of the curve from θ = 0 to θ = .

• You are standing on the side of a mountain whose elevation is given by $z = 95 - x^2 - y^2 + 2x + 4y$ metres, where x = 0, y = 0 is your location, so z = 95 is your elevation. Sketch the level curves in your neighbourhood and determine in what direction you should climb so as to increase your elevation most rapidly.

• The pressure of a gas P depends upon its volume V and temperature T according to PV = kT where k is a constant. If P = 1, V = 2 and T = 3, how rapidly is the pressure changing when V alone changes?

• The density of ants at a location (x,y) is given by D(x,y) = K _ants per metres_ 2 where x and y measure distance (in _metres_ ) from the queen ant who is located at (0,0): x is east-west and y is north-south distance. What is the rate of change of ant density at (0,0), in a north-west direction?

• Examine z = xy + + for a minimum, in x > 0, y > 0.

• Determine the line of "least squared error" to fit the points (.2,.6), (.6,.9), (.9,1.5) and (1.2,1.7).

ASSORTED PROBLEMS

(which you'll be able to solve by the end of this course)

1. When observed over some time interval, it is noted that the population of fish in a lake changes as shown below. Assuming a logistic population growth: F(t) = , determine K, the carrying capacity of the lake, and the values of F(0) and r. (You may find it convenient to let y = .)

2. Solve the following DEs:  
(a) $3x^2 y + 2x + x^3\frac{dy}{dx} = 0$  
(b) $\frac{dy}{dx} + 2xy = x$

3. Suppose that the elevation of a mountain is described by $5x^2 + 4xy + 2y^2 = C$ (where C is the elevation). Determine a DE satisfied by the curve through (2,5) down which you should ski so as to achieve the steepest descent. (i.e. determine the DE for orthogonal trajectories for this family of curves.) DO NOT SOLVE. (It's neither _separable_ nor _linear_!)

4. Sketch some level curves of the function $f(x,y) = 1 - x^2 + y^2$.

5. Obtain the equation of the tangent plane to the surface $z = x^2 + xy$ at the point (1, 2, 3).

6. Determine whether the given series converges or diverges.

If it converges, determine whether it converges _absolutely_ .

(a) (b) (c)

(d) (e) (f)

(g)

7. Solve each of the following:

(a) $\frac{dy}{dx} = x e^y$ (b) $\frac{dy}{dx} = 1 - y^2$

(c) $\frac{dy}{dx} + xy = x$ (d) = 2

8. Suppose that f(t) and g(t) are continuous functions of t for each t in the interval (a,b). The equations x = f(t), y = g(t) give a point (x,y) in the plane and the set of all such points, for a ≤ t ≤ b, is called a _parametric curve_.

If f(t) and g(t) are differentiable, the slope of the tangent line to this curve at (x(t),y(t)) is: $\frac{dy}{dx} = \frac{dy/dt}{dx/dt}$ (provided $\frac{dx}{dt} \neq 0$).

Further, the _length_ of the curve, from t = a to t = b, is given by:

$L = \int_a^b \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt$

For each of the following: (i) _plot_ the parametric curve for a ≤ t ≤ b

(ii) determine $\frac{dy}{dx}$ at $t = t_0$

(iii) calculate the length of the curve from t = a to t = b

(a) x = 2 cos t, y = 2 sin t , (a,b) = (0,2π), t0 = π/4

(b) $x = a\cos^3 t$, $y = a\sin^3 t$, (a,b) = (0,2π), t0 = π/4

(c) x = 2 cos t, y = 3 sin t, (a,b) = (0,2π), t0 = π/2 (leave the arc length, L, as an integral!!)

9. Obtain Pn(x), the Taylor polynomial of degree n, for f(x), about x = a, as indicated below:

(a) f(x) = sin x, n = 6, a = 0 (b) f(x) = sin x, n = 6, a =

(c) f(x) = $e^x$, n = 4, a = 1 (d) f(x) = _ln_ (1+x), n = 4, a = 0

(e) f(x) = tan x, n = 5, a = 0 (f) f(x) = arcsin x, n = 3, a = 0

(g) f(x) = , n = 3, a = 0 (h) f(x) = sinh x, n = 5, a = 0

(i) f(x) = cosh x, n = 5, a = 0 (j) f(x) = , n = 3, a = 0

Note: In (h) and (i), $\sinh x = \frac{e^x - e^{-x}}{2}$ and $\cosh x = \frac{e^x + e^{-x}}{2}$. Note: $\frac{d}{dx}\sinh x = \cosh x$, $\frac{d}{dx}\cosh x = \sinh x$.

10. For each of the following, determine the Taylor series about x = a and determine its interval of convergence:

(a) f(x) = _ln_ (1+x), a = 0 (b) f(x) = _ln_ x, a = 1

(c) f(x) = $e^{2x}$, a = 0 (d) f(x) = sin x, a =

(e) f(x) = , a = 0 (f) f(x) = sinh x, a = 0

11. Assume that the solution to the differential equation $\frac{dy}{dx} = y$ may be expanded in a power series: $y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \dots = \sum_{n=0}^{\infty} a_n x^n$. Substitute this series into $\frac{dy}{dx} = y$ and find constants $a_0, a_1, a_2, \dots, a_n, \dots$ so as to satisfy this differential equation.

(i.e. so that $\frac{dy}{dx} = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \dots$, for all x.)

12. Use 5 terms of the Maclaurin series (i.e. the Taylor series expanded about x = 0) for $e^x$ to approximate e and show that the error is less than ... and evaluate this error estimate.

13. The well-known physicist, Richard Feynman, writes:

" _Thus the average energy is_ <E> = . _Now the two sums which appear here we shall leave for the reader to play with and have some fun with. When we are all finished summing and substituting for x in the sums, we should get - if we make no mistakes in the sum-_ <E> = . _This, then, was the first quantum mechanical formula ever known, or discussed, and it was a beautiful culmination of decades of puzzlement._ " Have some _fun_ and obtain the expression for <E>, given that

x = e-h/2πkT. You can easily sum the geometric series 1 + x + x2 \+ ... but you'll have _fun_ with x + 2x2 \+ 3x3 \+ ... but note that it's almost (but not quite) the series obtained by differentiating 1 + x + x2 \+ x3 \+ ...

14. If the graph of $z = f(x^2)$, in the x-z plane, is revolved about the z-axis, the 3-D surface generated is described by $z = f(x^2 + y^2)$. (For example, the parabola $z = x^2$ generates the paraboloid $z = x^2 + y^2$).

Further, the 3-D surface described by y = f(x) (in an x-y-z coordinate system) is a cylinder generated by moving the 2-D curve y = f(x) (in the x-y plane) parallel to the z-axis.

Use this to sketch the graph of the following functions. Identify the surface as a paraboloid, hyperboloid, cylinder, plane or cone.

(a) $z = 1 - x^2 - y^2$ (b) z =

(c) z = x (d) $y = x^2$

(e) $z^2 = 1 + x^2 + y^2$ (f) $z^2 = 1 + x^2 - y^2$

15. Sketch some level surfaces, f(x,y) = constant, for each of the following:

(a) f(x,y) = x - y (b) $f(x,y) = x^2 + 2y^2$

(c) $f(x,y) = x e^{-y}$ (d) f(x,y) =

16. Find the equation of the tangent plane to the graph of the given function at the point indicated:

(a) z = arctan at (-1,1) (b) z = at (-3,4)

(c) z = at (0,2) (d) $z = \ln(1 + e^{xy})$ at (0,1)

17. If u(x,t) represents the temperature at the position x at time t in an insulated rod, then u satisfies the so-called Heat Equation: $\frac{\partial u}{\partial t} = k\,\frac{\partial^2 u}{\partial x^2}$ where the constant k is called the diffusivity (related to the thermal conductivity). Show that each of the following satisfies the Heat Equation for _some_ value of k and determine this k-value.

(a) $u(x,t) = e^{-2t}\sin x$ (b) u(x,t) =

18. If u(x,y,t) represents the elevation of a vibrating elastic membrane above the x-y plane (think of a drum head) at the position (x,y) at time t, then u satisfies the so-called Wave Equation: $\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2}$ where c is the speed of propagation of waves on the membrane. Show that each of the following satisfies the Wave Equation for _some_ c>0 and determine this c-value:

(a) u(x,y,t) = sin x sin 2y cos 3t (b) $u(x,y,t) = e^{3x + 4y + 5t}$

19. Use the Chain Rule to determine the indicated derivative:

(a) $\frac{dz}{dt}$ when t = 0 if $z = x^2 + y^2$ and $x = e^t$, y = sin t

(b) $\frac{dz}{dt}$ when t = π if $z = x^2 + y^2$ and x = cos t, y = sin t

(c) $\frac{dw}{dz}$ when z = 1 if $w = u^2 + uv$ and u = _ln_ z, $v = z^2$

(d) $\frac{dy}{dx}$ when x = if y = u Arctan v and u = sin x, v = tan x

(e) $\frac{dw}{dt}$ when t = if $w = x^2 + y^2 + z^2$ and x = sin t, y = cos t, z = t

20. The elevation of a mountain is given by $z = 1000 - x^2 - 2y^2$ where z (metres) is the elevation and x and y (each measured in metres) are the horizontal distances from the centre of the mountain. If a person is located at $x = 3 + 2t$, $y = t^2$, at time t (in seconds), then find the rate at which the person's elevation is changing (in metres per second) when t = 2 seconds.

21. For each of the following determine $\frac{dy}{dx}$: (i) by implicit differentiation, and (ii) by using $\frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y}$:

(a) $f(x,y) = x^2 + y^2 - a^2 = 0$ (b) $f(x,y) = x^2 y + \sin(xy) - x = 0$

(c) $f(x,y) = \arctan(x^2 + y^2) - x - y = 0$ (d) $f(x,y) = x e^y + \ln(1 + x) - 1 = 0$

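22. The gradient of a function f(x,y) is a _vector_ with x- and y-components $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ respectively.

i.e. $\nabla f = \left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]$. For each of the following functions determine: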

(i) the gradient vector at the given point and

(ii) the equation of the tangent plane to the surface z = f(x,y) at the given point:

(a) f(x,y) = arctan at (1,1) (b) $f(x,y) = \ln(x^2 + y^2)$ at (0,1)

(c) f(x,y) = sec x + sec y at (0,0) (d) f(x,y) = 2x \+ 3y at (1,1)

23. The rate of change of f(x,y) in the direction θ (the directional derivative) is $\frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta$.

Compute the rate of change of $f(x,y) = x^4 y^5$ at (1,1) in the given direction:

(a) θ = 0 (b) θ = π (c) θ = (d) θ =

24. If $\frac{\partial f}{\partial x}$ = 1 and $\frac{\partial f}{\partial y}$ = , in what direction, θ, should a directional derivative be computed, at (a,b), in order that it is:

(a) 0 (b) as small as possible (c) as large as possible (Note: choose θ in (0,2π) ).
LECTURE 1

INTRODUCTION TO DIFFERENTIAL EQUATIONS

**POPULATION GROWTH:**

PS:

P: How would you describe, in mathematical terms, the growth of a population?

S: Huh?

P: If I said the population of a city or a country or the world was growing at 2% per year, how would you put this into mathematical terms?

S: I'd say ... uh ... I don't understand the question.

P: Okay, suppose the population was N people after a time t years ... "t" could be the time, in years, since you started to measure the population; maybe t = 0 corresponds to the year 1925 so t = 5 corresponds to 1930 and so on. Every year the population grows by 2% so that, if there are N people at the time t, there will be 1.02 N one year later and the increase in population will be .02N people per year. See? People per year. This .02N is a rate of change of population with respect to time and it's proportional to the current population so we could write: $\frac{dN}{dt} = kN$ where "k" is some constant of proportionality (like .02, for example). Now, the big question: what must be the function N(t) so that it satisfies $\frac{dN}{dt} = kN$? What do you think?

S: I haven't the foggiest.

The equation $\frac{dN}{dt} = kN$ is called a _differential equation_. Any equation involving some unknown function and its derivatives is called a differential equation (or DE for short). In our case the unknown function is N(t). (We assume that the constant "k" is known.)

**Example DEs:** (in what follows, "k" and "K" and "C" are constants)

• $\frac{ds}{dt} = .98\,t$ is a DE which describes the distance fallen by an object (namely s(t) metres) in a time t seconds.

• $\frac{dy}{dx} = k\,\frac{y}{x}$ is a DE which describes the relationship between the size of a person's eye (namely y) in relation to the size of the head (namely x).

• $\frac{dT}{dt} = -k(T - K)$ is a DE which describes the temperature T of some object which is cooling in air which is at a temperature K.

• $\frac{dN}{dt} = -kN(N - K)$ is a DE (the so-called "Logistic Equation") which is an alternate description of the growth of populations. Here N(t) is the population at time t and the constant K is called the "carrying capacity".

It is often the case that the mathematical description of some process (like population growth) is most easily obtained as a DE. The problem is: "given a DE involving some unknown function, what is this function which satisfies the DE?" or, to put it differently, "what are the _solutions_ to the given DE?"

• $s(t) = .49\,t^2$ is a solution to the DE: $\frac{ds}{dt} = .98\,t$ ... and so is $s(t) = .49\,t^2 + 30$ or $s(t) = .49\,t^2 - \pi$. (You just substitute s(t) into the DE to see if $\frac{ds}{dt} = .98\,t$.)

• $y = x^k$ is a solution to the DE: $\frac{dy}{dx} = k\,\frac{y}{x}$ since $\frac{dy}{dx} = k x^{k-1}$ and $k\,\frac{y}{x} = k\,\frac{x^k}{x} = k x^{k-1}$ as well. Note, however, that $y = 10x^k$ is also a solution as are $y = \pi x^k$ and $y = -47x^k$. In fact, $y = Cx^k$ satisfies the DE for any choice of constant "C".

• $T(t) = K + e^{-kt}$ is a solution of $\frac{dT}{dt} = -k(T - K)$, and so is $T(t) = K - e^{-kt}$ and $T(t) = K - 39e^{-kt}$ and, in fact, $T(t) = K + Ce^{-kt}$ satisfies the DE $\frac{dT}{dt} = -k(T - K)$ for any choice of constant C.

• $N(t) = \frac{K}{1 + Ce^{-kKt}}$ is a solution of the logistic equation $\frac{dN}{dt} = -kN(N - K)$ for any choice of constant C.

We note several things:

(1) There are many solutions to a given DE.

(2) The solutions all have a constant which can have any value. Every value assigned to this constant (called "C" above) gives a solution.

(3) Given a DE and a solution, it is a simple matter to check that it is a solution: just substitute back into the DE!
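Point (3) can also be tried numerically. The sketch below is only an illustration: the values of k and K, the helper names `derivative`, `make_T` and `residual`, and the sample times are all assumptions, not taken from the text. It estimates dT/dt with a small difference quotient and compares it with -k(T - K) for several choices of C.

```python
import math

def derivative(f, t, h=1e-6):
    """Central-difference estimate of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

# Illustrative constants (assumed values, not from the text).
k, K = 0.3, 20.0

def make_T(C):
    """Candidate solution T(t) = K + C*exp(-k*t) of dT/dt = -k*(T - K)."""
    return lambda t: K + C * math.exp(-k * t)

def residual(T, t):
    """Left side minus right side of the DE; close to 0 if T is a solution."""
    return derivative(T, t) - (-k * (T(t) - K))

for C in (1.0, -1.0, -39.0):
    T = make_T(C)
    worst = max(abs(residual(T, t)) for t in (0.0, 1.0, 2.5, 10.0))
    print(f"C = {C:6.1f}: largest residual = {worst:.2e}")
```

The residuals are not exactly zero because the derivative is only approximated, but they should be tiny for every C, which is what "any choice of constant" means in practice.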

Now, the big problem: HOW DO WE FIND THE SOLUTIONS?

PS:

S: Are DEs really that important? I mean, do they really pop up in real problems or are they just toys for the mathematicians to play with?

P: It's a curious thing, but many real problems are more easily stated in terms of rates of change ... or derivatives, and that gives rise to DEs. In trying to find some quantity (like the population N(t) or the distance s(t)) we often know something about its rate of change, like "the rate of change of population is 2% per year". Now it's up to us to find the quantity itself ... from the DE. For that reason we'll study a few simple types of DEs and their method of solution.

**The World's Simplest DE:**

If we're lucky enough to have a DE such as $\frac{ds}{dt} = .98\,t$, we just integrate both sides with respect to t. In this case, s(t) is simply an antiderivative or indefinite integral of .98 t: $s(t) = \int .98\,t\,dt = .98\,\frac{t^2}{2} + C$ where we've added a constant of integration, C. In general these "World's Simplest DEs" have the form: $\frac{dy}{dx} = f(x)$ and the solutions are just $y = \int f(x)\,dx$ where we'll add a constant C after performing the integration ... and the constant can have any value, so we have a whole family of solutions, one for each choice of the constant.

The DE $\frac{dy}{dx} = f(x)$ has solutions $y = \int f(x)\,dx + C$

Although not necessary, we've indicated the constant of integration "C" even before we've completed the integration ... just to remind ourselves that it must be there!

**Example:** A body falls from rest under the influence of a constant gravitational acceleration of .98 metres/second$^2$. Determine its velocity and position at any time t seconds.

**Solution:** Let v(t) be the downward velocity at time t, then $\frac{dv}{dt} = .98$. (Note that v is measured in m/s so $\frac{dv}{dt}$ is measured in m/s$^2$ and we're happy because that's what the gravitational acceleration is measured in!) Now we integrate to get $v(t) = \int .98\,dt = .98\,t + C$. We are given that, initially, at t = 0, the body is at rest meaning that v(0) = 0. This gives us the value of C: we substitute v = 0 and t = 0 and get 0 = 0 + C so C = 0, hence v(t) = .98 t m/s. Now let s(t) be the distance fallen (in metres) so that $\frac{ds}{dt} = v = .98\,t$, hence (integrating) we get $s(t) = .49\,t^2 + C_1$ (where we've called the constant of integration $C_1$ so as not to confuse it with the previous constant, C). At t = 0 the body hasn't fallen at all hence s(0) = 0 so we substitute s = 0, t = 0 and find that $0 = 0 + C_1$ so $C_1 = 0$ and the distance fallen is now: $s(t) = .49\,t^2$ metres.

Note: Had the gravitational acceleration been given as _g_ m/s$^2$ (rather than .98 m/s$^2$) we'd get $\frac{dv}{dt} = g$ and $v = g\,t$ and $s = \frac{1}{2}\,g\,t^2$ ... and this would be good on any planet (where _g_ might be different than on this planet).
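The two integrations in this example can be sketched in code. This is a minimal illustration under stated assumptions: the function names `fall` and `fall_stepwise` and the step count are hypothetical, and the default g is simply the value used in the text. The first function uses the formulas v = g t and s = ½ g t²; the second steps the two DEs forward in small time increments as a cross-check.

```python
def fall(t, g=0.98):
    """Velocity and distance fallen after t seconds, starting from rest."""
    v = g * t             # from dv/dt = g with v(0) = 0
    s = 0.5 * g * t ** 2  # from ds/dt = v with s(0) = 0
    return v, s

def fall_stepwise(t, g=0.98, steps=10_000):
    """Step dv/dt = g and ds/dt = v forward in time (assumed cross-check)."""
    dt = t / steps
    v = 0.0
    s = 0.0
    for _ in range(steps):
        s += (v + 0.5 * g * dt) * dt  # velocity at the middle of the step
        v += g * dt
    return v, s

v, s = fall(3.0)
v_num, s_num = fall_stepwise(3.0)
print("formula: ", v, s)
print("stepwise:", v_num, s_num)
```

Passing a different `g` gives the corresponding result for another planet, as the Note suggests.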

**SEPARABLE DIFFERENTIAL EQUATIONS:**

The world's second simplest DE has the form: $\frac{dy}{dx} = f(x)\,g(y)$ where the right-side can be factored into the product of a function only of y and a function only of x. These, too, are easy to solve:

**Example:** Solve $\frac{dy}{dx} = k\,\frac{y}{x}$.

**Solution:** We "separate the variables", collecting on the left-side all functions of y together with the "dy" and leaving the rest on the right-side together with the "dx", and rewrite the DE in the form: $\frac{dy}{y} = k\,\frac{dx}{x}$. Now we integrate each side: $\int\frac{dy}{y} = k\int\frac{dx}{x}$ hence $\ln|y| = k\ln|x| + C$. In many cases we can't find y explicitly, but here we can ... and we do. Exponentiating each side we get $|y| = e^{k\ln|x| + C} = e^C e^{\ln|x|^k} = C_1|x|^k$. This gives our family of solutions: $|y| = C_1|x|^k$ where, since the constant of integration, C, was arbitrary, then so is $e^C$ which we've rewritten as $C_1$. Note, however, that $C_1 = e^C$ implies that $C_1$ is positive (since $e^C$ is always positive). Note, too, that if $|y| = C_1|x|^k$, then $y = \pm C_1|x|^k$ (a property of the absolute value!). Hence we can replace | y | by y if we admit positive and negative constants $C_1$. We do this, and now have our solutions: $y = C_2|x|^k$ where, of course, $C_2$ can be any constant, either positive or negative. If we knew that x was positive (as would be the case if x measured the size of a person's head!), then | x | = x (a property of the absolute value!) and we could write our solutions as: $y = Ax^k$ (where using "A" as the name of our "arbitrary constant" looks nicer than using $C_2$).

PS:

S: Whoa! That's real confusing, isn't it? I mean, am I supposed to be able to do all that ... like talk about why we can drop the absolute value signs and all that stuff?

P: In many cases you'll know beforehand whether the quantities are positive, then you can just write $\int\frac{dy}{y} = \ln y$ and forget about using ln | y |. Sometimes, however, the quantities can take on negative values (perhaps x is a temperature and it's below zero) so you have to keep the absolute values in order not to eliminate any solutions. For example, had we written the solutions as $y = Ax^k$ and x were negative and k = 1/2 then we'd be in big trouble wouldn't we?

S: We would?

P: Sure, because you'd write $y = A\sqrt{x}$ as the solution and x is negative and that's bad news, right? In fact, you'd really want to write your solution as $y = A\sqrt{|x|}$ so y would be defined even for negative x-values.

S: Okay, but tell me ... how can you just "collect on the left-side all functions of y together with the dy"? Is that legal? I mean, you're breaking into two pieces, the "dy" and the "dx" and stuff like that. Can you do that?

P: That's the technique for solving these types of "separable" DEs (i.e. DEs where you can "separate the variables"). I can do it differently if you'd like. Suppose I write $\frac{dy}{dx} = k\,\frac{y}{x}$ as $\frac{1}{y}\frac{dy}{dx} = \frac{k}{x}$ then I recognize the left-side as the derivative of ln | y | with respect to x. I can then rewrite the DE as: $\frac{d}{dx}\ln|y| = \frac{k}{x}$. Now I say: "If $\frac{d}{dx}\ln|y|$ is $\frac{k}{x}$, then ln | y | must be an antiderivative, or indefinite integral, of $\frac{k}{x}$", so I'd put $\ln|y| = \int\frac{k}{x}\,dx = k\ln|x| + C$ and I'd get the same answer, right?

S: I'm sorry I asked. I think I'll stick with "collecting the y's with the dy" .

The separable DE $\frac{dy}{dx} = f(x)\,g(y)$ has solutions satisfying $\int\frac{dy}{g(y)} = \int f(x)\,dx + C$

**Example:** Assuming that the population of a species increases according to $\frac{dN}{dt} = kN$, determine N(t).

**Solution:** This is a separable DE so we "separate the variables", writing: $\frac{dN}{N} = k\,dt$ then integrate each side to get $\int\frac{dN}{N} = \int k\,dt$ or $\ln N = kt + C$ (where we know that N > 0 so we omit the absolute value sign). Exponentiating each side gives: $N = e^{kt+C} = e^C e^{kt} = Ae^{kt}$ (where we've replaced the "arbitrary constant" $e^C$ by A). We conclude that the population grows exponentially. In fact, if N = 10,000 at t = 0, then $10{,}000 = Ae^0 = A$ and we've identified the constant A for this particular population, so we'd write $N(t) = 10{,}000\,e^{kt}$.

S: Yeah, you've found A but you haven't found k.

P: The DE had a constant k and the solution adds another constant C (or A) so we'd need two pieces of information about our population in order to solve for these two constants. One piece was given, namely the initial population of 10,000 and that enabled us to determine A. What other piece of info would you like to provide?

S: Huh?

P: Suppose we actually count the population after, say, 5 years, and find it to be 13,000. Then we'd have N(0) = 10,000 (the initial population) and N(5) = 13,000 and that'd give us two equations to solve for the two constants A and k, namely: $10{,}000 = Ae^0$ and $13{,}000 = Ae^{5k}$. The first gives A = 10,000 and the second gives $e^{5k} = \frac{13{,}000}{10{,}000} = 1.3$ so $5k = \ln(1.3)$ so $k = \frac{\ln(1.3)}{5}$ and $N(t) = Ae^{kt} = 10{,}000\,e^{(t/5)\ln 1.3} = 10{,}000\,e^{\ln(1.3)^{t/5}} = 10{,}000\,(1.3)^{t/5}$ and there you have it!

S: Have what? What'd you do up there in the exponent?

P: I just used properties of logarithms and exponents: $q\ln p = \ln p^q$ is one, and $e^{\ln p} = p$ is another, so whenever I see $e^{a\ln b}$ I write the exponent as $\ln b^a$ (using the first property) and then $e^{\ln b^a} = b^a$ (using the second property). But watch this: we can find k from the equation $e^{5k} = 1.3$ (as we did above), but what we really want is $e^{kt}$ (not k itself). From $e^{5k} = 1.3$ we get $e^k = (1.3)^{1/5}$ so $e^{kt} = (1.3)^{t/5}$. See how simple?
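The two-measurement calculation above can be sketched as a short program. The helper name `fit_exponential` is hypothetical; the numbers are the ones in the dialogue (10,000 at t = 0 and 13,000 at t = 5). The sketch also shows that the two ways of writing the answer, $Ae^{kt}$ and $10{,}000\,(1.3)^{t/5}$, agree.

```python
import math

def fit_exponential(n0, t1, n1):
    """Return (A, k) so that N(t) = A*exp(k*t) has N(0) = n0 and N(t1) = n1."""
    A = n0                        # from n0 = A * e^0
    k = math.log(n1 / n0) / t1    # from e^(k*t1) = n1 / n0
    return A, k

A, k = fit_exponential(10_000, 5, 13_000)

def N(t):
    """Population written as A * e^(k t)."""
    return A * math.exp(k * t)

def N_power(t):
    """The same population written as 10,000 * (1.3)^(t/5)."""
    return 10_000 * 1.3 ** (t / 5)

print("k =", k)
for t in (0, 5, 7, 20):
    print(t, N(t), N_power(t))
```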

**Example:** The population of a certain species of clam is estimated at 10,000,000 then, a year later, at 17,000,000. What would the expected population be after 10 years?

**Solution:** We'll call the population N(t) after a time of t years, and we'll assume it satisfies the differential equation $\frac{dN}{dt} = kN$ where k is some as-yet-unknown constant. The solution (as obtained earlier) is $N(t) = Ae^{kt}$. We have two constants and two pieces of information: $N(0) = 10{,}000{,}000 = Ae^0$ and $N(1) = 17{,}000{,}000 = Ae^k$. We solve for A = 10,000,000 and $e^k = 1.7$ so that $N(t) = 10{,}000{,}000\,(e^k)^t = 10{,}000{,}000\,(1.7)^t$. When t = 10 we get a population of $N(10) = 10{,}000{,}000\,(1.7)^{10}$ which is about $2 \times 10^9$ clams.

S: That's a lot of clams!

P: That's exponential growth for you. Of course, maybe the population doesn't satisfy $\frac{dN}{dt} = kN$ (which describes exponential growth). In fact, we'd expect that after a while there would be too many clams and too little food for them to eat and they'd die off like ... uh, flies. The DE doesn't take that into account; it just assumes a rate of growth that keeps on going and going. Maybe what we need is a DE whose solutions grow, then level off at some value. That's maybe what one would expect, right? See the diagram? N(t) increases rapidly at first, then begins to level off and approaches some value: K clams. Further, if N is too large, say N > K, then N should decrease (because the clams would die off due to lack of food). There is some population, namely N = K, where the population remains steady. Too many clams and they die off. Too few and they increase in number. What kind of DE would have such a solution?

S: I haven't the foggiest.

We start with $\frac{dN}{dt}$ = _something_ and this _something_ should be positive when N < K (because the solution is increasing) and negative when N > K. The simplest such DE would be $\frac{dN}{dt} = K - N$ or, to make it more general we might consider $\frac{dN}{dt} = k(K - N)$ where k > 0 is some constant. To solve this DE we would separate the variables, writing $\frac{dN}{K - N} = k\,dt$, then integrate to get $-\ln|K - N| = kt + C$ so $|K - N| = e^{-kt}e^{-C}$ so $K - N = \pm e^{-C}e^{-kt} = Ae^{-kt}$ hence we have our solution: $N(t) = K - Ae^{-kt}$ and we notice that $N(t) \to K$ as $t \to \infty$ as we expected. Further, if the initial population is given as N(0) at t = 0, we'd want N(0) = K - A so A = K - N(0) and, finally, $N(t) = K - (K - N(0))\,e^{-kt}$. Although this has some of the properties we want, it also suggests that $N(t) \to K$ as $t \to \infty$ even if there are initially no clams! (i.e. N(0) = 0). That's not good. If there are no clams at t = 0 there will _always_ be no clams.
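A few numbers make the flaw in this first model visible. In the sketch below, the function name `limited_growth` and the values of K and k are assumptions chosen only for illustration. It evaluates the solution of dN/dt = k(K - N) for three starting populations, including N(0) = 0.

```python
import math

def limited_growth(t, N0, K, k):
    """N(t) = K - (K - N0)*exp(-k*t), the solution of dN/dt = k*(K - N)."""
    return K - (K - N0) * math.exp(-k * t)

K, k = 100.0, 0.5   # assumed illustrative values
for N0 in (0.0, 50.0, 150.0):
    values = [round(limited_growth(t, N0, K, k), 2) for t in (0, 2, 5, 20)]
    print(f"N(0) = {N0:5.1f}: N at t = 0, 2, 5, 20 ->", values)
```

Every row drifts toward K, including the row that starts with no clams at all, which is exactly the behaviour the text objects to.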

What we want, then, is a DE where N = 0 is also a solution. One of the simplest is $\frac{dN}{dt} = -kN(N - K)$ where k and K are constants.

P: See? If N < K then $\frac{dN}{dt} > 0$ and when N > K then $\frac{dN}{dt} < 0$ and N(t) = 0 is a solution as well. Do you recognize this DE?

S: Nope.

P: It's the logistic equation. Note how nice it is. When the population is small, then N - K is approximately -K (neglecting "N" compared to "K") so the DE looks like $\frac{dN}{dt} = -kN(-K) = kKN$ which, as we've seen, has exponentially growing solutions and that's exactly what we want. After all, we'd expect populations to grow exponentially until there are too many individuals for the environment to support ... limited food supply, etc. Let's solve it ... which shouldn't be too difficult since it's separable.

We rewrite the DE as: $\frac{dN}{N(N-K)} = -k\,dt$ and integrate to get $\int\frac{dN}{N(N-K)} = -kt + C$. The integral on the left-side isn't one we've met before, but we can write $\frac{1}{N(N-K)} = \frac{1}{K}\left(\frac{1}{N-K} - \frac{1}{N}\right)$ and integrate these simpler expressions. We'd get: $\int\frac{dN}{N(N-K)} = \frac{1}{K}\left(\ln|N-K| - \ln N\right) = \frac{1}{K}\ln\frac{|N-K|}{N}$ where we've used log A - log B = log $\frac{A}{B}$. Note that we kept the absolute value sign around N - K because N could be greater or less than K. Okay, now we have $\frac{1}{K}\ln\frac{|N-K|}{N} = -kt + C$ so $\ln\frac{|N-K|}{N} = -kKt + C_1$ (where we write the arbitrary constant KC as $C_1$) and now we exponentiate to get $\frac{|N-K|}{N} = e^{C_1}e^{-kKt} = Ae^{-kKt}$ (where, again, we've relabelled our arbitrary constant). Now $A = e^{C_1}$ is always positive but we can eliminate the absolute value sign to get $\frac{N-K}{N} = \pm Ae^{-kKt}$ and then absorb the ± into the constant A, letting it be either positive or negative, and then we solve for N(t) via: $1 - \frac{K}{N} = Ae^{-kKt}$ so $\frac{K}{N} = 1 - Ae^{-kKt}$ so, finally, $N(t) = \frac{K}{1 - Ae^{-kKt}}$ and we can even absorb another (-) into A and write: $N(t) = \frac{K}{1 + Ae^{-kKt}}$ as the solution to the logistic equation, where k, K and A are constants. We note that $\lim_{t\to\infty} N(t) = K$ (since $e^{-kKt} \to 0$) so the eventual population does indeed approach this constant, limiting value.

S: You're doing a lot of this "absorbing" with your constants. Is that legal?

P: Well, I could write ±A = B (i.e. use a different symbol) then later get $N(t) = \frac{K}{1 - Be^{-kKt}} = \frac{K}{1 + Ce^{-kKt}}$ using C to represent -B. See? Some people start off with their first constant labelled $C_1$, then introduce $C_2$, $C_3$, etc. as needed. Come to think of it, you should do that, just so you don't get confused.

S: Another thing. How'd you integrate that $\frac{1}{N(N-K)}$? You didn't teach me that ... did you?

P: You're right, I didn't. But let's do it in the next lecture ... you need a rest.
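The logistic solution obtained in this lecture, $N(t) = \frac{K}{1 + Ae^{-kKt}}$, can be checked against a direct step-by-step integration of the DE. The sketch below is an illustration only: the function names, the parameter values, and the use of a classical Runge-Kutta stepper are assumptions, and A is fixed from the starting population via N(0) = K/(1 + A).

```python
import math

def logistic_exact(t, N0, K, k):
    """Closed form N(t) = K / (1 + A*exp(-k*K*t)), with A fixed by N(0) = N0."""
    A = K / N0 - 1.0   # from N0 = K / (1 + A); requires N0 != 0
    return K / (1.0 + A * math.exp(-k * K * t))

def logistic_rk4(t_end, N0, K, k, steps=2000):
    """Integrate dN/dt = -k*N*(N - K) with the classical Runge-Kutta method."""
    def f(N):
        return -k * N * (N - K)
    h = t_end / steps
    N = N0
    for _ in range(steps):
        k1 = f(N)
        k2 = f(N + 0.5 * h * k1)
        k3 = f(N + 0.5 * h * k2)
        k4 = f(N + h * k3)
        N += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return N

K, k = 100.0, 0.002   # assumed illustrative values, so k*K = 0.2
for N0 in (10.0, 150.0):
    print(N0, logistic_exact(20.0, N0, K, k), logistic_rk4(20.0, N0, K, k))
```

Starting below K the population rises toward K, starting above K it falls toward K, and in both cases the formula and the stepwise integration should agree to many digits.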

LECTURE 2

MORE ON DIFFERENTIAL EQUATIONS

**a Little Partial Fractions:**

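A fraction whose numerator is a polynomial of degree at most one and whose denominator is a product of two different linear factors can always be written in the form $\frac{\textit{something}}{\text{first factor}} + \frac{\textit{something}}{\text{second factor}}$ where the two _something_ s are constants. To determine what they are, just bring the right-side to a common denominator, add, and make sure it agrees with the original fraction.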

**Example:** Find constants P and Q so that $\frac{2x - 3}{(x+1)(x-2)} = \frac{P}{x+1} + \frac{Q}{x-2}$.

**Solution:** First we bring the right-side to a common denominator and add: $\frac{P}{x+1} + \frac{Q}{x-2} = \frac{(P+Q)x + Q - 2P}{(x+1)(x-2)}$ and note that the common denominator is just the original denominator. (It always happens that way!) Then we choose P and Q so the numerators are also equal: 2x - 3 = (P+Q) x + Q - 2P which requires 2 = P+Q and -3 = Q-2P. We solve these two equations in two unknowns for P and Q to get: P = 5/3 and Q = 1/3. (We also check that P+Q = 2 and Q-2P = -3 so we know we have the solution!) Hence we have: $\frac{2x - 3}{(x+1)(x-2)} = \frac{5/3}{x+1} + \frac{1/3}{x-2}$ so, for example, we can now easily integrate $\int\frac{2x - 3}{(x+1)(x-2)}\,dx = \frac{5}{3}\ln|x+1| + \frac{1}{3}\ln|x-2|$.

S: You forgot the "+C". Anyway, is that what you call "partial fractions"?

P: Any rational function $\frac{p(x)}{q(x)}$ where the degree of p(x) is less than the degree of the polynomial q(x) can be written as a sum of fractions, each fraction having as its denominator a factor of q(x). That makes it much easier to integrate, right? And when you bring the sum to a common denominator and add, the common denominator is just q(x) so you just have to match up the numerator so it's p(x). It's fun. Would you like to see more of this partial fractions stuff?

S: Not unless it'll be on the final exam.
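The partial fractions example can be double-checked with exact rational arithmetic. This is a small sketch, with the helper names `lhs` and `rhs` and the sample x-values chosen only for illustration; it solves the two equations P + Q = 2 and Q - 2P = -3 and then compares both sides of the identity at a few points.

```python
from fractions import Fraction

# Matching numerators: 2x - 3 = (P + Q)x + (Q - 2P), so
#    P + Q =  2
#  -2P + Q = -3
# Subtracting the second equation from the first gives 3P = 5.
P = Fraction(2 - (-3), 3)
Q = 2 - P

def lhs(x):
    """Original fraction (2x - 3) / ((x + 1)(x - 2))."""
    return Fraction(2 * x - 3) / ((x + 1) * (x - 2))

def rhs(x):
    """Partial fractions form P/(x + 1) + Q/(x - 2)."""
    return P / (x + 1) + Q / (x - 2)

print("P =", P, " Q =", Q)
for x in (0, 1, 3, 5, -4):   # avoid x = -1 and x = 2, where the fractions blow up
    print(x, lhs(x), rhs(x))
```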

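**Example:** Solve the DE: $\frac{dy}{dx} = \frac{y^2 - 1}{x}$.

**Solution:** We separate and write $\frac{dy}{y^2 - 1} = \frac{dx}{x}$ then integrate each side to get $\int\frac{dy}{y^2 - 1} = \ln|x| + C_1$. To integrate the left-side, we write $\frac{1}{y^2 - 1} = \frac{1}{(y-1)(y+1)} = \frac{A}{y-1} + \frac{B}{y+1}$ and add the fractions on the right-side to get $\frac{(A+B)y + (A-B)}{(y-1)(y+1)}$ and match this with $\frac{1}{(y-1)(y+1)}$ so we'd need A+B = 0 and A-B = 1 and we'd solve for $A = \frac{1}{2}$ and $B = -\frac{1}{2}$ then we'd integrate: $\int\frac{dy}{y^2 - 1} = \int\left(\frac{1/2}{y-1} - \frac{1/2}{y+1}\right)dy = \frac{1}{2}\ln|y-1| - \frac{1}{2}\ln|y+1|$ so that $\int\frac{dy}{y^2 - 1} = \ln|x| + C_1$ becomes $\frac{1}{2}\ln|y-1| - \frac{1}{2}\ln|y+1| = \ln|x| + C_1$ which can also be written $\ln\left|\frac{y-1}{y+1}\right| = \ln x^2 + C$ (where $C = 2C_1$ and we've written $2\ln|x| = \ln|x|^2 = \ln x^2$ because clearly $x^2$ is positive and we can drop the absolute value sign). Finally we exponentiate each side and get: $\frac{y-1}{y+1} = \pm e^C e^{\ln x^2} = Kx^2$ and we solve for $y = \frac{1 + Kx^2}{1 - Kx^2}$ where K is any constant.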

S: Am I supposed to be able to do that?

P: Yes.

S: Mamma mia!
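One way to gain confidence in an answer like this is to substitute it back. The sketch below assumes the DE of the example is $\frac{dy}{dx} = \frac{y^2 - 1}{x}$ with solution $y = \frac{1 + Kx^2}{1 - Kx^2}$; the helper names, the K-values and the sample points are illustrative choices. It estimates dy/dx numerically and compares it with the right-side of the DE.

```python
def y(x, K):
    """Candidate solution y = (1 + K x^2) / (1 - K x^2)."""
    return (1 + K * x ** 2) / (1 - K * x ** 2)

def dydx(x, K, h=1e-6):
    """Central-difference estimate of dy/dx."""
    return (y(x + h, K) - y(x - h, K)) / (2 * h)

def residual(x, K):
    """dy/dx minus (y^2 - 1)/x; close to 0 if y solves the DE."""
    return dydx(x, K) - (y(x, K) ** 2 - 1) / x

for K in (-2.0, 0.5, 3.0):
    worst = max(abs(residual(x, K)) for x in (0.2, 0.4, 2.0))
    print(f"K = {K:4.1f}: largest residual = {worst:.2e}")
```

The sample points deliberately avoid x = 0 and any x where $1 - Kx^2 = 0$, since neither the DE nor the formula makes sense there.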

**Example:** A spherical mothball evaporates at a rate proportional to its surface area. If it starts with a radius of 2 cm and is reduced to .6 cm after 10 hours, how long will it take to evaporate completely?

**Solution:** We have assumed that $\frac{dV}{dt} = kA$ (since the rate of evaporation, $\frac{dV}{dt}$, is proportional to the surface area A). But the volume of the spherical mothball is $V = \frac{4}{3}\pi r^3$ and its surface area is $A = 4\pi r^2$ where r is its radius, so we can write: $\frac{d}{dt}\left(\frac{4}{3}\pi r^3\right) = k\,(4\pi r^2)$ or $4\pi r^2 \frac{dr}{dt} = k\,(4\pi r^2)$ so that $\frac{dr}{dt} = k$, a constant (which is obviously negative if r is to decrease!). We then solve this world's simplest DE: $\frac{dr}{dt} = k$ to get r(t) = kt + C. We have two constants k and C and need two pieces of information. One is that r(0) = 2 cm; that gives the equation:

2 = 0 + C. The second piece of information is that r(10) = .6 and that gives the second equation: .6 = 10k + C. We solve two equations in two unknowns to get C = 2 and k = - .14 (which, as expected, is negative). Hence

r(t) = - .14 t + 2 (and we check to confirm that r(0) = 2 and r(10) = .6). The mothball evaporates completely when

r = - .14 t + 2 = 0 and that gives t ≈ 14.3 hours.

**Example:** The population of a certain species of clam is estimated at 10,000,000 then, a year later, at 17,000,000, then a year later at 21,000,000. What would the expected population of clams be after 10 years? Assume a logistic growth pattern: $\frac{dN}{dt} = -kN(N-K)$.

**Solution:** The solution to the logistic equation is: $N(t) = \frac{K}{1 + A e^{-kKt}}$. There are three constants A, k and K and, fortunately, three pieces of information given. We have $N(0) = 10{,}000{,}000 = \frac{K}{1+A}$ and $N(1) = 17{,}000{,}000 = \frac{K}{1+Ae^{-kK}}$ and $N(2) = 21{,}000{,}000 = \frac{K}{1+Ae^{-2kK}}$ and from these three equations we have to determine the three constants A, k and K in order to compute $N(10) = \frac{K}{1+Ae^{-10kK}}$. Before proceeding, let's agree to measure clams in millions, then the three equations can be written more simply: $\frac{K}{1+A} = 10$ and $\frac{K}{1+Ae^{-kK}} = 17$

and $\frac{K}{1+Ae^{-2kK}} = 21$. In fact, to make life even simpler, let's give $e^{-kK}$ a simpler name: let $x = e^{-kK}$ so now our equations read: $\frac{K}{1+A} = 10$, $\frac{K}{1+Ax} = 17$, $\frac{K}{1+Ax^2} = 21$

or perhaps: (1) $1+A = \frac{K}{10}$ (2) $1+Ax = \frac{K}{17}$ (3) $1+Ax^2 = \frac{K}{21}$.

Then (1) - (2) gives ...... (4) $A(1-x) = \frac{7}{170}K$ and

(2) - (3) gives ......... (5) $Ax(1-x) = \frac{4}{357}K$

and (5)/(4) eliminates both A and K and leaves a single equation for $x = \frac{4/357}{7/170} = \frac{40}{147}$. Substitute into, say (1) and (2) and get two equations for A and K which, when solved, gives: $A = \frac{1029}{790}$ and $K = \frac{1819}{79}$. We then have

$N(10) = \frac{K}{1+Ax^{10}} = \frac{1819/79}{1 + \frac{1029}{790}\left(\frac{40}{147}\right)^{10}} \approx 23.0$ which means there should be about 23,000,000 clams after 10 years.
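The elimination above is easy to get wrong by hand, so here is a minimal sketch that repeats it with exact fractions. It assumes the logistic solution has the form N(t) = K/(1 + A x^t) with x = e^(-kK) and populations measured in millions; the variable names are illustrative.

```python
from fractions import Fraction

# Clam counts (in millions) at t = 0, 1, 2 years.
N0, N1, N2 = Fraction(10), Fraction(17), Fraction(21)

# Equations (4) and (5): A(1-x) = K(1/N0 - 1/N1) and Ax(1-x) = K(1/N1 - 1/N2).
# Dividing (5) by (4) eliminates A and K and leaves x.
x = (1 / N1 - 1 / N2) / (1 / N0 - 1 / N1)

# Put A = K/N0 - 1 (from equation (1)) into (4) and solve for K.
K = (1 - x) / ((1 - x) / N0 - (1 / N0 - 1 / N1))
A = K / N0 - 1

# Population after 10 years, in millions.
N10 = K / (1 + A * x**10)

print(x, A, K, float(N10))
```

Working with `Fraction` keeps x, A and K exact, so the only rounding happens when N(10) is converted to a decimal at the very end.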

S: Wow! That's a lot of work, isn't it?

P: And a lot of clams. But look at the number of clams: about 23,000,000. Now look at the value of $K = \frac{1819}{79} \approx 23.025$ million. See? K gives the eventual population, so after 10 years the clam population has pretty well stabilized at its so-called "carrying capacity" of 23 million. Note that this is a far cry from the previous estimate, when we used the DE $\frac{dN}{dt} = kN$. That DE gave exponential growth and some 219 clams in 10 years, enough to cover the globe and ..

S: Should I know anything about clams?

P: You should be able to reproduce what I've done ... given enough time. You should certainly be able to understand what I've done.

S: I'm in big trouble.

**EXPONENTIAL DECAY** **:**

The equation $\frac{dN}{dt} = kN$ for population growth may be written $\frac{dN}{dt} = aN - bN$ where aN is the rate of increase in population due to births (and is assumed proportional to the current population) and bN is the rate of decrease due to deaths, so k = a - b. If a > b then k > 0 and the population increases, but if a < b (meaning the death rate exceeds the birth rate) then k < 0 and the solution $N(t) = Ae^{kt}$ decreases to a limiting value of zero. This is exponential decay.

Radioactive substances also "decay", emitting atomic particles and changing into other elements. Radium, for example, decays into lead and carbon 14 decays into something (which is no longer carbon 14!). In each case the DE which governs the decay is $\frac{dM}{dt} = -kM$ where M(t) is the amount of substance (radium or carbon 14, etc.) left at time t and we've actually put a (-) out front so we could recognize decreasing solutions, hence "decay". The solution is obtained by "separating the variables": $\frac{dM}{M} = -k\,dt$ so $\int \frac{dM}{M} = -kt + C$ so $\ln M = -kt + C$ so $M = e^C e^{-kt}$ which we write as: $M(t) = M(0)\,e^{-kt}$ where M(0) is the initial amount (at time t = 0).

**Example:** Carbon 14 (denoted by $^{14}$C) decays radioactively, any original amount reducing to half after about 5600 years (called the "half-life"). For a plant, it is assumed that the ratio of $^{14}$C to $^{12}$C is constant* while the plant is living, but that $^{14}$C begins to decay as soon as the plant dies (so the ratio decreases). An Egyptian scroll is discovered in which the ratio of $^{14}$C to $^{12}$C is .6 of the value it would have in similar material today. Estimate the age of the scroll.

**Solution:** The DE governing the radioactive decay of $^{14}$C is $\frac{dM}{dt} = -kM$ with solutions $M(t) = M(0)\,e^{-kt}$ where the arbitrary constant is M(0) (the initial amount of $^{14}$C, measured in kg or any convenient mass unit) and t is measured in years from the time that the radioactive decay began. When t = 5600 the amount of $^{14}$C is $\frac{M(0)}{2}$ so we have $\frac{M(0)}{2} = M(0)\,e^{-5600k}$ and we can solve for $e^{-k} = \left(\frac{1}{2}\right)^{1/5600}$ and the amount of $^{14}$C at any time t can now be written: $M(t) = M(0)\left(\frac{1}{2}\right)^{t/5600}$ (and we check that we do indeed get $\frac{M(0)}{2}$ when t = 5600). After t years the amount of $^{14}$C in the scroll is only .6 of what it would have been at t = 0 so we write $.6\,M(0) = M(0)\left(\frac{1}{2}\right)^{t/5600}$ and determine t, the age of the scroll. Cancelling M(0) and taking the ln of each side and solving gives:

$t = -5600\,\frac{\ln .6}{\ln 2} \approx 4100$ years. In other words it takes 4100 years to reduce the ratio to .6 and this seems reasonable because, in 5600 years, the ratio would be reduced to 1/2.
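The arithmetic for the scroll takes only a few lines to reproduce. This is a small sketch assuming the decay law M(t) = M(0)(1/2)^(t/5600); the variable names are illustrative, and the half-life of 5600 years is the value quoted in the example.

```python
import math

half_life = 5600.0   # years, as quoted in the example
ratio = 0.6          # fraction of carbon 14 remaining in the scroll

# Solve ratio = (1/2)**(age / half_life) for the age.
age = -half_life * math.log(ratio) / math.log(2)

# Check: putting the age back into the decay law should return the ratio.
remaining = 0.5 ** (age / half_life)

print(round(age), remaining)
```

The computed age is a little over 4100 years, which rounds to the estimate above.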

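**Example:** Show that the amount of a radioactive substance is $M(t) = M(0)\left(\frac{1}{2}\right)^{t/T}$ where T is the half-life.

**Solution:** The solutions to $\frac{dM}{dt} = -kM$ are $M(t) = M(0)\,e^{-kt}$ and $M(T) = \frac{M(0)}{2} = M(0)\,e^{-kT}$ gives

$e^{-k} = \left(\frac{1}{2}\right)^{1/T}$ so that $M(t) = M(0)\left(\frac{1}{2}\right)^{t/T}$.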

P: Are you paying attention? See? The number $\frac{1}{2}$ is here more pertinent than the number "e".

S: What? I wasn't listening.

P: Remember when I said that "e" occurs in real problems in the form $e^{ax}$, not just $e^x$, and that $e^{ax}$ can always be written $10^{bx}$ or $2^{cx}$ or $\pi^{dx}$ or whatever. That is, you can always find numbers "b" or "c" or "d" so the base is almost anything you want ... and that makes "e" seem less significant, right?

S: If you say so.

Newton's Law of Cooling

If it is assumed that the temperature of a hot object cools at a rate proportional to how much hotter it is than its surroundings, then we get Newton's Law of Cooling: $\frac{dT}{dt} = -k\,(T - K)$ where K is the temperature of the surroundings (called the _ambient temperature_ ). This is a separable DE and can be easily solved: $\frac{dT}{T-K} = -k\,dt$

so $\int \frac{dT}{T-K} = -kt + C$ so $\ln|T - K| = -kt + C$ so $|T - K| = e^{-kt+C} = e^C e^{-kt} = A e^{-kt}$ so $T - K = \pm A e^{-kt}$ so (absorbing the ± into the arbitrary constant "A") we get $T(t) = K + A e^{-kt}$. Note that $\lim_{t\to\infty} T(t) = K$ and the object eventually cools to the temperature of its surroundings. Note, too, that there are three constants in the solution: K, A and k so we'd need three pieces of information in order to determine T(t).

**Example:** Your professor is found dead in his office, after the final exam. The coroner arrives at 9:00 a.m. and records the body temperature as 30.1˚ C. One hour later she notes the body temperature has dropped to 29.2˚ C. When did the _accident_ take place? (Assume Newton's law of cooling and a room temperature of 20˚ C. Note that normal body temperature is 37˚ C.)

**Solution:** The temperature at a time t hours after the accident is given by $T(t) = K + A e^{-kt}$. We know several things:

(1) the surroundings are at room temperature, so $K = 20$ and

(2) initially $T(0) = K + A = 37$ and

(3) if 9 a.m. is U hours after the accident then $T(U) = K + A e^{-kU} = 30.1$ and

(4) one hour later $T(U+1) = K + A e^{-k(U+1)} = 29.2$

Although we have four equations, that's necessary because we've had to introduce a fourth constant, U. We solve for the four constants K, A, k and U: K = 20 (from (1)) so A = 17 (from (2)) and $e^{-kU} = \frac{30.1 - 20}{17} = \frac{10.1}{17}$ and $e^{-k(U+1)} = \frac{29.2 - 20}{17} = \frac{9.2}{17}$. Dividing the last two equations eliminates U and gives $e^{-k} = \frac{9.2/17}{10.1/17} = \frac{9.2}{10.1}$ so $e^{-kU} = \frac{10.1}{17}$ can be rewritten $(e^{-k})^U = \frac{10.1}{17}$ or $\left(\frac{9.2}{10.1}\right)^U = \frac{10.1}{17}$. Now take the ln of each side to get

$U \ln\frac{9.2}{10.1} = \ln\frac{10.1}{17}$ hence $U = \frac{\ln(10.1/17)}{\ln(9.2/10.1)} \approx 5.58$ so this many hours had passed before the temperature dropped from 37˚ to 30.1˚. That places the accident 5.58 hours before 9 a.m., namely at 3:25 a.m.
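The coroner's calculation can be checked with a short script. This sketch assumes Newton's law in the form T(t) = K + A e^(-kt), with the room temperature, normal body temperature and two readings given in the example; the variable names are illustrative.

```python
import math

room = 20.0           # K, ambient temperature
A = 37.0 - room       # from T(0) = K + A = 37

# e^{-kU} and e^{-k(U+1)} from the two readings at 9 a.m. and 10 a.m.
e_kU = (30.1 - room) / A
e_kU1 = (29.2 - room) / A

# Dividing the two eliminates U and leaves e^{-k}.
e_k = e_kU1 / e_kU

# (e^{-k})**U = e^{-kU}  =>  U = ln(e^{-kU}) / ln(e^{-k})
U = math.log(e_kU) / math.log(e_k)

# Convert "U hours before 9 a.m." into a clock time.
clock = 9.0 - U
hour = int(clock)
minute = round((clock - hour) * 60)

print(round(U, 2), f"{hour}:{minute:02d} a.m.")
```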

S: Why was the prof in his office at 3 in the morning?

P: Working, working, working ...

S: That'll be the day.

LECTURE 3

MORE ON DIFFERENTIAL EQUATIONS

PS:

S: This DE stuff is pretty neat, but what if the DE is something I can't solve. I mean, what if ...

P: Don't worry, most DEs can't be "solved" in the sense of being able to express the solutions in terms of known functions.

S: You've said that before ... I think.

P: You've got a good memory. Consider the world's simplest DE: $\frac{dy}{dx} = f(x)$. The solutions are $y = \int f(x)\,dx + C$ and if I can't integrate every function f(x) in terms of known functions then I can't solve the DE in terms of known functions. But that doesn't mean it doesn't have solutions, only that I can't express ...

S: ... the solutions in terms of known functions. Yeah, I can see that. But what about me? Am I supposed to be able to solve ...

P: There are only two types of DEs you'll see in this course ... well, maybe three types if you include the world's simplest, $\frac{dy}{dx} = f(x)$, which is really a problem in evaluating an integral. In fact, sometimes the solutions to DEs are actually called integrals, did you know that? And solving is sometimes called integrating a DE, did you know that?

S: Three types? Did you say three?

P: Pay attention. We'll talk about the next type in a minute but first let me say something about a DE of the form

$\frac{dy}{dx}$ = something involving x and y which, except for very simple right-sides, we won't be able to solve in terms of known functions. Nevertheless, we can say something about these and we can actually sketch solutions and maybe that's all we need to know.

S: Huh?

Direction of Solutions

**Example:** Sketch the solution to the DE: $\frac{dy}{dx} = -\frac{x}{y}$ which passes through the point (2,2).

**Solution:** At the point (2,2) the solution has slope $\frac{dy}{dx} = -\frac{2}{2} = -1$, so we draw a tiny piece of the solution: a short line segment with slope -1. That brings us to another, nearby point and we compute the slope there as well, using $\frac{dy}{dx} = -\frac{x}{y}$ and, as before, we draw a tiny piece of the solution with this slope bringing us to another nearby point ... and we repeat the process, always moving in the direction given by the DE. The result will be a curve as shown in the figure.

[Figure: the solution curve through (2,2), sketched by following the slopes given by the DE.]

In fact, the solution through (2,2) will move so that, wherever it finds itself, its direction is always $-\frac{x}{y}$. If you look at the solution you'll see that the slope becomes more and more negative as y decreases to zero ... as required by the DE: $\frac{dy}{dx} = -\frac{x}{y}$. In fact, you can just as easily sketch the solution through any chosen point and ...

S: Hold on! I can solve that DE ... it's separable! The solution is ... uh, I separate variables and get y dy = - x dx so I integrate and get $\int y\,dy = -\int x\,dx$ and that gives me $\frac{y^2}{2} = -\frac{x^2}{2} + C$ or I could just write $x^2 + y^2 = 2C$ and you can plainly see that the solutions are circles!

P: And what about the curve I sketched through (2,2). Doesn't it look like a circle?

S: Yeah, but you didn't have to do that. I mean ...

P: Okay, you sketch the solution to $\frac{dy}{dx} = x^2 + y^2$ which passes through (0,1).

S: But I can't solve the DE!

P: Precisely my point. This technique of "following the direction", as given by the DE, will work even if you can't "solve" the DE. Even if you can solve the DE it's sometimes easier and more instructive to "follow the direction" given by the DE ... which we can call the "DE Slope".

Given a DE of the form $\frac{dy}{dx} = f(x,y)$ = _a function of both x and y_ , it is sometimes useful to fill the x-y plane with tiny pieces of solutions: pick a bunch of points (x,y) and draw a short straight line segment located at (x,y) with a slope given by f(x,y). In fact, from $\frac{dy}{dx} = f(x,y)$ we have, approximately, $\frac{\Delta y}{\Delta x} \approx f(x,y)$ so $\Delta y \approx f(x,y)\,\Delta x$ and that gives the change in y for a small change in x ... so we can "follow the DE".
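"Following the DE" is something a computer does happily. The sketch below is one possible way to do it, assuming small equal steps in x and the update Δy ≈ f(x,y) Δx described above (this stepping scheme is commonly called Euler's method). The function name `follow`, the step size and the number of steps are illustrative choices. It is tried on dy/dx = -x/y starting at (2,2), where the exact solution is the circle x² + y² = 8.

```python
def follow(f, x, y, dx, steps):
    """Follow the DE slope f(x, y) from (x, y), taking `steps` steps of size dx."""
    points = [(x, y)]
    for _ in range(steps):
        y += f(x, y) * dx   # change in y is approximately f(x, y) * dx
        x += dx
        points.append((x, y))
    return points


# dy/dx = -x/y through (2, 2): the exact solution lies on x^2 + y^2 = 8.
path = follow(lambda x, y: -x / y, 2.0, 2.0, 0.001, 500)
x_end, y_end = path[-1]

print(x_end, y_end, x_end**2 + y_end**2)
```

The printed value of x² + y² stays close to 8, but not exactly 8: each tiny straight piece drifts slightly off the circle, and smaller steps reduce the drift.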

This collection of "solution pieces" is sometimes called a DIRECTION FIELD, but we'll just call it **the DE PORTRAIT** since it gives at a glance the personality of the DE and its solutions. After all, a portrait is worth a thousand words ... or even a thousand equations.

**Example:** Sketch the DE PORTRAIT for $\frac{dy}{dx} = x^2 + y^2$.

**Solution:** We can take a piece of the x-y plane, say the square -2 ≤ x ≤ 2, -2 ≤ y ≤ 2, and pick a gridwork of points in this square and at each point compute the value of $x^2 + y^2$ and draw at that point a tiny line segment with this slope. This will give the portrait shown on the left, below:

Now we can pick a variety of starting points and sketch solutions to the DE through the selected point, being careful to follow the direction given by the portrait. This has been done above, at the right.

**Example:** Sketch the DE portrait for the Logistic Equation $\frac{dN}{dt} = N(5 - N)$ as well as some solutions.

**Solution:** Before we begin, let's try to predict the behaviour of solutions by inspecting the DE. Notice that, when 0 < N < 5 we have $\frac{dN}{dt} > 0$ so the population N(t) increases: the graph of solutions will have a positive slope. Also, when N > 5, note that $\frac{dN}{dt} < 0$ so N(t) decreases and that should be reflected in the graph of N(t) versus t. Note, too, that N(t) = 5 is an exact solution since $\frac{dN}{dt} = 0$ and 5 (5 - 5) = 0 so N(t) = 5 satisfies the DE. This solution is described graphically by a horizontal line.

The Portrait is shown in the figure, as well as several solutions.

[Figure: DE portrait for the Logistic Equation, with several solutions.]

Since the Logistic Equation is a model for population growth, we note with pleasure that for initial populations greater than 5, the population decreases and eventually approaches 5. Initial populations smaller than 5 grow and again have 5 as their limiting value. As mentioned earlier, 5 is called the "carrying capacity" and indicates a population in balance with its food supply and environment.

We might also note that small populations increase rapidly until they reach half of the carrying capacity (in this case, 2.5) then the rate of change of population decreases, i.e. $\frac{dN}{dt}$ decreases, meaning $\frac{d^2N}{dt^2}$ is negative, meaning the curve becomes concave down.

It should be clear that the DE Portrait really tells almost everything you'd want to know about the solutions to a DE.

S: But that's a lot of work! I mean, you have to pick dozens of points and calculate the slopes from the DE ... these DE Slopes ... and plot all those tiny lines. That's hard!

P: Not hard, just tedious. But if you're clever ... and I know you are ... you can sometimes simplify these calculations. For example, instead of picking a bunch of points (x,y) and determining the DE slope at each, you can pick the slope first, then determine the points which give that particular slope.

S: Huh?

P: Take, for example, $\frac{dy}{dx} = x^2 + y^2$. We could ask "Where are all the points with DE slope = 1?" and the answer would be "Everywhere where $x^2 + y^2 = 1$", meaning every point on this circle has DE slope = 1 so you'd just have to draw this circle and place a bunch of tiny lines with slope = 1 on the circumference. Nice, eh? And the points with DE slope = 2? Everywhere on the circle $x^2 + y^2 = 2$. And ...

S: A picture is worth ...

P: Here's a picture.

[Figure: circles $x^2 + y^2$ = constant, each carrying tiny line segments with the corresponding DE slope.]

If you delete the circles (which shouldn't be confused with the solutions to the DE!) then you'd be left with the DE Portrait.

**LINEAR FIRST ORDER DEs** **:**

Differential equations are classified according to the highest derivative that occurs.

• $\frac{dy}{dx} = x + y^2$ is a first order DE because the highest derivative is the first.

• + + etx = 0 is a second order DE since the highest derivative is the second.

In addition to the order of a DE, we can classify them as LINEAR and NONLINEAR. Linear 2nd order DEs can always be put into the form: _something_ $\frac{d^2y}{dx^2}$ + _something_ $\frac{dy}{dx}$ + _something_ y = _something_ where all those _something_ s are functions of the independent variable x and NOT of the unknown function y or its derivatives (else it'd be called nonlinear). A SECOND ORDER LINEAR DE has the form: $A(x)\frac{d^2y}{dx^2} + B(x)\frac{dy}{dx} + C(x)\,y = D(x)$ and a FIRST ORDER LINEAR DE has the form: $A(x)\frac{dy}{dx} + B(x)\,y = C(x)$ and so on. If a DE isn't linear, it's nonlinear. Usually, first order DEs are easier to solve than second, which are easier than third, etc. and linear are easier than nonlinear ... but that's not always true.

**Example:** Solve the second order nonlinear DE: $2xy\frac{d^2y}{dx^2} + 2x\left(\frac{dy}{dx}\right)^2 + 4y\frac{dy}{dx} = e^x$.

**Solution:** We recognize that the left-side is just the second derivative of $xy^2$ so we can rewrite the DE as:

$\frac{d^2}{dx^2}\left(xy^2\right) = e^x$ hence we integrate each side to get $\frac{d}{dx}\left(xy^2\right) = e^x + A$ (adding a constant of integration, A) and (integrating once more) gives the solution as $xy^2 = e^x + Ax + B$ (adding another constant of integration, B).

S: What! We recognize the left-side as the second derivative of $xy^2$! I wouldn't recognize ...

P: I was just kidding. I actually invented the solution before I invented the DE. I just took $xy^2 = e^x + Ax + B$ and differentiated each side twice. But see how easy it was to solve? And it's second order and nonlinear! On the other hand, what appears as a simple first order DE, $\frac{dy}{dx} = x + y^2$, is impossible to solve in terms of well-known functions. But if a first order DE happens to be linear, then we can always solve it.

First order linear DEs always have the form $A(x)\frac{dy}{dx} + B(x)\,y = C(x)$ but before we proceed we put it into "standard" form by dividing through by A(x) giving $\frac{dy}{dx} + \frac{B(x)}{A(x)}\,y = \frac{C(x)}{A(x)}$ and we'll let $\frac{B(x)}{A(x)} = P(x)$ and $\frac{C(x)}{A(x)} = Q(x)$ so our first order linear DE now has the "standard" form: $\frac{dy}{dx} + P(x)\,y = Q(x)$. We illustrate the method of solution with an example:

**Example:** Solve $\frac{dy}{dx} + \frac{1}{x}\,y = \frac{e^{-2x}}{x}$.

**Solution:** Multiply through by x to get $x\frac{dy}{dx} + y = e^{-2x}$. Now recognize the left-side as the first derivative of xy, so we rewrite our DE as: $\frac{d}{dx}(xy) = e^{-2x}$ and then we integrate each side to get: $xy = -\frac{1}{2}e^{-2x} + C$ (adding a constant of integration, C) so we have our solution and, in this example, we could also solve for $y = \frac{C - \frac{1}{2}e^{-2x}}{x}$.

S: There you go again ... recognizing the left-side as the derivative of something! How would I recognize that? And how did you know to multiply by x? And why did you ...

P: Okay, pay attention. Here's the technique ... and it's very clever.

Starting with $\frac{dy}{dx} + P(x)\,y = Q(x)$ (the linear first order DE in "standard form") we want to multiply by _something_ so that the resulting left-side is precisely the derivative of ( _something_ times y). That is, we multiply by, say, μ(x), giving $\mu(x)\frac{dy}{dx} + \mu(x)P(x)\,y = \mu(x)Q(x)$ and insist that the left-side $\mu(x)\frac{dy}{dx} + \mu(x)P(x)\,y$ is exactly the derivative: $\frac{d}{dx}\left(\mu(x)\,y\right)$. But $\frac{d}{dx}\left(\mu(x)\,y\right) = \mu(x)\frac{dy}{dx} + y\frac{d\mu}{dx}$ so we'd stare at $\mu(x)\frac{dy}{dx} + \mu(x)P(x)\,y$ and see that we'd need $y\frac{d\mu}{dx} = \mu(x)P(x)\,y$ and that means that μ(x) must be chosen so that $\frac{d\mu}{dx} = \mu(x)P(x)$ which, although it's a DE to solve for μ(x), it's a _separable_ DE. We'd get: $\frac{d\mu}{\mu} = P\,dx$ so that $\int\frac{d\mu}{\mu} = \int P\,dx$ so that $\ln|\mu| = \int P\,dx$ and that means we can choose $\mu(x) = e^{\int P(x)dx}$. After multiplying by this function, μ(x), the DE can be rewritten $\frac{d}{dx}(\mu y) = \mu Q$ so we'd just integrate each side: $\mu y = \int \mu Q\,dx$ and add an arbitrary constant ... and we'd be finished.

The linear first order DE: $\frac{dy}{dx} + P(x)\,y = Q(x)$ can be solved by first multiplying by

$\mu(x) = e^{\int P(x)dx}$.

It then becomes $\frac{d}{dx}\left(\mu(x)\,y\right) = \mu(x)\,Q(x)$ and can be integrated directly.

S: Hold on! First, you forgot to add an arbitrary constant when you integrated P. You should have written

$\ln|\mu| = \int P\,dx + C$. Secondly, you ...

P: No. I only want some μ(x) that works ... one will do. I don't need every possible solution to the DE: $\frac{d\mu}{dx} = \mu P$, just one. Besides, if I did add a constant I'd get $\mu(x) = e^{\int P(x)dx + C} = e^C e^{\int P(x)dx}$ so it'd just be multiplying my μ(x) by some constant, $e^C$, and hence I'd be multiplying the original DE by such a constant. No need to do that. See?

S: Let me try one:

P: Okay, solve: $x^2\frac{dy}{dx} + xy = xe^{-2x}$.

S: Right! First I'd find $\mu = e^{\int P(x)dx}$ and P(x) = x (the coefficient of the y-term, right?) so $\mu = e^{\int x\,dx} = e^{x^2/2}$ then I'd multiply the DE by this guy and get $x^2e^{x^2/2}\frac{dy}{dx} + xe^{x^2/2}\,y = xe^{-2x}$ (which looks pretty awful) and the left-side is exactly $\frac{d}{dx}(\mu y)$ so I'd rewrite the DE as $\frac{d}{dx}\left(e^{x^2/2}\,y\right) = xe^{-2x}$ then I'd integrate each side ... hey! That's tough, right? I don't know how to integrate ...

P: And no wonder ... you made so many mistakes. First off you forgot to put the DE into "standard" form (where the coefficient of $\frac{dy}{dx}$ is just "1"). That means you'd first have to divide by $x^2$ and that'd give you $\frac{dy}{dx} + \frac{1}{x}\,y = \frac{e^{-2x}}{x}$ which you should recognize because I've just done that one! Besides that, you said the left-side is exactly $\frac{d}{dx}(\mu y)$ and you didn't even bother to check that! In fact, had you checked to see if $x^2e^{x^2/2}\frac{dy}{dx} + xe^{x^2/2}\,y$ was really $\frac{d}{dx}\left(e^{x^2/2}\,y\right)$ you'd find it wasn't, so you'd know that you'd made a mistake. So here's a piece of advice: after you think you've found μ(x), multiply the DE by this μ(x) (and don't forget to multiply the right-side too ... and you DID forget ... and that's another mistake!) then CHECK TO SEE IF THE LEFT-SIDE IS $\frac{d}{dx}(\mu y)$. If it is, then you've got yourself a correct integrating factor.

S: A correct what!

P: Oh, I forgot to tell you. The function μ(x) is called an "integrating factor" because it allows us to "integrate" the DE. Did I tell you that? Solving a DE is sometimes called ...

S: Yeah ... integrating the DE ... I know, I know. Give me another one.

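P: Okay, solve $\sin x\,\frac{dy}{dx} + (\cos x)\,y = \tan x$.

S: Right! First I divide through by sin x and get $\frac{dy}{dx} + \frac{\cos x}{\sin x}\,y = \frac{\tan x}{\sin x}$ then I'd pick out $P(x) = \frac{\cos x}{\sin x}$ and I'd integrate it to get $\int\frac{\cos x}{\sin x}\,dx$ = ... ?!$# Can I do that? ... uh, yes, I'd let u = sin x so that $du = \frac{du}{dx}\,dx = \cos x\,dx$ and the integral turns into $\int\frac{du}{u} = \ln|u|$ and I'd forget the "+C" because ... uh, can't remember why ... then I'd have $\int\frac{\cos x}{\sin x}\,dx = \ln|\sin x|$ so I'd multiply through by this and ...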

P: You'd what?!

S: Oh, forgot ... first I'd write $\mu = e^{\int P(x)dx} = e^{\ln|\sin x|}$ which is ... uh, don't tell me ... it's just | sin x | ... that's one of those log things ... then I'd multiply the DE by this guy ... where's the DE? Oh yeah, I'd multiply $\sin x\,\frac{dy}{dx} + (\cos x)\,y = \tan x$ by |sin x| and I'd get ...

P: You'd what?!

S: Oh, sorry, I need the other DE, right? I mean, I'd multiply $\frac{dy}{dx} + \frac{\cos x}{\sin x}\,y = \frac{\tan x}{\sin x}$ by |sin x| and I'd get ... wow, that ain't easy, is it? Can't I just drop the absolute value thing? Sure, why not. I just multiply by sin x and I'd change the DE into ... uh, it becomes $\sin x\,\frac{dy}{dx} + (\cos x)\,y = \tan x$. Hey! That's what I started with! What's happening here?

P: And is the left-side exactly the derivative of ... of what?

S: Oh yeah, I know. It should be the derivative of μ y or (sin x) y and that's $(\sin x)\frac{dy}{dx} + (\cos x)\,y$ and it IS! How'm I doin' boss?

P: Keep going, you're not finished.

S: Okay, the DE after I've fixed it up is $\frac{d}{dx}\left((\sin x)\,y\right) = \tan x$ ... and if I were smart I'd have recognized it right off the bat ... so I'd integrate each side and I'd get $(\sin x)\,y = \int\tan x\,dx = \ln|\sec x|$ ... I actually remembered that one ... and I'd be finished, right? I mean, that's the solution, right?

P: Wrong. Where's your arbitrary constant?

S: I decided to drop it. You did, remember? You said you only wanted one solution, any one, so you just upped and left out the +C.

P: That's because I wanted only one integrating factor, μ(x). For the DE I want ALL solutions, not just one.

S: Well, I have one for you and it's $y = \frac{\ln|\sec x|}{\sin x}$. Like it?

P: It's called a "particular solution" as opposed to the "general solution", and yes, I like it very much, but is it a solution? Remember, you decided to drop the absolute value sign. How would you check it?

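S: I haven't the foggiest. Oh, wait, I just plug it in, right? I mean, I just put $y = \frac{\ln|\sec x|}{\sin x}$ and see if

$\sin x\,\frac{dy}{dx} + (\cos x)\,y = \tan x$. Okay, I get $\frac{dy}{dx} = \frac{\sin x\tan x - \cos x\,\ln|\sec x|}{\sin^2 x}$ and I'd substitute into the DE and I'd get $\sin x\left(\frac{\sin x\tan x - \cos x\,\ln|\sec x|}{\sin^2 x}\right) + \cos x\left(\frac{\ln|\sec x|}{\sin x}\right)$ on the left-side and this had better be tan x when the smoke has cleared ... uh, is it? I leave it as an exercise for the prof.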
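For anyone who would rather not leave the exercise to the prof, here is a hedged numerical check. It assumes the candidate solution is y = (ln|sec x| + C)/sin x (the student's answer is the case C = 0) and estimates dy/dx with a central difference; the function names and sample points are illustrative. The sample points include one where sin x is negative, to see whether dropping the absolute value sign did any harm.

```python
import math


def y(x, C=0.0):
    """Candidate solution y = (ln|sec x| + C) / sin x."""
    return (math.log(abs(1.0 / math.cos(x))) + C) / math.sin(x)


def left_side(x, C=0.0, h=1e-6):
    """sin x * dy/dx + cos x * y, with dy/dx from a central difference."""
    dydx = (y(x + h, C) - y(x - h, C)) / (2 * h)
    return math.sin(x) * dydx + math.cos(x) * y(x, C)


# Compare the left-side with tan x at a few points (including sin x < 0).
worst_error = max(
    abs(left_side(x, C) - math.tan(x))
    for C in (0.0, 2.5)
    for x in (-0.5, 0.3, 0.8, 1.2, 2.0)
)
print(worst_error)
```

At these sample points the left-side agrees with tan x to within the accuracy of the finite difference, for both values of C tried.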

When solving DEs you often resort to a "Table of Integrals". In the following, you may select from the following table (which omits the +C):

= arctan = arcsin = _ln_

= = _ln_

**Example:** An object stands in a field and heats and cools as the day progresses. If the air temperature varies according to $K = 20 + 5\sin\frac{\pi t}{12}$ degrees Celsius (hence varies from 15˚C to 25˚C over a 24 hour period), determine the temperature of the object as a function of time t (measured in hours) assuming Newton's Law of Cooling.

**Solution:** If T(t) is the temperature of the object at time t, then Newton's Law states: $\frac{dT}{dt} = -k\,(T - K)$ which would be separable if the air temperature K were constant, but it isn't! In fact, it's linear first order so we write it in "standard form": $\frac{dT}{dt} + kT = k\left(20 + 5\sin\frac{\pi t}{12}\right)$ where we've substituted for K. An integrating factor is $\mu(t) = e^{\int k\,dt} = e^{kt}$ and after multiplication by $e^{kt}$ the DE becomes $e^{kt}\frac{dT}{dt} + ke^{kt}\,T = k\left(20e^{kt} + 5e^{kt}\sin\frac{\pi t}{12}\right)$ and we check to see that the left-side is exactly $\frac{d}{dt}(\mu T) = \frac{d}{dt}\left(e^{kt}\,T\right)$ which it is (and we're happy). Hence we can rewrite the DE as:

$\frac{d}{dt}\left(e^{kt}\,T\right) = k\left(20e^{kt} + 5e^{kt}\sin\frac{\pi t}{12}\right)$ and integrate each side to get (with the help of the table of integrals above):

$e^{kt}\,T = 20e^{kt} + 5k\,\frac{e^{kt}\left(k\sin\frac{\pi t}{12} - \frac{\pi}{12}\cos\frac{\pi t}{12}\right)}{k^2 + (\pi/12)^2} + C$ or, more simply

$T(t) = 20 + \frac{5k\left(k\sin\frac{\pi t}{12} - \frac{\pi}{12}\cos\frac{\pi t}{12}\right)}{k^2 + (\pi/12)^2} + Ce^{-kt}$.

S: Are you finished?

P: Sure. I wanted to get the temperature and I got it.

S: But what about the +C? Who is C? You gotta find C!

P: It's impossible without more information. After all, suppose the object was originally at 200˚, then the solution would be quite different than if it were originally at - 150˚. See? The constant C will depend upon ...

S: Okay, put the object in the sun at ... uh, 100˚ Celsius. Then what?

P: Then at t = 0 I'd put T = 100 and get $100 = 20 - \frac{5k\,(\pi/12)}{k^2 + (\pi/12)^2} + C$ and that'd give me C, see? But actually it matters little because whatever C is, the term $Ce^{-kt} \to 0$ and the temperature becomes just

$T(t) = 20 + \frac{5k\left(k\sin\frac{\pi t}{12} - \frac{\pi}{12}\cos\frac{\pi t}{12}\right)}{k^2 + (\pi/12)^2}$ and you get this variation in temperature of the object no matter what temperature it started with ... if you wait long enough.

S: Seems you have to wait forever for $Ce^{-kt} \to 0$, right?

P: Oh well, the math is just an approximation anyway. When this term is just a fraction of a degree we can ignore it ... and that might just be an hour or two.

S: The math is just an approximation? You always told me that ...

P: Look, how good is Newton's Law of Cooling in this situation? And what happens if the object is in the shade, under a tree, then the sun hits it, so the temperature isn't proportional to the air temperature as Newton's Law assumes, but the object gains heat directly from the sun. Does the math know that? No! And what if there's a wind which cools the object and what if we were to take into account the heating and cooling from the ground and what if ...

S: Okay, okay, but if the math is only an approximation then what good is it? I can just say the temperature of the object is about 20˚ and that'd be an approximation too, right? And I wouldn't need anybody's law of cooling for that.

P: You're right, of course, but I'd bet a tidy sum that my approximation is much better than yours ... and that's what the math is doing for us. Remember, we have to use some cerebral prowess as well as turning a mathematical crank. When we counted the number of clams we got $10^{19}$ or something like that. It was a ridiculous answer and we should know that ... so our DE provided a poor description of the population growth ... so we changed the DE to the logistic equation and got a much more reasonable answer, but it was still an estimate after all. If you didn't care about how good the estimate was, you could certainly say "I think there'd be roughly 100 million clams" and leave it at that.

S: What about the object in the field? Do you think you've got a good estimate of the temperature?

P: Let's compute some values. Our temperature was $T(t) = 20 + \frac{5k\left(k\sin\frac{\pi t}{12} - \frac{\pi}{12}\cos\frac{\pi t}{12}\right)}{k^2 + (\pi/12)^2}$ ... after some time when the effects of the initial temperature, contained in the term $Ce^{-kt}$, die out. Okay, how big does T get, and how small? I wouldn't trust this result if T varied from - 10˚ to over 100˚. So tell me, how big does T get?

S: Are you kidding? That's tough! I mean, how big does $\frac{5k\left(k\sin\frac{\pi t}{12} - \frac{\pi}{12}\cos\frac{\pi t}{12}\right)}{k^2 + (\pi/12)^2}$ get?

P: It only looks tough because of all those symbols. Let's make it simpler, in fact we'll take a specific example, say y = 3 sin x + 4 cos x. How big does that get? Let's plot it.

Does it look familiar?

S: It gets as big as 5!

P: And it looks like a sine curve, but shifted. It could be A sin (x + B) if I could pick the numbers A and B correctly.

S: But you just wanted to know how big it is ... and I already told you. It gets as big as 5!

P: Pay attention. We're going to learn something exciting here. We'd like to have 3 sin x + 4 cos x = A sin (x + B), if we can. The left-side has separate sines and cosines so we'd want that on the right too, so we'd write

A sin (x+B) = A (sin x cos B + cos x sin B) using a well-known trig formula ...

S: Well-known to you maybe, but ...

P: ... so we now have to find numbers A and B such that 3 sin x + 4 cos x = (A cos B) sin x + (A sin B) cos x so we have two equations in two unknowns, namely: A cos B = 3 and A sin B = 4. To solve we could divide and eliminate A getting $\tan B = \frac{4}{3}$ and that'd give us B (if we wanted it, but we don't because we're really more interested in A, the amplitude of A sin (x+B)) so to find A we can square and add: $(A\cos B)^2 + (A\sin B)^2 = 3^2 + 4^2$ because that'd get rid of B (since $\cos^2 B + \sin^2 B = 1$). We then get $A^2 = 3^2 + 4^2$ so that $A = \sqrt{3^2 + 4^2}$ and ...

S: See! I told you it was 5 but you weren't paying attention!

P: I'm much more interested in the result of this analysis. It says that the expression 3 sin x + 4 cos x can be written in the form $\sqrt{3^2 + 4^2}\,\sin(x+B)$ and that means it gets as large as $\sqrt{3^2 + 4^2}$ and as small as $-\sqrt{3^2 + 4^2}$ and, in general, ...

S: You keep saying $\sqrt{3^2 + 4^2}$ but I'm telling you it's 5! Aren't you any good at arithmetic?

P: I keep the 3 and 4 separate so I can see what happens in general ... and this is what happens:

$P\sin t + Q\cos t = \sqrt{P^2 + Q^2}\,\sin(t + B)$ for some number B

Now that's a nice trig identity!
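The identity is easy to test numerically. The sketch below assumes the amplitude is √(P² + Q²) and picks B so that cos B = P/A and sin B = Q/A, which is what `math.atan2(Q, P)` returns; the function name `amplitude_phase` and the sample points are illustrative.

```python
import math


def amplitude_phase(P, Q):
    """Return (A, B) with P sin t + Q cos t = A sin(t + B)."""
    A = math.hypot(P, Q)   # sqrt(P^2 + Q^2)
    B = math.atan2(Q, P)   # angle with A cos B = P and A sin B = Q
    return A, B


A, B = amplitude_phase(3.0, 4.0)

# Compare both sides of the identity at many values of t.
worst_gap = max(
    abs(3 * math.sin(t) + 4 * math.cos(t) - A * math.sin(t + B))
    for t in [0.1 * i for i in range(100)]
)
print(A, B, worst_gap)
```

For P = 3 and Q = 4 the amplitude comes out as 5, just as the student insisted.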

S: How about the thing that's waiting in the field? How hot does it get?

P: You figure it out.

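S: Okay, I just need to find P and Q and they're ... uh, we were asking how big $k\sin\frac{\pi t}{12} - \frac{\pi}{12}\cos\frac{\pi t}{12}$ got, so P = k and $Q = -\frac{\pi}{12}$ so it gets as big as $\sqrt{k^2 + (\pi/12)^2}$ ... but I don't know k, do I?

P: No, but let's be careful. The temperature was $T(t) = 20 + \frac{5k\left(k\sin\frac{\pi t}{12} - \frac{\pi}{12}\cos\frac{\pi t}{12}\right)}{k^2 + (\pi/12)^2}$ and so gets as large as $20 + \frac{5k\sqrt{k^2 + (\pi/12)^2}}{k^2 + (\pi/12)^2}$ which is the same as $20 + \frac{5k}{\sqrt{k^2 + (\pi/12)^2}}$. To find k we'd need to have more ...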

S: Yeah, I know, you need more information . What more do you want?

P: Just one temperature reading should do it because there's just one constant we need, namely k. But we needn't go out with our thermometer because we just wanted to know if our expression for T(t) should be trusted. Notice that it varies sinusoidally between $20 + \frac{5k}{\sqrt{k^2 + (\pi/12)^2}}$ and $20 - \frac{5k}{\sqrt{k^2 + (\pi/12)^2}}$, much like the air temperature does. Remember, the air temperature was $K = 20 + 5\sin\frac{\pi t}{12}$ and varied from 20 + 5 to 20 - 5. In fact the term $\frac{5k}{\sqrt{k^2 + (\pi/12)^2}}$ isn't even as large as 5, no matter what value k has.

S: Huh?

P: Don't you see? $\sqrt{k^2 + (\pi/12)^2}$ is larger than $\sqrt{k^2}$ which is just k, so $\frac{k}{\sqrt{k^2 + (\pi/12)^2}}$ is smaller than 1 so the temperature variations of $\frac{5k}{\sqrt{k^2 + (\pi/12)^2}}$ are actually smaller than 5 (1) = 5. Nice, eh? I think we may have something here. Doesn't it give you some faith in the analysis?

S: Say, do I have to know this for the final exam?

P: Before we leave this, let's pretend we've measured the maximum temperature of the object in the field and it's 23˚ Celsius ... and we assume the object has been sitting there for some time so we can ignore the $Ce^{-kt}$ term in the solution. Then $20 + \frac{5k}{\sqrt{k^2 + (\pi/12)^2}} = 23$ so we can solve for k and it's ... uh, let's see, it's ...

S: Let me do it. I'd get $\frac{5k}{\sqrt{k^2 + (\pi/12)^2}} = 3$ so I'd square and get $\frac{25k^2}{k^2 + (\pi/12)^2} = 9$ so

$25k^2 = 9k^2 + 9\left(\frac{\pi}{12}\right)^2$ so $16k^2 = 9\left(\frac{\pi}{12}\right)^2$ and $k = \frac{3}{4}\cdot\frac{\pi}{12} = \frac{\pi}{16}$. Good?

P: Good. Now let me plot the temperature of the object and the temperature of the air, but first let me stick in the value $k = \frac{\pi}{16}$ and get the object temperature as $T(t) = 20 + \frac{9}{5}\sin\frac{\pi t}{12} - \frac{12}{5}\cos\frac{\pi t}{12}$:

Nice, eh? The object lags behind, in temperature, by ... it looks like about 4 hours. Now that's something I'd have some faith in, this lagging business ... and the object has temperature swings which are smaller than the surrounding air ... and the variations have a 24 hour period just like the air and ...
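The size of the lag can be estimated without reading it off a plot. This is a sketch under stated assumptions: the air temperature is taken as 20 + 5 sin(πt/12) (a 24 hour period with t in hours), k = π/16 is the value found from the 23˚ maximum, and the long-run object temperature is 20 + a sin(wt) + b cos(wt) with the coefficients computed below. The names `w`, `sin_coeff`, `cos_coeff` and `lag_hours` are illustrative.

```python
import math

w = math.pi / 12   # angular frequency of the air temperature (24 hour period)
k = math.pi / 16   # cooling constant found from the 23 degree maximum

# Long-run object temperature: 20 + sin_coeff*sin(w t) + cos_coeff*cos(w t).
sin_coeff = 5 * k * k / (k * k + w * w)
cos_coeff = -5 * k * w / (k * k + w * w)

# Rewrite as 20 + amplitude * sin(w t + phase) using the trig identity.
amplitude = math.hypot(sin_coeff, cos_coeff)
phase = math.atan2(cos_coeff, sin_coeff)

# A negative phase means the object trails the air by -phase / w hours.
lag_hours = -phase / w

print(sin_coeff, cos_coeff, amplitude, lag_hours)
```

Under these assumptions the amplitude is 3 degrees (so the maximum is 23˚, as measured) and the lag comes out a little over three and a half hours, consistent with "about 4 hours" read from a plot.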

S: Okay, I get it, you want me to be impressed with what the math is saying. So, I'm impressed. But I'll tell you one thing: you say the object has temperature swings which are smaller than the air, but you assumed that the maximum temperature of the object was 23˚ so no wonder! If you had assumed the maximum object temperature was 30˚ then ...

P: Try it!

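S: Okay, I'd want $20 + \frac{5k}{\sqrt{k^2 + (\pi/12)^2}} = 30$ which means I'd get $\frac{5k}{\sqrt{k^2 + (\pi/12)^2}} = 10$ so I'd divide by 5 and

square and get $\frac{k^2}{k^2 + (\pi/12)^2} = 4$ so $k^2 = -\frac{4}{3}\left(\frac{\pi}{12}\right)^2$ and ... uh, how's that possible? I mean, $k^2$ is negative.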

P: See how smart the math is? It's impossible for the object to have a temperature greater than the maximum air temperature ... and the math is telling you that.

S: It is? Do you believe it?

P: I'm not sure. It's a prediction based upon this particular set of assumptions ... like Newton's law of cooling, for example. We'd have to go out into the field and do some measuring to see if the math is ...

S: Aha! The math may be no good!

P: No, the math is always good ... it only does what it's told. You say "assume Newton's Law of Cooling" and it does. It's the physicist or engineer or biologist who may be no good at making reasonable assumptions, hence at generating a good MATHEMATICAL MODEL ... and that's what these DEs are ... just so-called mathematical models of how things behave, and they may or may not be any good.
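The arithmetic in this exchange can be checked numerically. The sketch below is only an illustration: it assumes the air temperature completes one cycle every 24 hours (angular frequency π/12 per hour) and that the object's steady-state swing is 5k/√(k² + (π/12)²). The function names are made up for this example.

```python
import math

OMEGA = math.pi / 12  # assumed angular frequency: one cycle per 24 hours

def amplitude(k):
    """Steady-state swing of the object temperature for cooling constant k."""
    return 5 * k / math.sqrt(k**2 + OMEGA**2)

def k_from_max_temp(t_max):
    """Solve 20 + amplitude(k) = t_max for k; return None if no real k exists."""
    a = t_max - 20                       # required amplitude
    # a = 5k/sqrt(k^2 + w^2)  =>  k^2 = a^2 w^2 / (25 - a^2)
    denom = 25 - a**2
    if denom <= 0:
        return None                      # k^2 would be negative (or infinite)
    return a * OMEGA / math.sqrt(denom)

k = k_from_max_temp(23)
lag_hours = math.atan2(OMEGA, k) / OMEGA  # phase lag converted to hours
print(k, math.pi / 16)        # the two values agree
print(amplitude(k))           # close to 3
print(lag_hours)              # roughly 3.5 hours
print(k_from_max_temp(30))    # None: no real k gives a 30 degree maximum
```

Under these assumptions the computed lag of about three and a half hours is consistent with the "about 4 hours" read off the plot.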

**Example:** Water flows into Lake Ontario at the rate of A _metres³/day_ (from rivers and rain, etc. as well as from liquid industrial waste) and this water has an average pollution concentration of B _kg/metres³_. Water (mixed with pollutants) is also withdrawn at the rate of C _metres³/day_. Describe, via a differential equation, the amount of pollutants at time t days (after measurements begin).

**Solution:** If P kg is the amount of pollutants in the water at time t, then the rate of change, measured in kg/day, is $\frac{dP}{dt}$. But water enters at A _metres³/day_ and has a pollutant concentration of B _kg/metres³_ so: ( _kg/day_ IN) = (A _metres³/day_) x (B _kg/metres³_) = AB. Further, if the polluted water leaves at C _metres³/day_ we can find the ( _kg/day_ OUT) if we know the _kg/metres³_ concentration. If Lake Ontario has a volume of V _metres³_ and the lake contains P _kg_ of pollutants, the average concentration is $\frac{P}{V}$ _kg/m³_. However, V may change with time. In fact, its rate of change is

$\frac{dV}{dt}$ = (A) _m³/day_ - (C) _m³/day_ and, unless A = C the volume will change. We solve this world's simplest DE for V(t): $\frac{dV}{dt} = A - C$ hence V = (A - C) t + _constant_ (where the _constant_ of integration won't be called C else we'd get it confused with the C _m³/day_!). If the lake initially (when measurements begin) has $V_0$ _metres³_ of water (mixed with pollutants) then $V_0$ = 0 + _constant_ and we conclude that the _constant_ = $V_0$ and the volume of the lake is $V(t) = (A - C)t + V_0$ _metres³_ at time t, so the average concentration of pollutants is $\frac{P}{V} = \frac{P}{(A-C)t + V_0}$ and we finally have our DE:

$$\frac{dP}{dt} = AB - \frac{C\,P}{(A-C)t + V_0}$$

which (surprise!) is a linear first order DE, easily recognizable if we write it as: $\frac{dP}{dt} + \frac{C}{(A-C)t + V_0}\,P = AB$.

S: I hope I don't see that on the final exam! Say, do you think we've ... uh, you've done a good job in finding a reasonable mathematical model?

P: Not really, but as a first try it'll give me an estimate which will probably be in the ball park. You see, I've used

(kg/day OUT) = $C\,\frac{P}{V}$ hence I've assumed an "average" pollutant concentration of

$\frac{P}{V} = \frac{P}{(A-C)t + V_0}$ which is like saying that the places where lake water is removed are places where the pollutant concentration is average. Not likely. Besides, I've assumed that the only source of pollutants is from water which enters the lake ... maybe there are other sources. Besides, I've assumed that every day is like every other day whereas I might improve the model by considering the A, B, C etc. to change with time. Besides ...

S: Okay, I get the idea.
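To get a feel for what the lake model predicts, here is a rough numerical sketch. It assumes the DE reads dP/dt = AB - C·P/((A - C)t + V0), which is what the derivation above leads to, and steps it forward with Euler's method. The function name and all the numbers are hypothetical, chosen only so the behaviour is easy to see.

```python
def simulate_pollutant(A, B, C, V0, P0, days, dt=0.01):
    """Euler steps for dP/dt = A*B - C*P/V(t), with V(t) = (A - C)*t + V0."""
    P, t = P0, 0.0
    for _ in range(int(round(days / dt))):
        V = (A - C) * t + V0            # lake volume at time t
        P += dt * (A * B - C * P / V)   # kg/day in, minus kg/day out
        t += dt
    return P

# Hypothetical values: inflow equals outflow, so the volume stays at V0.
P_end = simulate_pollutant(A=1.0, B=2.0, C=1.0, V0=10.0, P0=0.0, days=100)
print(P_end)   # approaches B*V0 = 20 when A = C
```

With A = C the pollutant load levels off when the lake's concentration P/V matches the incoming concentration B, which is what one would expect from the model.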

LECTURE 4

SEQUENCES AND SERIES

Sequences

PS:

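P: What's this series add up to? $1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots$

S: I hope it's a geometric series ... I have a formula for that ... let's see, $\frac{1/2}{1} = \frac{1/4}{1/2} = \frac{1/8}{1/4}$ so the ratios are all the same so it is a geometric series, so it adds up to $\frac{a}{1-r} = \frac{1}{1 - \frac{1}{2}} = 2$, right?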

P: Why 2? Why not 3?

S: I have a formula, that's why!

P: If you told somebody it "adds up to 2" you'd mean ... what?

S: I'd mean ... if you added up all the terms you'd get the number 2.

P: Can you add up all the terms? Is that possible? Remember, there are an infinite number of terms.

S: You're trying to tell me something, right?

P: How about the series: 0 + 0 + 0 + 0 + .... ?

S: Easy! It adds up to 0.

P: Why? There are an infinite number of terms, that's ∞, and each has the value 0, so shouldn't it add up to (∞)(0) ... and just what does that mean?

S: Okay, you're trying to tell me we need a definition ... a precise definition, of "adding up a series", right?

Consider the series $1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots$ where the first term is a = 1 and the _common ratio_ is $r = \frac{1}{2}$ and we want a definition of "the sum of an infinite series" so that, for this series, the sum would be $\frac{a}{1-r} = 2$. If we added the numbers on a calculator the sequence of numbers appearing in the display window of the calculator would be 1 then $1 + \frac{1}{2} = 1.5$ then $1 + \frac{1}{2} + \frac{1}{4} = 1.75$ then $1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} = 1.875$ and so on. If we expected to get a sum after adding an infinite number of terms we'd expect these numbers appearing in the calculator display window to approach some limiting value ... in this case the number "2". That is the basis for our definition:

For the infinite series $a_1 + a_2 + a_3 + a_4 + \cdots$ we construct the sequence of PARTIAL SUMS

$S_1 = a_1$, $S_2 = a_1 + a_2$, $S_3 = a_1 + a_2 + a_3$, etc. and if $\lim_{n\to\infty} S_n = L$, then we say

the infinite series "sums to L".

In other words: as we add the terms on a calculator, the numbers appearing in the calculator display are the "partial sums" and they should have a limiting value if the infinite series is to have a SUM. If they don't, the infinite series has no sum.
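The "calculator display" idea can be mimicked in a few lines. This is just a sketch of the definition, with a made-up helper name: it builds the list of partial sums for the two series discussed so far.

```python
from itertools import accumulate

def partial_sums(terms):
    """Return the list S1, S2, S3, ... for the given list of terms."""
    return list(accumulate(terms))

geometric = [0.5**k for k in range(20)]   # 1, 1/2, 1/4, ...
sums = partial_sums(geometric)
print(sums[:4])     # [1.0, 1.5, 1.75, 1.875]
print(sums[-1])     # very close to 2

zeros = partial_sums([0] * 20)
print(zeros[-1])    # 0: every partial sum is 0
```

Only finitely many terms are ever added; the "sum" of the infinite series is whatever these displayed numbers are heading toward.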

For the geometric series above, where a = 1 and $r = \frac{1}{2}$, it's fortunate that we have a formula for the sum of n terms: it's given by the formula $S_n = a\,\frac{1 - r^n}{1 - r} = \frac{1 - (1/2)^n}{1 - 1/2}$ which can be written $S_n = 2\left(1 - \frac{1}{2^n}\right)$. Clearly $\lim_{n\to\infty} S_n = 2$ (since $\lim_{n\to\infty} \frac{1}{2^n} = 0$) and we're happy that our definition agrees with this formula. For the series 0 + 0 + 0 + ... the PARTIAL SUMS are $S_1 = 0$ and $S_2 = 0 + 0 = 0$ and $S_3 = 0 + 0 + 0 = 0$ and, indeed, $S_n = 0$ no matter how many terms we add. Hence $\lim_{n\to\infty} S_n = \lim_{n\to\infty} 0 = 0$ and ...

S: But that's exactly what I said!

P: Aah, but now you can prove it! Nobody can argue that "the sum of the series is (∞)(0) hence doesn't exist".

S: You called this lecture "Sequences". Is that what we're studying? The "sequence" of partial sums? And is that so you can invent a definition for the "sum of an infinite series" ... using this "sequence" stuff?

P: Partially, but sequences occur from time to time without being associated with an infinite series, so they're worthwhile in their own right. See?

S: No.

P: Pay attention.

Examples:

• The sequence $A_n = A_0\left(1 + \frac{i}{100}\right)^n$ is the amount of money accumulated after n years, if $A_0$ dollars is initially invested at i% per annum.

• If a population of bacteria grows at 2% per day, then the sequence $B_n = B_0(1.02)^n$ gives the amount after n days.

• "Newton's method" provides the scheme $x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$ with n = 1, 2, 3, .... and $x_1$ given, which generates a sequence which (sometimes) converges to a root of f(x) = 0.

• In investigating the growth of a rabbit population, Fibonacci* generated a sequence $F_1, F_2, F_3, \ldots$ (called the "Fibonacci numbers") satisfying $F_{n+2} = F_{n+1} + F_n$ (each being the sum of the preceding two). They are: 1, 1, 2, 3, 5, 8, 13, ...

• The sequence of numbers $a_n = \left(1 + \frac{1}{n}\right)^n$ converges to the number e = 2.71828...
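A couple of the sequences listed above are easy to generate by machine. The sketch below is illustrative only: the function names are invented, and the choice f(x) = x² - 2 for Newton's method and B0 = 100 for the bacteria are hypothetical examples, not taken from the lecture.

```python
def newton_iterates(f, fprime, x1, count):
    """First `count` Newton iterates x1, x2, ... for f(x) = 0."""
    xs = [x1]
    for _ in range(count - 1):
        x = xs[-1]
        xs.append(x - f(x) / fprime(x))
    return xs

def fibonacci(count):
    """First `count` Fibonacci numbers, starting 1, 1."""
    fs = [1, 1]
    while len(fs) < count:
        fs.append(fs[-1] + fs[-2])
    return fs[:count]

# Hypothetical example: f(x) = x^2 - 2, whose positive root is sqrt(2).
roots = newton_iterates(lambda x: x * x - 2, lambda x: 2 * x, 1.0, 6)
print(roots[-1])           # close to 1.41421356...
print(fibonacci(7))        # [1, 1, 2, 3, 5, 8, 13]
growth = [100 * 1.02**n for n in range(4)]   # B0 = 100 growing 2% per day
print(growth)
```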

S: Hey! That's twice you've used the word "converges". What does ... ?

P: Okay, here's what we mean:

If $\lim_{n\to\infty} S_n = L$ we say that the sequence $\{S_n\}$ "converges to L".

We might say "this series converges" and "that one diverges" or maybe "here's a convergent series" and "there's a divergent series".

S: But that's just like saying that the limit exists, isn't it?

P: Yes, but it's more descriptive don't you think? If we're adding up the terms of an infinite series and we get the partial sums S1 then S2 then S3 and so on, it's nice to think of them as "converging" to some limiting value. It's a nice terminology, don't you think?

S: No ... it's just more for me to remember. Another thing, do I know that $a_n = \left(1 + \frac{1}{n}\right)^n$ converges to e? I mean, did you ever say that before? I mean ...

P: Let's do it. Sometimes it's difficult to establish the convergence of a sequence. For example, how would one prove that the sequence of Newton iterates, $x_n$, actually converges to a root of f(x) = 0? Not easy! But this one is easy. We just take the limit of $\left(1 + \frac{1}{n}\right)^n$ as $n \to \infty$. First write $y = \left(1 + \frac{1}{x}\right)^x$ so $\ln y = x \ln\left(1 + \frac{1}{x}\right)$ and note that this has the form

(∞) ln (1) = (∞)(0) as $x \to \infty$, so we can't use l'Hopital's rule directly, so we rewrite it as $\frac{\ln\left(1 + \frac{1}{x}\right)}{\frac{1}{x}}$ so it now has the requisite form $\frac{0}{0}$, so now we can apply l'Hopital, differentiating both numerator and denominator so the new ratio is $\frac{\frac{1}{1 + 1/x}\left(-\frac{1}{x^2}\right)}{-\frac{1}{x^2}}$ which is just $\frac{1}{1 + \frac{1}{x}}$ and now we let $x \to \infty$ and get "1", see?

S: But you said the limit was "e"!

P: Oh, I forgot ... since $\ln y = x \ln\left(1 + \frac{1}{x}\right)$ we've actually calculated the limit of ln y, so y itself has a limit of $e^1 = e$. Nice, eh?

S: Shouldn't you say that "y converges to e"?

P: Good idea, let's do that.

S: Another thing. I notice that you changed from "n" to "x". I mean, you actually calculated the limit of $\left(1 + \frac{1}{x}\right)^x$ rather than the limit of $\left(1 + \frac{1}{n}\right)^n$. Was that necessary?

P: Well, "n" is an integer and I didn't like the idea of differentiating with respect to an integer variable. After all, the definition of the derivative (remember?) is the limit of $\frac{f(n + \Delta n) - f(n)}{\Delta n}$ as $\Delta n \to 0$ and it's hard to imagine $\Delta n$ as a tiny increment in a variable which can only have integer values. Can an integer change by, say, .001 and still be an integer? Hardly. So I changed to x which is a nice name for a continuous variable and then I ...

S: Come on, you just changed the problem, didn't you? You wanted the limit of one thing so you changed it and found the limit of something else. That's cheating, isn't it?

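P: Pay attention. I'll plot a point $\left(n, \left(1 + \frac{1}{n}\right)^n\right)$ for various values of the integer n, then I'll plot $y = \left(1 + \frac{1}{x}\right)^x$ and you'll see that the curve passes through each point ... what else? To find the limit of $\left(1 + \frac{1}{n}\right)^n$ as $n \to \infty$, we just have to find the limit of $\left(1 + \frac{1}{x}\right)^x$ as $x \to \infty$ ... and that's just what we did.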

S: But you could have kept the "n" ... you didn't have to change it to "x", did you? I mean, you'd get the same answer, right?

P: It just makes me feel better to call the variable "x". I really get nervous differentiating with respect to a variable which ...

S: Yeah, I know, with respect to an integer. Okay, I wouldn't want you to get nervous. Can we go on?

Plot of the sequence $\left(1 + \frac{1}{n}\right)^n$
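The plot of this sequence can be reproduced numerically. The short sketch below (the function name `a` is just for illustration) tabulates (1 + 1/n)^n for a few integers n and shows the gap to e shrinking.

```python
import math

def a(n):
    """nth term of the sequence (1 + 1/n)^n."""
    return (1 + 1 / n) ** n

for n in (1, 10, 100, 10_000, 1_000_000):
    print(n, a(n), math.e - a(n))   # the gap to e shrinks as n grows
```

The same function accepts non-integer arguments, which is the point of switching from n to x: the curve passes through every point of the sequence.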

SERIES

Should we run into an infinite series, $a_1 + a_2 + a_3 + \cdots$, and it should happen to be a geometric series, then we'd just check to see if the common ratio r has an absolute value less than 1 (i.e. | r | < 1) and if so, we can "sum the series" using $\frac{a_1}{1-r}$. That is, the infinite series converges to $\frac{a_1}{1-r}$. Mother Nature is rarely so accommodating. You're more likely to run into a series for which you do NOT have a formula for the sum ... so you'd just have to start adding terms and pray that the partial sums have some limiting value.

**Example:** Calculate the sum of the series $1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \cdots$

**Solution:** We'll ask the computer to do the calculations. We'll start off by asking for 6 digit accuracy, then we'll define the terms: $t(k) = \frac{1}{k}$, then we'll find the **sum** for k = 1 to 10 (i.e. 10 terms), then the **sum** for k = 1 to 100 (100 terms) and so on, and each time we'll ask the computer to evaluate the **sum** as a decimal (or floating point number) using the **evalf** command ... then we'll watch for the partial **sum**s approaching some limiting value.

Here we go:

• Digits:=6;

Digits := 6

• t(k):=1/k;

t(k) := 1/k

• evalf(sum(t(k),k=1..10));

2.92897

• evalf(sum(t(k),k=1..100));

5.18738

• evalf(sum(t(k),k=1..1000));

7.48548

• evalf(sum(t(k),k=1..10000));

9.78761

• evalf(sum(t(k),k=1..100000));

12.0901

• evalf(sum(t(k),k=1..1000000));

14.3927

• evalf(sum(t(k),k=1..10000000));

16.6953

• evalf(sum(t(k),k=1..100000000));

18.9979

• evalf(sum(t(k),k=1..1000000000));

21.3005

• evalf(sum(t(k),k=1..10000000000));

23.6031

• evalf(sum(t(k),k=1..100000000000));

25.9056

• evalf(sum(t(k),k=1..100000000000000));

32.8134

S: Hold on! You've just added 100,000,000,000,000 terms and you're getting nowhere. How long do we have to wait?

P: Who knows ... but let's keep going. Maybe this series converges very slowly. Maybe it takes jillions of terms before we see ...

S: Or maybe we won't live that long. Can't you just tell me the answer? I mean, maybe it doesn't even have a sum ... ever think of that? Maybe it just keeps getting bigger and bigger and ...

P: Aah, you've been taking your smart pills again. What we need is some way to tell if it actually has a sum. If so, we'd just keep going. If not, we wouldn't even bother starting. Do you recognize this series?

S: No. Should I?

P: We've run into it before. It's called the HARMONIC SERIES. In fact, I think I said ... in fact I proved, that

$$\lim_{n\to\infty} \frac{1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}}{\ln n} = 1$$

by considering the area under the curve $y = \frac{1}{x}$ from x = 1 to x = n. Remember? And do you know what that means? I'll tell you. It means that the values of $S_n = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{n}$ are very much like ln n, when n is large. In fact, the ratio $\frac{S_n}{\ln n}$ is very nearly "1". In fact ...

S: So let's see the computer calculate ln n ... just to check it out.

P: Good idea. We'll just evalf the logs of n for n = 10 then 100 then 1000 and so on.

S: How does the computer know we're talking about natural logs?

P: Because the computer has taken this course and knows that ...

S: Let's get goin'.

P: Okay, but watch $\log_e(n)$ and compare with $S_n$ computed above and convince yourself that they get closer:

• evalf(log(10));

2.30259

• evalf(log(100));

4.60517

• evalf(log(1000));

6.90776

• evalf(log(10000));

9.21034

• evalf(log(100000));

11.5129

• evalf(log(1000000));

13.8155

• evalf(log(10000000));

16.1181

• evalf(log(100000000));

18.4207

• evalf(log(1000000000));

20.7233

• evalf(log(10000000000));

23.0259

• evalf(log(100000000000));

25.3284

• evalf(log(100000000000000));

32.2362
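The two tables above can be put side by side with a short script. This sketch adds the terms one at a time (so it stops at a million terms rather than the much larger counts used in the session above) and prints the partial sum, ln n, their ratio, and their difference. The function name is illustrative.

```python
import math

def harmonic(n):
    """Partial sum S_n = 1 + 1/2 + ... + 1/n, added term by term."""
    return math.fsum(1 / k for k in range(1, n + 1))

for n in (10, 100, 1000, 10_000, 100_000, 1_000_000):
    s, log_n = harmonic(n), math.log(n)
    print(n, round(s, 5), round(log_n, 5), round(s / log_n, 5), round(s - log_n, 5))
```

The ratio column drifts slowly toward 1 while the difference column settles near 0.577, so the partial sums grow without bound at the same leisurely pace as ln n.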

LECTURE 5

CONVERGENCE of SERIES

**A Test for Convergence of an Infinite Series** **:**

Suppose we are given an infinite series of positive terms: $a_1 + a_2 + a_3 + \cdots + a_n + \cdots$ (where we call the nth term $a_n$) and suppose we test it, to see if it's a geometric series. We'd take the ratio $\frac{a_2}{a_1}$ and $\frac{a_3}{a_2}$ and so on, to see if they were all the same (since that'd make it a geometric series). Suppose they were NOT the same (so it's NOT a geometric series and we're unlikely to have any formula for the partial sums!) BUT we notice that each ratio is less than the number $\frac{1}{2}$. That is, $\frac{a_2}{a_1} < \frac{1}{2}$ and $\frac{a_3}{a_2} < \frac{1}{2}$ and, in general, $\frac{a_{n+1}}{a_n} < \frac{1}{2}$ for n = 1, 2, 3, and so on. That means $a_2 < \frac{1}{2} a_1$. Further, $a_3 < \frac{1}{2} a_2$ and that means $a_3 < \left(\frac{1}{2}\right)^2 a_1$. Also, $a_4 < \frac{1}{2} a_3$ means that $a_4 < \left(\frac{1}{2}\right)^3 a_1$ and, in general, $a_{n+1} < \frac{1}{2} a_n$ means that $a_{n+1} < \left(\frac{1}{2}\right)^n a_1$. Hence, the given series, namely $a_1 + a_2 + a_3 + \cdots$ is less, term-for-term, than the series $a_1 + \frac{1}{2} a_1 + \left(\frac{1}{2}\right)^2 a_1 + \left(\frac{1}{2}\right)^3 a_1 + \cdots$ which (surprise!) is a geometric series with first term $a_1$ and common ratio $\frac{1}{2}$. Hence, the partial sums of the given series, $a_1 + a_2 + a_3 + \cdots$ can't possibly become infinite (as they did for the HARMONIC SERIES), because they are always less than the partial sums for the geometric series $a_1 + \frac{1}{2} a_1 + \left(\frac{1}{2}\right)^2 a_1 + \left(\frac{1}{2}\right)^3 a_1 + \cdots$ and the partial sums of this series are less than the sum of the _infinite_ geometric series which is given by: $\frac{a_1}{1 - r} = \frac{a_1}{1 - \frac{1}{2}} = 2a_1$. Hence the given series would converge to some limit.

S: Hold on! Are you saying that just because the partial sums for $a_1 + a_2 + a_3 + \cdots$ can't get bigger than $2a_1$, then they automatically have a limit? I mean, that sounds like hand-waving to me. I mean ...

P: Yes, that's what I'm saying. Let's consider the partial sums of some series like $a_1 + a_2 + a_3 + \cdots$, so $S_1 = a_1$ and $S_2 = a_1 + a_2$ and $S_3 = a_1 + a_2 + a_3$ and so on. We'll plot a graph of $S_n$ versus n and note two things about the graph: first, the sums are increasing, because we keep adding positive terms ...

S: Who said they were positive?

P: Oh, did I forget to mention it? For now we'll only consider series where every term is positive.

S: Now he tells me.

P: Okay, first we notice that Sn is an increasing function of n and we also notice that Sn is never larger than 2 a1. The graph might look like this ===>>>

In the graph I've assumed that the partial sums, Sn, are never larger than some number B. I don't want to use 2a1 because you'll think that it always turns out to be twice the first term!

S: Not me!

P: Okay, here's the theorem: if a sequence is increasing but never gets larger than some number (call it B, for "upper Bound") then the sequence converges. That's it! Nice theorem, eh?

S: Are you going to prove it?

P: No, but you must admit ... just by looking at the graph ... that it seems a reasonable theorem. After all, if the graph keeps increasing but can't get larger than B then it must level out and approach some limiting value. See? The theorem just validates common sense thinking. Anyway, we're onto a test for convergence of an infinite series so we shouldn't get sidetracked. But first I'll make a fuss about this new theorem:

If the sequence $\{S_n\}$ is increasing, but has an upper bound (that is, $S_n \le B$ for some number B), then $\lim_{n\to\infty} S_n$ exists

... that is, the sequence converges to a limit.

Note that we use the notation {Sn} to denote the sequence of numbers: S1, S2, S3, ...

It's better than referring to "the sequence Sn" because Sn really refers to the nth member of the sequence.

There's another analogous theorem:

If the sequence $\{S_n\}$ is decreasing, but has a lower bound (that is, $S_n \ge C$ for some number C), then $\lim_{n\to\infty} S_n$ exists

... that is, the sequence converges to a limit.

**Example:** Show that the sequence converges as $n \to \infty$.

**Solution:** We want to show: (1) the sequence is increasing, and (2) it's bounded above. Let an = and consider = = = > 1 so an+1 > an hence the sequence is increasing. However, an = < = 1 so the sequence is bounded above by the number "1". We conclude that the sequence converges.

S: That's the stupidest thing I ever heard of. I mean, I'd just take the limit: = 1. That'd mean it converges, right?

P: Sure, but I wanted to demonstrate this new theorem. But let me do one where you can't just find the limit (else, as you say, you certainly wouldn't use these theorems ... you'd just take the limit).

**Example:** Show that the sequence $\left\{\frac{10^n}{n!}\right\}$ converges.

**Solution:** Let $a_n = \frac{10^n}{n!}$ and consider $\frac{a_{n+1}}{a_n} = \frac{10^{n+1}/(n+1)!}{10^n/n!} = \frac{10}{n+1} < \frac{10}{11}$ provided n > 10. Hence, past the 10th term, the sequence is decreasing. To use the second theorem (above) we need only find a lower bound. But every term is positive so $a_n > 0$. Hence the sequence $\{a_n\}$ is decreasing _and_ bounded below by 0 ... hence it converges.

We should get back to what we were saying earlier: if the ratio of successive terms of a series is less than those of a convergent geometric series (where the common ratio satisfies | r | < 1), then the infinite series will converge (and, in fact, it will converge to a number less than the sum of the infinite geometric series!).

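**Example:** Show that the series $\sum_{n=1}^{\infty} \frac{10^n}{n!}$ converges.

**Solution:** If we call the terms $a_1$, $a_2$, etc., then we have $a_1 = \frac{10}{1!} = 10$, $a_2 = \frac{10^2}{2!} = 50$, $a_3 = \frac{10^3}{3!} = \frac{500}{3}$ and so on. Consider the ratio $\frac{a_{n+1}}{a_n} = \frac{10^{n+1}/(n+1)!}{10^n/n!} = \frac{10}{n+1} < \frac{10}{11}$ provided n > 10. Hence, after the 10th term of the series, the terms are less than the terms of a geometric series with common ratio $\frac{10}{11}$ and since this ratio is less than "1", the geometric series converges ... hence the given series converges. We'll make a fuss about this test:

THE RATIO TEST

The series $\sum_{n=1}^{\infty} a_n = a_1 + a_2 + a_3 + \cdots$ (of positive terms) will

converge if $\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = L < 1$, and will

diverge if $\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = L > 1$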

Note that the first part is reasonable: if the ratio has a limit less than 1 then the partial sums are less than those of a convergent geometric series, hence they are bounded above, hence these partial sums do have a limit and that's exactly what we mean by "converges". The second part is trickier: if the ratio has a limit greater than 1 then we _could_ conclude that the partial sums are greater than the partial sums of a certain geometric series whose partial sums actually become infinite ... so the series we're considering has partial sums which must also become infinite ... so the series diverges. Although we _could_ argue in this manner, it's easier to regard the second part of the ratio test as saying that the terms are getting LARGER hence cannot have a limit of ZERO hence the series fails the nth term test, hence diverges.
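To watch the ratio test at work on the example above, here is a small numerical sketch. It assumes the terms of that series are 10^n/n! (consistent with the first two terms being 10 and 50); the helper name is illustrative. The comparison value e^10 - 1 is a standard closed form for this sum and is not derived in the lecture.

```python
import math

def term(n):
    """nth term of the series, assumed here to be 10^n / n!."""
    return 10**n / math.factorial(n)

# Ratios of successive terms: large at first, below 1 once n reaches 10.
ratios = [term(n + 1) / term(n) for n in range(1, 40)]
print(ratios[0], ratios[9], ratios[-1])

# Partial sums keep increasing but level off, as the ratio test predicts.
partial = {n: sum(term(k) for k in range(1, n + 1)) for n in (10, 20, 40, 80)}
print(partial)
print(math.exp(10) - 1)   # known value of the full sum, for comparison
```

The early terms grow (the ratio starts at 5), yet the series still converges: only the behaviour of the ratio for large n matters.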

S: Wait a minute! You were doing this example, and after the 10th term you're okay, but what about the first 10 terms? You did that twice ... once with a sequence and once with a series.

P: It doesn't matter what happens for the first 10 or 10,000 terms. For the sequence, I want to show that the sequence has a limit so I just consider the terms beyond the 10th. These DO have a limit as $n \to \infty$ (using the theorem), so preceding these by a few terms doesn't change that fact. After all, we're interested in $n \to \infty$ so what does it matter when n = 1 or 2 or 3 or even n = 10,000? For the series, I just have to show that the partial sums are bounded above. They surely increase because each term is positive, so I'm left with finding an upper bound. That I do by saying

$a_1 + a_2 + a_3 + \cdots < a_1 + a_2 + a_3 + \cdots + a_{10} + \frac{10}{11} a_{10} + \left(\frac{10}{11}\right)^2 a_{10} + \left(\frac{10}{11}\right)^3 a_{10} + \left(\frac{10}{11}\right)^4 a_{10} + \cdots$

where I've shown that every term past the 10th decreases by at least a factor of $\frac{10}{11}$ so the series on the right (which is my upper bound) is no greater than $a_1 + a_2 + a_3 + \cdots + a_9 + \frac{a_{10}}{1 - \frac{10}{11}} = a_1 + a_2 + a_3 + \cdots + a_9 + 11 a_{10}$. See? I sum all the terms of the geometric series and get an upper bound for my series. Nice, eh? And it didn't matter whether I started at the 10th term or the 10,000th term. See? Not only that, I can now say that $\sum_{n=1}^{\infty} \frac{10^n}{n!}$ converges to a number less than

$a_1 + a_2 + a_3 + \cdots + a_9 + 11 a_{10}$ which is $\frac{10}{1!} + \frac{10^2}{2!} + \frac{10^3}{3!} + \cdots + \frac{10^9}{9!} + 11\,\frac{10^{10}}{10!}$.

S: And how big is that?

P: I leave it as an exercise for ..

S: For the student ... I know, I know. Anyway, I find this very confusing. Sequences, series ... they all look the same to me. In fact I can hardly tell the difference between the last two examples. One has $\frac{10^n}{n!}$ and the other does too. That's confusing. I mean ...

P: A sequence is a bunch of numbers separated by COMMAS: $a_1, a_2, a_3$, and so on. A series is a bunch of numbers separated by PLUS signs: $a_1 + a_2 + a_3 + \cdots$ See? You're adding them, not just inspecting them! If the sequence $\{a_n\}$ approaches a limit of, say, 47 (meaning that $\lim_{n\to\infty} a_n = 47$) then we'd say that the sequence CONVERGES. However, the series $a_1 + a_2 + a_3 + \cdots$ would NOT converge if $\lim_{n\to\infty} a_n = 47$. In fact, after adding a few million terms each would look very much like the number 47 so your series would look like ... + 47 + 47 + 47 + ... which certainly doesn't converge -- in fact it becomes infinite! See?

S: But if the terms had a limit, any limit like 0.1, your series would still look like ... + 0.1 + 0.1 + 0.1 + ... after a while. Right? And then it would become infinite. Right? Then it would diverge, right?

P: Very good! And that's just what happens unless ... unless what?

S: Huh?

P: If $\lim_{n\to\infty} a_n = L$ then the series $a_1 + a_2 + a_3 + \cdots$ would look like ... + L + L + L ... after a while, so what value must L have in order to avoid getting infinity for the sum?

S: I haven't the foggiest.

P: How about L = 0? Don't you see? Unless $\lim_{n\to\infty} a_n = L = 0$ the series couldn't possibly add to anything but infinity.

S: So if you give me a series and $\lim_{n\to\infty} a_n$ isn't zero, then the series won't converge, right?

P: Very good! Keep eatin' those smart pills. And we should make a fuss about that because it's probably the easiest test to apply:

the nth term test

The series $\sum_{n=1}^{\infty} a_n = a_1 + a_2 + a_3 + \cdots$ will diverge if $\lim_{n\to\infty} a_n \neq 0$

**Example:** Test the series for convergence.

**Solution:** We check to see if $\lim_{n\to\infty} a_n = 0$; if not, the series DIVERGES (meaning it doesn't converge). We can use l'Hopital's rule since $a_n$ has the form $\frac{\infty}{\infty}$ as $n \to \infty$. Differentiating both numerator and denominator with respect to n we get a ratio which still has the form $\frac{\infty}{\infty}$ so we continue and differentiate again,

and one more time gives a ratio which has a limit of ∞ as $n \to \infty$. Since $\lim_{n\to\infty} a_n = \infty$ (rather than the required $\lim_{n\to\infty} a_n = 0$), the series diverges.

S: I thought you didn't like to differentiate with respect to an integer. I thought you get nervous when ...

P: I changed my mind. I've decided that I'd just think of "n" as being a continuous variable and do the differentiating without changing its name to "x". Good, eh?

S: Not very. Anyway, I have a question you'll have a problem with. You already said that the Harmonic Series $1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots$ adds up to infinity and that means it diverges, right? Yet the terms are $a_n = \frac{1}{n}$ so $\lim_{n\to\infty} a_n = 0$ so the series should converge. How do you like them bananas?

P: I never said that $\lim_{n\to\infty} a_n = 0$ will make a series converge. When did I say that? Pay attention. What I said was

"the series will diverge if $\lim_{n\to\infty} a_n \neq 0$". That's what I said.

S: Aren't they the same? I mean ...

P: If an animal doesn't have four legs, it's NOT a horse. Got it? Now suppose it does have four legs. Is it a horse? Is that a test for a horse? Maybe it's a cow or ...

S: Huh?

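P: If $\lim_{n\to\infty} a_n \neq 0$ then the series is NOT convergent. That's the theorem. If $\lim_{n\to\infty} a_n = 0$ that's not a test for anything. Maybe the series converges (like $1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots = \sum_{n=0}^{\infty} \frac{1}{2^n}$, a geometric series which converges to "2") or maybe it diverges (like the harmonic series $1 + \frac{1}{2} + \frac{1}{3} + \cdots = \sum_{n=1}^{\infty} \frac{1}{n}$ which diverges to infinity).

Remember, if $\lim_{n\to\infty} a_n = 0$, anything can happen.

**Example:** Does the series $\sum_{n=1}^{\infty} \frac{n^2\, 2^n}{n!} = \frac{1^2\, 2^1}{1!} + \frac{2^2\, 2^2}{2!} + \frac{3^2\, 2^3}{3!} + \cdots$ converge or diverge?

**Solution:** We could check to see if $\lim_{n\to\infty} a_n = 0$ with $a_n = \frac{n^2\, 2^n}{n!}$, but we couldn't use l'Hopital's rule because we have no way of differentiating the denominator, n! = (1)(2)(3)...(n).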

S: Hah! So there's a case where you can't consider "n" to be like an "x" ... a continuous variable as you call it. I mean, what's (x!) if x isn't an integer. I think you're stuck there!

P: Pay attention. I'm going to use the RATIO TEST.

We look to see if the series is less than a convergent geometric series (with common ratio less than "1"), so we consider the ratio $\frac{a_{n+1}}{a_n} = \frac{(n+1)^2\, 2^{n+1}/(n+1)!}{n^2\, 2^n/n!} = 2\,\frac{n+1}{n^2}$ which has a limiting value of 0 (i.e. $\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = 0$) which is certainly less than "1" so the series does indeed converge to some limit.

S: And what's that limit?

P: I don't know, but I can now start to add the terms and be guaranteed that they'll add up to something.

S: Are you saying that you're just interested in proving that it adds up to something? Somebody gives you the series and you say "yes, it adds up to something" and you're finished with the problem? Is that what mathematicians do? I mean ...

P: Okay, you've got a point. The ratio of terms is = 2≤ 2= when n ≥ 10 so, from the 10th term onward, the series is less than a10 \+ a10 + 2a10 + 3a10 \+ ... = = a10 (using , the sum of an infinite geometric series) so our series is less than:

a1 \+ a2 \+ a3 \+ ... + a9 \+ a10 or .

S: And what's that?

P: It's ... uh, the computer says it adds up to about 44.344, to 3 decimal places.

S: How close is that to the right answer?

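P: Well, let's see ... the sum of the series is less than 44.344 and greater than the sum of just the first 9 terms which adds up to ... uh, the computer says $\frac{1^2\, 2^1}{1!} + \frac{2^2\, 2^2}{2!} + \frac{3^2\, 2^3}{3!} + \cdots + \frac{9^2\, 2^9}{9!} = 44.298$ (to 3 decimal places). I conclude that

$44.298 < \sum_{n=1}^{\infty} \frac{n^2\, 2^n}{n!} < 44.344$ and that's a pretty good estimate, eh?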
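The lower bound quoted here is easy to reproduce. The sketch below assumes the terms of the series are n²·2ⁿ/n! (the form used when logs are taken later in this lecture) and simply adds them up; the helper name is illustrative. The value 6e² printed for comparison comes from the standard identity Σ n²xⁿ/n! = x(x + 1)eˣ, which is not part of the lecture.

```python
import math

def term(n):
    """nth term n^2 * 2^n / n! of the series in this example."""
    return n**2 * 2**n / math.factorial(n)

first_nine = sum(term(n) for n in range(1, 10))
many_terms = sum(term(n) for n in range(1, 60))   # later terms are negligible
print(round(first_nine, 3))    # 44.298
print(round(many_terms, 3))    # 44.334
print(6 * math.e**2)           # closed form for comparison
```

The numerically summed value sits between the two bounds 44.298 and 44.344, as the argument says it must.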

S: Yeah, pretty good, but suppose I wanted the answer to more decimal places ... maybe 6 decimal places. What then?

P: See what we did? We said the sum was greater than the sum of the first 9 terms and less than this sum PLUS the sum of a certain geometric series. If we wanted greater accuracy we'd just keep going past 9 terms to maybe 20 or 30 or 100 terms ... until the geometric series changed the sum by less than, say $10^{-6}$.

S: Do it.

P: You do it!

S: Uh ... well ... I'd look at that ratio again: $\frac{a_{n+1}}{a_n} = 2\,\frac{n+1}{n^2}$ and if I choose a really big "n" then it'd be pretty small, less than some number "r" and ... let's see ... how small do I want r to be? That extra geometric series would look like $a_n + r a_n + r^2 a_n + \cdots$ which adds up to $\frac{a_n}{1-r}$ and I'd want that to be less than $10^{-6}$ ... I guess. How'm I doing boss?

P: Keep going, you're doing fine.

S: Okay, I'd want ... uh, I'd want ... I think this is too tough, don't you? I should try a bunch of n's and see if I'm there yet. I mean, less than $10^{-6}$.

P: That's a good idea. The ratio $\frac{a_{n+1}}{a_n} = 2\,\frac{n+1}{n^2}$ approaches zero as $n \to \infty$, so we can make it less than a really really small "r" so when you want $\frac{a_n}{1-r}$ to be less than $10^{-6}$ it's something like asking $a_n$ itself to be less than $10^{-6}$ because, after all, $\frac{a_n}{1-r}$ is pretty close to $a_n$ if r is very very small. Okay, how big must n be so that $a_n = \frac{n^2\, 2^n}{n!} < 10^{-6}$ or maybe we should ask $\frac{n^2\, 2^n}{n!} < 10^{-7}$ just to be on the safe side. How big?

S: I haven't the fogg..

P: Let's take logs. We'd be asking that $\log\frac{n^2\, 2^n}{n!} < \log(10^{-7}) = -7 \log(10)$ and this time we'll actually pick logs to the base 10. Nice, eh? Then we'd want $\log(n^2) + \log(2^n) - \log(n!) < -7$ (since $\log_{10}(10) = 1$). We write $\log(n!) = \log(1 \cdot 2 \cdot 3 \cdots n) = \log(1) + \log(2) + \cdots + \log(n)$ and that'd mean we need $2 \log(n) + n \log(2) - (\log(2) + \cdots + \log(n)) < -7$ or $\log(2) + \log(3) + \cdots + \log(n) > 7 + n \log(2) + 2 \log(n)$. Now we look at a table of common logs (to the base 10) and keep adding the logs of the integers until their sum exceeds $7 + n \log(2) + 2 \log(n)$. See? That'd give you a value of "n" and the sum of the series would lie between the sum of the first n terms and this sum PLUS that extra geometric series.

S: I'm sorry I asked ... let's forget the whole thing. I know that won't be on the final exam.

P: Well, at least you've learned something. You've learned that you can get estimates of the sum of an infinite series (provided it converges) and you've even got a couple of tests so you can determine if it does converge and ...

S: Yeah, I've learned something ... but I don't know what good it'll do me, except to pass a final exam. I mean, does this stuff have any useful applications ... outside of mathematics?

P: Patience.
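The table-of-logs search described in this lecture is tedious by hand but short by machine. The sketch below assumes the terms are n²·2ⁿ/n! and looks for the first n at which log(2) + log(3) + ... + log(n) exceeds 7 + n·log(2) + 2·log(n), with logs to base 10. The function name and its parameter are made up for illustration.

```python
import math

def smallest_n(target_exponent=7):
    """First n with log10(2)+...+log10(n) > target + n*log10(2) + 2*log10(n)."""
    log_factorial = 0.0                  # running value of log10(n!)
    n = 1
    while True:
        n += 1
        log_factorial += math.log10(n)
        if log_factorial > target_exponent + n * math.log10(2) + 2 * math.log10(n):
            return n

n = smallest_n()
a_n = n**2 * 2**n / math.factorial(n)
print(n, a_n)     # a_n is below 10^-7 at this n
```

Adding the first n terms and then allowing for the small geometric tail would pin the sum down to roughly six decimal places, which is the accuracy the student asked for.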

LECTURE 6

ALTERNATING SERIES and ABSOLUTE CONVERGENCE

ALTERNATING SERIES

We saw that the HARMONIC SERIES $1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots$ diverges to infinity. We proved this by comparing the partial sums $S_1 = 1$, $S_2 = 1 + \frac{1}{2}$, $S_3 = 1 + \frac{1}{2} + \frac{1}{3}$, etc. with the area under the graph of $y = \frac{1}{x}$ from x = 1 to x = n, namely $\int_1^n \frac{dx}{x} = \ln n \to \infty$ as $n \to \infty$ and using this we showed that $S_n$ increased without limit ... so the series diverged. Remember! In order for a series to converge, the partial sums must have a limiting value. (That's the definition of "convergence".) We could also have tried the nth term test: If $\lim_{n\to\infty} a_n \neq 0$, the series diverges. This, too, will guarantee that the partial sums will NOT have a limit. However, $\lim_{n\to\infty} a_n = \lim_{n\to\infty} \frac{1}{n} = 0$ so this test fails to give any information about the convergence or divergence of the harmonic series.

S: But we already know it diverges, so why are you still applying tests?

P: I want to demonstrate that the nth term test and the ratio test sometimes give no information at all about the series. I also want to remind you that the crucial point is that the partial sums must have a limiting value as $n \to \infty$, else the series diverges. Pay attention and I'll get to the point.

Had we tried the RATIO test we'd consider the ratio of successive terms: $\frac{a_{n+1}}{a_n} = \frac{1/(n+1)}{1/n} = \frac{n}{n+1}$ and we'd get

$\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = 1$ which is NOT less than 1, hence the partial sums $S_n$, although increasing (since the terms of the harmonic series are all positive), are NOT bounded above by a convergent geometric series, so the RATIO test also gives no information. **Remember this!**

and **remember this!**

Okay, I just wanted to make these points before we went on.

S: Just wait a minute! You goofed! For the harmonic series, the ratio $\frac{a_{n+1}}{a_n} = \frac{1/(n+1)}{1/n} = \frac{n}{n+1}$ is less than 1, so the RATIO test says the harmonic series converges, right?

P: Wrong! It is the limiting value of the ratio which must be less than 1 and the limiting value in this case is NOT less than 1. In fact, now that you've mentioned it, it's a good example which shows that even if the terms get smaller (meaning $\frac{a_{n+1}}{a_n} < 1$) the series might still diverge. The critical thing is that the terms must get smaller fast! For the harmonic series, they don't get small enough fast enough. After all, the 100th term is only 1% smaller than the 99th term and the 1000th term is only 0.1% smaller, so the terms decrease in size very slowly. On the other hand, for the geometric series $1 + \frac{1}{2} + \frac{1}{4} + \cdots$ the 100th term is 50% of the 99th term and the terms decrease very rapidly and after a while they're microscopic in size. Anyway, I wanted to consider a different kind of series where just getting smaller is (almost) enough to guarantee convergence.

S: Wait a minute. That "small enough fast enough" sounds familiar. Haven't we done that before?

P: Yes, you've got a good memory. When we talked about "improper integrals" of the form $\int_a^\infty f(x)\,dx$ we said that in order for the integral to "converge" (see? it's the same word!) the function f(x) has to get small enough fast enough. In fact, an improper integral like that is very much like an infinite series. Remember, $\int_a^\infty f(x)\,dx$ is really a sum of terms ... a Riemann SUM ... much like an infinite series.

In fact, if we use the notation f(1) + f(2) + f(3) + ... for our infinite series, rather than a1 + a2 + a3 + ... (where f(n) gives the value of the nth term of the series), then we can associate the partial sums $S_n = f(1) + f(2) + \dots + f(n) = \sum_{k=1}^{n} f(k)$ with an area, as shown in the figure.

S: Didn't we do something like that when we talked about ... uh, what was it?

P: The HARMONIC SERIES. Yes, we did, but now I want to talk about a related series where it's (almost) enough for the terms to get small. They don't have to get small fast. Pay attention.

the Alternating Harmonic Series

Consider the infinite ALTERNATING HARMONIC SERIES: $1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + - \dots$ where the terms alternate in sign. The partial sums are $S_1 = 1$, $S_2 = 1 - \frac{1}{2}$, $S_3 = 1 - \frac{1}{2} + \frac{1}{3}$, etc. Let's plot these partial sums to see if there is any chance that they approach a limit as $n \to \infty$.

We compute and plot S1, S2, S3, S4, etc. and note that the partial sums Sn oscillate, the amplitude of the oscillations becoming less and less. In fact, if we call the terms a1, a2, a3, etc. (meaning the _magnitude_ or _absolute value_ of the terms, so a2 = 1/2, not -1/2), then starting with S1 = a1, we subtract a2 (which is less than a1) then add a3 (which is less than a2) then subtract a4 (which is less than a3) and so on. In other words, when we go down (by subtracting) then we go up less than we went down. When we go up (by adding a term) we then go down less than we went up. This is characteristic of any alternating series if the terms get smaller in magnitude ... and for such a series the partial sums oscillate and often converge to some limiting value. In the case of the alternating harmonic series, the partial sums (hence the infinite series) do converge to a number which is roughly .7 (as indicated on the diagram).
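For readers without the diagram, the oscillation can be seen by computing the partial sums directly. This is a minimal sketch; the function name is invented for the illustration.

```python
def alternating_harmonic_partial_sums(count):
    """Return [S1, S2, ..., S_count] for the series 1 - 1/2 + 1/3 - 1/4 + ..."""
    sums = []
    total = 0.0
    for n in range(1, count + 1):
        total += (-1) ** (n + 1) / n  # odd-numbered terms are added, even ones subtracted
        sums.append(total)
    return sums


sums = alternating_harmonic_partial_sums(10)
for n, s in enumerate(sums, start=1):
    print(f"S{n} = {s:.4f}")
```

The odd-numbered partial sums step downward, the even-numbered ones step upward, and the two sequences close in on a common value near .7.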

S: They "often" converge? Is that what you said? "Often"? How often?

P: Patience.

Consider the series - + - + - + - ... which is certainly "alternating" (since the terms alternate in sign) and the terms do get smaller in magnitude (or absolute value) since > > > > and so on. Yet this series doesn't converge. To see that, let's look at the partial sums, graphed:

The partial sums don't have a limiting value. In fact, the terms of the series approach 1 in magnitude (even though they're getting smaller) so, after a while, the series begins to look like ... + 1 - 1 + 1 - 1 + - ... and just oscillates without converging to a limit ... hence the series diverges. Clearly we need something more than simply "an alternating series where the terms get smaller". We also need ...

**S:** I know! The terms ... uh, the terms ... uh ... I thought I had it.

**P:** The terms must decrease to zero! And that is the test for convergence of an alternating series:

the ALTERNATING SERIES TEST

The alternating series $\sum_{n=1}^{\infty} (-1)^{n+1} a_n = a_1 - a_2 + a_3 - a_4 + - \dots$

converges if the terms $a_n$ decrease to zero.

Note several things:

• In writing a1 - a2 + a3 - a4 + - ... we assume that the a's are positive and the alternating sign is shown explicitly in front of each term. (If a1 = 1 and a2 = -1/2 and a3 = 1/3 etc. the series would NOT be alternating, but would, in fact, be the harmonic series ... so when we write a1 - a2 + a3 - a4 + - ... we're assuming each of a1, a2, a3, etc. is positive.)

• The phrase "**decrease to zero**" means two things:

the terms **decrease** ($a_1 \ge a_2 \ge a_3 \ge \dots$) and they decrease to **zero** ($\lim_{n\to\infty} a_n = 0$).

• This test is perhaps the easiest test to apply ... so you have to hope that you never run across series other than "alternating" ones!
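Both halves of "decrease to zero" matter, and a quick computation shows why. The sketch below compares two alternating series: one with terms 1/n, and one with terms (n+1)/n. The second is only a hypothetical stand-in chosen for this illustration (its terms decrease, but toward 1 rather than 0); it is not necessarily the series discussed above. All function names are invented here.

```python
def alternating_partial_sums(term, count):
    """Partial sums of a1 - a2 + a3 - ... where term(n) gives the positive number a_n."""
    total = 0.0
    sums = []
    for n in range(1, count + 1):
        sign = 1 if n % 2 == 1 else -1
        total += sign * term(n)
        sums.append(total)
    return sums


def decreasing_to_zero(n):
    return 1.0 / n


def decreasing_to_one(n):
    # Hypothetical example: the terms shrink, but their limit is 1, not 0.
    return (n + 1) / n


for name, term in (("1/n", decreasing_to_zero), ("(n+1)/n", decreasing_to_one)):
    sums = alternating_partial_sums(term, 2000)
    swing = abs(sums[-1] - sums[-2])  # size of the most recent up or down step
    print(f"a_n = {name}: last swing = {swing:.4f}")
```

For 1/n the swings shrink toward zero, so the partial sums can settle down. For (n+1)/n the swings stay close to 1, so the partial sums keep bouncing and no limit exists.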

S: What if the test fails? I mean, your other tests failed ... sometimes: the nth term test and the RATIO test.

P: Well, if $\lim_{n\to\infty} a_n \ne 0$, you know the answer, don't you?

S: Huh?

P: That's the nth term test: if $\lim_{n\to\infty} a_n \ne 0$ then the infinite series definitely diverges.

S: You mean that test holds for alternating series too?

P: Yes, it holds for every infinite series, not just ones with positive terms or ones where the terms alternate in sign. For example, consider the series + - + + - + + - + + - ... where there are two positive terms then a negative term and so on. It's not alternating and the terms certainly aren't all positive, yet it diverges. Why?

S: I haven't the fogg ... uh, wait ... the terms don't have a zero limit. Right?

P: Right. The limiting value of the terms, namely |an | = , is 1 not 0. Hence we conclude that the series diverges.

S: Will I have to deal with series like that?

P: I'll only expect you to test series for which you actually have a test ... and that says it all.

S: Hmmph.

P: Let me tell you something else about alternating series ... ones in which the terms decrease to zero (so you know the series will converge). This is really nice so pay attention:

Estimating the Sum of a Convergent Alternating Series

If we look again at the graph of partial sums of the convergent alternating harmonic series we note some interesting things:

• The limiting value of the partial sums (hence the sum of the infinite series) is less than S1 and greater than S2

• In fact, the partial sums are alternately greater than, then less than, then greater than ... the sum of the infinite series. That means that we can stop adding & subtracting terms and the sum of the infinite series will automatically lie between the last two partial sums we computed.

We make a fuss about this:

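For $\sum_{n=1}^{\infty} (-1)^{n+1} a_n = a_1 - a_2 + a_3 - a_4 + - \dots$, if the terms $a_n$ decrease to zero, then

$a_1 - a_2 + a_3 - a_4 + - \dots - a_{2n} \;\le\; \sum_{k=1}^{\infty} (-1)^{k+1} a_k \;\le\; a_1 - a_2 + a_3 - a_4 + - \dots - a_{2n} + a_{2n+1}$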

In words:

The sum of an infinite alternating series (whose terms decrease to zero)

lies between successive partial sums

(Note that the alternating series must have terms which **decrease to zero**.)

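**Example:** $1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} \;\le\; \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} \;\le\; 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5}$ and

$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} \;\le\; \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} \;\le\; 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \frac{1}{7}$ and so on.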
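The bracketing can be watched in action for the alternating harmonic series. The sketch below relies on a standard fact not derived in these notes, namely that this series sums to ln 2; the function name is invented for the illustration.

```python
import math


def partial_sum(n):
    """S_n for the alternating harmonic series 1 - 1/2 + 1/3 - ... (n terms)."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))


exact = math.log(2)  # known sum of the alternating harmonic series (assumed, not proved here)
for n in (2, 3, 50):
    lower = partial_sum(2 * n)      # ends with a subtraction, so it sits below the sum
    upper = partial_sum(2 * n + 1)  # ends with an addition, so it sits above the sum
    print(f"n = {n}: {lower:.6f} <= {exact:.6f} <= {upper:.6f}")
```

The width of each bracket is just the last term added, $a_{2n+1}$, so the bracket tightens as more terms are used.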

S: I don't get it. Why do you need this decrease to zero stuff? Besides, you didn't answer my last question ... not all of it. I asked what if the test fails ... I mean the alternating series test. What if the terms don't get smaller? Oops, wait, I think I know. If they don't have a limit of zero then the series diverges ... so they have to have a limit of zero, in which case they must get smaller too, right?

P: Wrong! It's possible to have an alternating series where $a_n \to 0$ yet the terms aren't continually decreasing.

S: What! That's impossible ... isn't it? I mean, if they don't decrease, how do they get to zero?

P: You can recognize the plot of partial sums of an alternating series by the fact that they oscillate. You can also recognize that the series converges by the fact that the partial sums have a limiting value as $n \to \infty$. Now, consider the following plot: it's clearly that of a convergent alternating series, and the terms have a limit of zero (since the oscillations die out), YET the terms are NOT decreasing. In fact, I got the plot simply by repeating the Vv pattern, but decreasing the size somewhat each time. Note that the first dip gets us to S2 = a1 - a2 (i.e. subtracting a2 from a1) then we add a3, subtract a4, add a5 then comes a big subtraction, namely a6, to begin the Vv pattern again. Clearly a6 > a5 so the terms are NOT decreasing.

S: Whoa! I don't think such a series exists, do you? Besides, sometimes the terms get smaller and sometimes they don't, right?

P: Right, but the alternating series test is NOT satisfied because the test requires that the terms form a DECREASING SEQUENCE of numbers: a1 ≥ a2 ≥ a3 ≥ a4 ≥ etc. etc. and that's not the case here, yet the series converges. As for the series existing, it should be pretty easy to invent such a series. Let's try 1 - + which gives us the first V, then we continue with - + which gives us the second, smaller v, then we continue with all terms decreased by a factor and do the Vv bit again, then decrease by again, and so on. Our series would look like: 1 - + - + - + - + and so on. See? That's the series, it converges but the alternating series test doesn't apply because the terms don't continually decrease.

S: I think the alternating series test is a lousy test.

P: It's not so bad. Remember this! A series will converge or diverge depending upon what EVENTUALLY happens to the terms. If the first few thousand terms don't satisfy any test, but thereafter the terms DO satisfy some test for convergence, then the series will converge. Let's write this out big:

A series $\sum_{n=1}^{\infty} A_n = A_1 + A_2 + A_3 + \dots$ will converge (or diverge) if the terms

eventually satisfy some test for convergence (or divergence).

For example, if a series starts off not being alternating but becomes alternating after the first million terms, and if the terms thereafter decrease to zero, then the series will converge.

S: That seems pretty fishy to me. I mean ...

P: Okay, pick a convergent series.

S: Who? Me? Uh ... I pick 1 + + + + ...

P: Okay, now I'll place a million terms in front of your series, say 1 + 2 + 3 + .... + 1,000,000. Do you think I can make the resultant series diverge? Certainly not! My terms add up to something huge, but it makes no difference. Eventually we get to your series and they determine whether the infinite series converges. I've just added a constant to the sum of your series. I can't make it diverge. If you wanted to prove convergence, you'd just march past my terms and use the ratio test on your series. See?

S: I guess.

**Example:** Test the following series for convergence.

(a) (b) (c)

Solution:

(a) It's an alternating series so we apply the alternating series test:

First note that: an = =  = = 0. Further, we must show that the terms decrease, either by showing that = ≤ 1 or by showing that an \- an+1 = - ≥ 0. We'll do the latter. Bringing to a common denominator and simplifying gives an \- an+1 = > 0 for n ≥ 1. Hence the alternating series converges.

(b) We'll use the ratio test (in fact we ALWAYS use the ratio test if factorials are involved): $\frac{a_{k+1}}{a_k} = \frac{k!}{(k+1)!} = \frac{1}{k+1} \to 0$ as $k \to \infty$ and since the limiting value of the ratio is less than 1, the series converges. (In fact, it converges to the number e.)

(c) is an alternating series a1 \- a2 \+ a3 \- + ... with ai = . We'll first see if the terms decrease by considering , to see if it's less than 1. (Last time we considered ai \- ai+1 to see if it was positive). This ratio is = 3  3 as i∞ so the terms are eventually increasing! Since ai ≠ 0, the series diverges by the nth term test (which should, here, be called the ith term test!) or by the ratio test (since the limiting value of the ratio is greater than 1).

ABSOLUTE CONVERGENCE

We consider one final test for convergence, just in case the series doesn't have _only_ positive terms ... or maybe it's not even alternating. We consider the infinite series $A_1 + A_2 + A_3 + \dots = \sum_{n=1}^{\infty} A_n$ where the terms $A_n$ may be of any sign. We state without proof the following:

If $\sum_{n=1}^{\infty} |A_n|$ converges, then $\sum_{n=1}^{\infty} A_n$ converges.

In fact, replacing every term by its absolute value will generate a series whose partial sums are larger (or, at least just as large). The resultant series then has only positive terms (or at least non-negative) so we can apply the RATIO test. If the series of absolute values converges, so will the original series. In fact, the original series is said to be ABSOLUTELY CONVERGENT.

**Example:** Test the series for convergence.

**Solution:** We consider the series of absolute values: to which we apply the RATIO test:

= =  0 as n∞ and since this limiting value of the RATIO is less than 1, the series converges, hence the original series converges as well ... and, indeed, converges absolutely.

S: Whoa! What's all this about "absolutely convergent"? If it's convergent then it's absolutely convergent ... is that it? It's like saying "If I'm certain then I'm absolutely certain", right? "If it's a horse, then it's absolutely a horse". "If ...

P: No, no, no! Here, the word "absolutely" means we're considering a series where every term has been replaced by its "absolute value". Now we can study this modified series, since every term is now positive and we can use the RATIO test, and if we can show that this modified series converges, the original series will, too. In fact, the original series is said to "converge absolutely". Do you see that?

S: Not really.

P: Okay, let's do some examples.

**Example:** Test = - + - + ... for convergence and/or absolute convergence.

**Solution:** If every term is replaced by its absolute value we get + + + ... = which converges by the RATIO test (the ratio of successive terms is constant at < 1 ... or we can just recognize it as a geometric series with common ratio which is less than 1). Hence the original, alternating series not only converges, it converges absolutely.

**Example:** Test for convergence and note if it is absolutely convergent:

Solution:

For this series the terms satisfy an = = 0 hence satisfy the ALTERNATING SERIES TEST, so the series converges.

S: Oh yeah? You forgot to show that the terms are decreasing. You just showed that they have a limit of zero and you said that wasn't enough and you ...

P: Okay, okay. I could show that they decrease in many ways. First I could consider an+1 \- an and show that this was ≤ 0 (and that'd mean that an+1 ≤ an) or I could consider and show that this was ≤1 (and that'd also mean that an+1 ≤ an), but this time I'll do it differently. Pay attention. I'll consider the graph of y = . When x = 1 or 2 or 3 and so on, I'd get the terms in the series, . To show that these terms decrease in size I just have to show that the graph is decreasing, so I consider = = ≤ 0 when x ≥ 1 (as is the case if x = 1 or 2 or 3 etc.) Nice, eh?

S: But you haven't checked it for absolute convergence, and the problem said "note, which, if any ..."

P: That's because you interrupted, so let me continue.

The series of absolute values is and we could try the nth term test to see if ≠ 0. Unfortunately, this limit is zero, so the test fails to give us any information. We could also try the RATIO test to see if < 1, but, unfortunately, = = = (1)(1) = 1 so the RATIO test fails as well. In fact, we have NO test to apply to this series to see if it converges, hence we cannot determine whether the original series converges absolutely.

S: Are you telling me that you don't know whether it converges absolutely?

P: No, I'm telling you that none of the tests we've considered will work on the series .

S: So, you don't know, right?

P: As a matter of fact I do know. The series diverges and so the original series converges, but NOT absolutely.

S: I think you're guessing, right?

P: No, I'm not guessing ... the terms in , namely , look very much like = when n is very large and that means the series looks very much like the series which is the harmonic series and I know that it diverges, so the series diverges as well. In fact, I might as well add that to the collection of tests we've assembled because it's one of the most useful. But first I should explain what I mean by "the terms of this series look very much like the terms in that series".

Suppose we're given a series of positive terms: $a_1 + a_2 + a_3 + \dots = \sum a_n$ and, for large values of n, the numbers $a_n$ look like the numbers $b_n$ in the sense that $\lim_{n\to\infty} \frac{a_n}{b_n} = 1$ (their ratio approaches a limiting value of 1). Then if $\sum b_n$ converges, so will $\sum a_n$ and if $\sum b_n$ diverges, so will $\sum a_n$.

the COMPARISON TEST

If $\lim_{n\to\infty} \frac{a_n}{b_n} = 1$, then the series $\sum a_n$ and $\sum b_n$ (of positive terms) have the same convergence properties:

if one converges or diverges, so will the other.
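What "look like" means can be made concrete with a small computation. The series below, with terms (n+1)/(n² + 3), is a made-up example for this sketch (it does not come from the examples that follow), and the function names are likewise invented. Its terms are compared with those of the harmonic series.

```python
def a_term(n):
    # Hypothetical positive-term series chosen for illustration.
    return (n + 1) / (n * n + 3)


def b_term(n):
    # Harmonic series, which is known to diverge.
    return 1.0 / n


def ratio(n):
    """a_n / b_n, which should approach 1 if the two series 'look alike'."""
    return a_term(n) / b_term(n)


def partial_sum(term, count):
    return sum(term(n) for n in range(1, count + 1))


for n in (10, 1000, 100000):
    print(n, ratio(n), partial_sum(a_term, n), partial_sum(b_term, n))
```

The ratio heads to 1 and both columns of partial sums keep growing together, which is what the test predicts: since the harmonic series diverges, so does the look-alike.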

**Example:** Test the following for convergence:

(a) (b) (c)

Solutions:

(a) The terms an = "look like" bn = = in the sense that = = 1 and since the series = is the divergent harmonic series, the given series also diverges.

(b) The terms an = "look like" bn = since = 1 as n∞, and since is a convergent geometric series, the given series converges absolutely.

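(c) The terms $a_n = \sin\frac{1}{n}$ "look like" $b_n = \frac{1}{n}$ since $\lim_{n\to\infty} \frac{a_n}{b_n} = \lim_{n\to\infty} \frac{\sin(1/n)}{1/n} = \lim_{t\to 0} \frac{\sin t}{t} = 1$ (where we evaluate the limit by putting $t = \frac{1}{n}$), and since $\sum \frac{1}{n}$ is the divergent harmonic series, the given series diverges as well.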

**Example** : Test the following series for convergence and/or absolute convergence.

(a) (b)

Solution:

(a) The terms an = satisfy an = 0 (we'd have to show this ... it's not enough to simply state it!) and they decrease (we'd have to show this too!) so the series converges by the ALTERNATING SERIES TEST. However, even if we were to pause and prove these statements we'd _still_ have to consider the series of absolute values so we should really consider this first, because if it converges, so does the original series (and we'd avoid having to prove the original series satisfies the alternating series test!). Okay, we'll use the RATIO test and get = =  = 0 as n ∞, hence this series converges (since the limit of the ratio is less than 1). We conclude that the original alternating series also converges ... and it converges absolutely.

S: Whew! I find this very confusing. I mean, there are so many tests ... and I wouldn't know which to use, even if I could remember them all ... and ...

P: Okay, let's review them all. That's important because we're going to use all of them in the next lecture. In fact, it's the next topic that really makes series important. First, let's collect the tests:

REVIEW

For the infinite series a1 + a2 + a3 + a4 + ... we construct the sequence of PARTIAL SUMS

S1 = a1, S2 = a1 + a2, S3 = a1 + a2 + a3, etc. and

if $\lim_{n\to\infty} S_n = L$, then we say the infinite series "converges to L".

THE RATIO TEST

The series $\sum a_n = a_1 + a_2 + a_3 + \dots$ (of positive terms) will

converge if $\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = L < 1$, and will diverge if $\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = L > 1$

(If L = 1, the series may or may not converge)

the nth term test

The series $\sum a_n = a_1 + a_2 + a_3 + \dots$ will diverge if $\lim_{n\to\infty} a_n \ne 0$

(If $\lim_{n\to\infty} a_n = 0$, the series may or may not converge)

the ALTERNATING SERIES TEST

The alternating series $\sum_{n=1}^{\infty} (-1)^{n+1} a_n = a_1 - a_2 + a_3 - a_4 + - \dots$

converges if the terms $a_n$ **decrease to zero**.

The sum of an infinite alternating series (whose terms decrease to zero)

lies between successive partial sums

If $\sum |A_n|$ converges, then $\sum A_n$ converges (and is said to converge absolutely).

the COMPARISON TEST

If $\lim_{n\to\infty} \frac{a_n}{b_n} = 1$, then the series $\sum a_n$ and $\sum b_n$ (of positive terms) have the same convergence properties:

if one converges or diverges, so will the other.

Given an infinite series, you could use the following scheme:

S: Wheee! That's great ... uh ... is it? I mean, now that I look at it, it seems more confusing than ever. I mean, what if ... uh ... do I have to go in that order, or ...

P: It's just a suggestion, this scheme. Sometimes you recognize a geometric series right away and you needn't even look at the chart. Sometimes you have terms like which look like which gives the divergent harmonic series ... so again you wouldn't follow the chart. Sometimes you're given ∑ and it's obviously alternating with terms which decrease to zero, so you use the alternating series test directly and don't bother wandering through the chart to find this test. Sometimes ...

S: Yeah, I get the idea. You said all this would come in handy in the next lecture, so can we forge ahead?

LECTURE 7

TAYLOR POLYNOMIALS and TAYLOR SERIES

Recall the technique for finding a polynomial approximation to a given function:

Given, say, $y = f(x) = e^x$, we want a polynomial of degree 5 which approximates this function near x = 0. A polynomial of degree 5 has six constants: $P(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + a_5 x^5$ and we can impose six conditions on this quintic polynomial so as to get six equations in the six unknowns $a_0, a_1, a_2, \dots, a_5$. The conditions we impose are that the polynomial should have the same value as f(x), at x = 0, and the same first derivative and the same second derivative and so on ... ending with the same fifth derivative. (That's six conditions: count them!) This gives:

$P(0) = f(0)$, $P'(0) = f'(0)$, $P''(0) = f''(0)$, $P^{(3)}(0) = f^{(3)}(0)$, $P^{(4)}(0) = f^{(4)}(0)$, $P^{(5)}(0) = f^{(5)}(0)$

where we use the notation $P^{(3)}$ to mean the 3rd derivative, $P'''(x)$, and so on.

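These six equations become:

$a_0 = 1$, $a_1 = 1$, $2!\,a_2 = 1$, $3!\,a_3 = 1$, $4!\,a_4 = 1$, $5!\,a_5 = 1$

and that means we should choose the coefficients of our polynomial as $a_0 = 1$, $a_1 = 1$, $a_2 = \frac{1}{2!}$, $a_3 = \frac{1}{3!}$, $a_4 = \frac{1}{4!}$, $a_5 = \frac{1}{5!}$. The polynomial is then

$P(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!}$.

It's easy to check that this polynomial does indeed agree in value and derivatives (up to the 5th) with $f(x) = e^x$.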
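That check can be carried out mechanically. In the sketch below the quintic is stored as its list of coefficients 1/k! (the standard values for $e^x$ about x = 0); the helper names are invented for the illustration. Each derivative is evaluated at x = 0 and should equal 1, because every derivative of $e^x$ equals 1 there.

```python
import math

# Coefficients a0..a5 of the quintic, a_k = 1/k!
coeffs = [1 / math.factorial(k) for k in range(6)]


def evaluate(poly, x):
    """Evaluate a polynomial given as [a0, a1, a2, ...] using Horner's rule."""
    result = 0.0
    for c in reversed(poly):
        result = result * x + c
    return result


def derivative(poly):
    """Coefficient list of the derivative: a1 + 2*a2*x + 3*a3*x^2 + ..."""
    return [k * c for k, c in enumerate(poly)][1:]


values_at_zero = []
current = coeffs
for order in range(6):
    values_at_zero.append(evaluate(current, 0.0))  # P, P', P'', ... at x = 0
    current = derivative(current)

print(values_at_zero)
print(evaluate(coeffs, 0.5), math.exp(0.5))
```

The last line compares the polynomial with $e^x$ a little away from x = 0, where the agreement is still close.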

We plot both y = P(x) and $y = f(x) = e^x$ to see how good the approximation is. In fact, we'll call the 5th degree polynomial approximation $P_5(x)$ since it's clear we could find a polynomial of degree 6 or 7 etc. and we'd expect these to be even better approximations.

The various polynomials which approximate a given function f(x) near, say, x = a, are given by:

TAYLOR POLYNOMIALS

$P_1(x) = f(a) + f'(a)(x - a)$ ... which is just the tangent line at x = a

$P_2(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2$

$P_3(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3$

$P_4(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \frac{f^{(4)}(a)}{4!}(x - a)^4$

It's clear how they continue:

• $P_5(x)$ is a polynomial of degree 5 and ends with a term in $(x - a)^5$

• The term which has $(x - a)^k$ also has the kth derivative of f, evaluated at x = a

• This term which has $(x - a)^k$ also has a factor $\frac{1}{k!}$.

• THE THINGS WHICH CHANGE FROM ONE FUNCTION TO ANOTHER ARE THE DERIVATIVES!!!!

• Every Taylor polynomial has the form: **( )** + **( )** $(x-a)$ + **( )** $\frac{(x-a)^2}{2!}$ + **( )** $\frac{(x-a)^3}{3!}$ + **( )** $\frac{(x-a)^4}{4!}$ + ... where you just compute the derivatives of the function f(x), at x = a, and stick them into the appropriate **( )**.

**Example:** Calculate the Taylor polynomials, about x = 0, for f(x) = sin x.

**Solution:** We put a = 0 into the general formula (so the powers of (x-a) just become powers of x) and get the scheme:

**( )** + **( )** $x$ + **( )** $\frac{x^2}{2!}$ + **( )** $\frac{x^3}{3!}$ + **( )** $\frac{x^4}{4!}$ + ...

then we compute the various derivatives of f(x) = sin x at x = 0 and insert them into **( )**. It's perhaps easiest to construct a table of f and its derivatives, and their value at x = a = 0:

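| n | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | etc. |
|---|---|---|---|---|---|---|---|---|---|
| $f^{(n)}(x)$ | sin x | cos x | -sin x | -cos x | sin x | cos x | -sin x | -cos x | etc. |
| $f^{(n)}(0)$ | 0 | 1 | 0 | -1 | 0 | 1 | 0 | -1 | etc. |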

Conveniently, the various derivative values just repeat the pattern: 0, 1, 0, -1, 0, 1, 0, -1, etc. etc. so the Taylor polynomials (at x = 0) are particularly easy to construct. P7(x), for example, is:

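$P_7(x)$ = **(0)** + **(1)** $x$ + **(0)** $\frac{x^2}{2!}$ + **(-1)** $\frac{x^3}{3!}$ + **(0)** $\frac{x^4}{4!}$ + **(1)** $\frac{x^5}{5!}$ + **(0)** $\frac{x^6}{6!}$ + **(-1)** $\frac{x^7}{7!}$

or $P_7(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!}$. Also, we have $P_1(x) = x$, $P_2(x) = x$, $P_3(x) = x - \frac{x^3}{3!}$, $P_4(x) = x - \frac{x^3}{3!}$.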

S: Whoa! P2 isn't even a polynomial of degree two! You said these were polynomials ...

P: Well, that's quite true. We can't call P2(x) the Taylor polynomial of degree 2 because it's only of degree 1 ... so let's just call it P2(x), the "second Taylor polynomial", okay? We won't say it's a 2nd degree polynomial. Similarly, P4(x) isn't a 4th degree polynomial since it's the same as P3(x) and you'd find that P6(x) is the same as P5(x) and P8(x) = P7(x) and so on.

S: So what's going on here?

P: If we insist that the second derivative of P2(x) is the same as the second derivative of f(x) = sin x, at x = 0, then the coefficient of x2 in P2(x) will be zero (because the 2nd derivative of sin x is zero, at x = 0) ... so there won't be a term in x2 ... so P2(x) will just be 1st degree. In other words, there just isn't a 2nd degree polynomial which has a zero 2nd derivative at x = 0. See?

S: I'll take your word for it. Can you plot a few? How good are they, these Pn approximations?

P: They're quite good so long as you don't stray too far from x = 0. Remember that we're matching derivatives at x = 0. Had we matched at x = 5 then the polynomials would be excellent near x = 5. But just look at those Taylor polynomials trying desperately to match the oscillations of f(x) = sin x. See? The more terms we add the better they do! Notice, too, that the polynomials eventually go off to infinity; they are, after all, polynomials and not sin x which just oscillates between +1 and -1 forever. Can you see which ones go to +∞ and which to -∞?

S: Every other one goes to +∞.

P: Why?

S: I haven't the fogg..

P: When the polynomial ends in a positive term, like $P_5(x)$ ends with $+\frac{x^5}{5!}$, that term takes the polynomial to +∞. When it ends with a negative term, like $P_7(x)$ ends with $-\frac{x^7}{7!}$, then that term takes the poly to -∞. See? The last term eventually dominates every other term when x is large, so I can tell you right now that $P_{123}(x)$ will look much like $\pm\frac{x^{123}}{123!}$ when x is very large, the sign depending upon whether the 123rd derivative of sin x is +1 or -1, at x = 0. Near x = 0, of course, $P_{123}(x)$ will look very much like sin x, following the ups and downs for quite a while before taking off to ±∞. See?
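The repeating pattern 0, 1, 0, -1 makes these polynomials easy to compute. The following is a sketch only, with an invented function name; it builds $P_n(x)$ for sin x about x = 0 directly from that pattern and evaluates it both near 0 and far from 0.

```python
import math

# Values of the derivatives of sin x at x = 0, repeating with period 4.
DERIVATIVE_PATTERN = [0, 1, 0, -1]


def sin_taylor(n, x):
    """n-th Taylor polynomial of sin x about x = 0, evaluated at x."""
    return sum(
        DERIVATIVE_PATTERN[k % 4] * x ** k / math.factorial(k)
        for k in range(n + 1)
    )


for n in (1, 3, 5, 7):
    # Near x = 0 the approximations close in on sin x.
    print(n, sin_taylor(n, 1.0), math.sin(1.0))

for n in (5, 7):
    # Far from x = 0 the last term dominates and sets the sign.
    print(n, sin_taylor(n, 20.0))
```

At x = 20 the value of $P_5$ is large and positive while $P_7$ is large and negative, matching the sign of the final term in each.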

**Example:** Determine the Taylor polynomials, about x = 0, for $f(x) = e^{-x}$.

**Solution:** We begin by evaluating the various derivatives at x = 0:

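| n | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | etc. |
|---|---|---|---|---|---|---|---|---|---|
| $f^{(n)}(x)$ | $e^{-x}$ | $-e^{-x}$ | $e^{-x}$ | $-e^{-x}$ | $e^{-x}$ | $-e^{-x}$ | $e^{-x}$ | $-e^{-x}$ | etc. |
| $f^{(n)}(0)$ | 1 | -1 | 1 | -1 | 1 | -1 | 1 | -1 | etc. |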

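Since the polynomial is an approximation about x = **0**, we set a = **0** in the general formula:

**( )** + **( )** $(x-a)$ + **( )** $\frac{(x-a)^2}{2!}$ + **( )** $\frac{(x-a)^3}{3!}$ + **( )** $\frac{(x-a)^4}{4!}$ + ...

to get

**( )** + **( )** $x$ + **( )** $\frac{x^2}{2!}$ + **( )** $\frac{x^3}{3!}$ + **( )** $\frac{x^4}{4!}$ + ... and substitute the derivatives, yielding:

**(1)** + **(-1)** $x$ + **(1)** $\frac{x^2}{2!}$ + **(-1)** $\frac{x^3}{3!}$ + **(1)** $\frac{x^4}{4!}$ + ... or simply

$1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \frac{x^4}{4!} - \frac{x^5}{5!} + \dots$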

If we plot the various Taylor polynomials, namely: $P_1 = 1 - x$, $P_2 = 1 - x + \frac{x^2}{2!}$, $P_3 = 1 - x + \frac{x^2}{2!} - \frac{x^3}{3!}$, $P_4 = 1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \frac{x^4}{4!}$ etc. it's clear that they get closer to $y = e^{-x}$, becoming better approximations not only near x = 0 but even farther from this point of approximation. However, $e^{-x} \to 0$ as $x \to \infty$ and that's something a polynomial can't do, so eventually the polynomials leave the exponential.

S: And I notice they take off to either +∞ or -∞ depending on whether the Taylor polynomial ends in a positive or a negative term.

P: Very good! Now here's a curious example. Pay attention and see if you can tell that something different is happening.

**Example:** Determine the Taylor polynomials for f(x) = _ln_ x about x = 1.

**Solution:** We construct a table of derivatives at x = 1:

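| n | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| $f^{(n)}(x)$ | _ln_ x | $\frac{1}{x}$ | $-\frac{1}{x^2}$ | $\frac{2!}{x^3}$ | $-\frac{3!}{x^4}$ | $\frac{4!}{x^5}$ |
| $f^{(n)}(1)$ | 0 | 1 | -1 | 2! | -3! | 4! |

and we can see the pattern of derivatives at x = 1: 0, 1, -1!, 2!, -3!, 4!, -5!, etc. etc.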

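Since the polynomial is an approximation about x = **1**, we set a = **1** in the general formula:

**( )** + **( )** $(x-a)$ + **( )** $\frac{(x-a)^2}{2!}$ + **( )** $\frac{(x-a)^3}{3!}$ + **( )** $\frac{(x-a)^4}{4!}$ + ... to get

**( )** + **( )** $(x-1)$ + **( )** $\frac{(x-1)^2}{2!}$ + **( )** $\frac{(x-1)^3}{3!}$ + **( )** $\frac{(x-1)^4}{4!}$ + ... and substitute the derivatives to get

**(0)** + **(1)** $(x-1)$ + **(-1)** $\frac{(x-1)^2}{2!}$ + **(2!)** $\frac{(x-1)^3}{3!}$ + **(-3!)** $\frac{(x-1)^4}{4!}$ + ... which simplifies to:

$(x-1) - \frac{(x-1)^2}{2} + \frac{(x-1)^3}{3} - \frac{(x-1)^4}{4} + \frac{(x-1)^5}{5} - \frac{(x-1)^6}{6} + \dots$

Then $P_1(x) = x-1$ and $P_2(x) = (x-1) - \frac{(x-1)^2}{2}$ and $P_3(x) = (x-1) - \frac{(x-1)^2}{2} + \frac{(x-1)^3}{3}$ etc.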

We can plot these polynomial approximations and anticipate that they will be excellent near x = 1:

P: Okay, can you see anything different happening?

S: Nope. Looks the same to me. Use a bigger polynomial ... I mean, one of higher degree, and you get a better approx ... wait a minute! P2 (x) is better than P10(x), or am I seeing things? At least when x is bigger than about 2.5, right?

P: In this example if you want a better and better approximation to ln x at, say, x = 2.5, you wouldn't use Taylor polynomials of higher and higher degree. Cute eh? In fact, the higher the degree the worse the approximation.

S: But not at x = .5, right? I mean, it looks like P10(x) is better than P2(x) if x is close enough to x = 1, right?

P: Right. In fact the higher the degree of the Taylor the better the approximation ... but only for values of x which lie ... where?

S: Uh ... from the picture ... uh, I'd say for x between 0 and 2, maybe more. But how can you tell? I mean, do you know beforehand where the Taylor polys won't be any good? And I've been meaning to ask you ... you seem to like to find the Taylor polys by just getting something like $1 - x + x^2/2! - x^3/3! + x^4/4! - \dots$ with the DOT DOT DOT at the end, then just picking out the poly you want. But what happens if you keep all the terms? Then you'd get a polynomial that goes on forever. What then?

P: I thought you'd never ask.

LECTURE 8

INFINITE POWER SERIES

TAYLOR SERIES

If we terminate the series $f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \dots$ we get a Taylor polynomial for f(x) about x = a. If we don't terminate, but retain the _infinite_ series, we get what is called the **Taylor Series for f(x) about x = a**. It's a special case of a so-called POWER SERIES ... meaning a sum of an infinite number of terms, each of which is a constant times a power of (x - a).

Although one might expect that polynomials of higher and higher degree would give better and better approximations ... and the infinite series might give exact values ... that is not always the case.

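$f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \dots = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n$

is called the Taylor Series for f(x), about x = a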

**Example** : Compute _ln_ 1.5 using Taylor polynomial approximations for f(x) = _ln_ x about x = 1.

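**Solution:** We use a computer algebra system to illustrate. First note that the polynomials have the form

$(x-1) - \frac{(x-1)^2}{2} + \frac{(x-1)^3}{3} - \frac{(x-1)^4}{4} + \dots$ and, for x = 1.5, we get $(.5) - \frac{(.5)^2}{2} + \frac{(.5)^3}{3} - \frac{(.5)^4}{4} + \dots$ which can be written using Sigma Notation as: $P_n(1.5) = \sum_{k=1}^{n} (-1)^{k+1} \frac{(.5)^k}{k}$. We'll use this, telling the computer that each polynomial is the **sum** of terms of the form $(-1)^{k+1} \frac{(.5)^k}{k}$, from k = 1 to k = n and we'll write this (for the benefit of the computer) as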

sum((-1)^(k+1)*.5^k/k,k=1..n)

• P1:=sum((-1)^(k+1)*.5^k/k,k=1..1);

P1 := .5

• P2:=sum((-1)^(k+1)*.5^k/k,k=1..2);

P2 := .3750000000

• P3:=sum((-1)^(k+1)*.5^k/k,k=1..3);

P3 := .4166666667

• P4:=sum((-1)^(k+1)*.5^k/k,k=1..4);

P4 := .4010416667

• P5:=sum((-1)^(k+1)*.5^k/k,k=1..5);

P5 := .4072916667

• P10:=sum((-1)^(k+1)*.5^k/k,k=1..10);

P10 := .4054346479

• P15:=sum((-1)^(k+1)*.5^k/k,k=1..15);

P15 := .4054657570

• P20:=sum((-1)^(k+1)*.5^k/k,k=1..20);

P20 := .4054650929

• P50:=sum((-1)^(k+1)*.5^k/k,k=1..50);

P50 := .4054651084

• log(1.5);

.4054651081

Notice that the values of Pn(1.5) converge to the exact value: _ln_ 1.5 = .4054651081 (to 10 digits). Now we'll do the same, but at x = 2.5

**Example** : Approximate _ln_ 2.5 using Taylor polynomial approximations for f(x) = _ln_ x about x = 1.

**Solution:** We use the computer again:

• P1:=sum((-1)^(k+1)*1.5^k/k,k=1..1);

P1 := 1.5

• P2:=sum((-1)^(k+1)*1.5^k/k,k=1..2);

P2 := .375000000

• P3:=sum((-1)^(k+1)*1.5^k/k,k=1..3);

P3 := 1.500000000

• P4:=sum((-1)^(k+1)*1.5^k/k,k=1..4);

P4 := .234375000

• P5:=sum((-1)^(k+1)*1.5^k/k,k=1..5);

P5 := 1.753125000

• P10:=sum((-1)^(k+1)*1.5^k/k,k=1..10);

P10 := -2.403097097

• P15:=sum((-1)^(k+1)*1.5^k/k,k=1..15);

P15 := 17.95968981

• P20:=sum((-1)^(k+1)*1.5^k/k,k=1..20);

P20 := -96.82858435

• P50:=sum((-1)^(k+1)*1.5^k/k,k=1..50);

P50 := -7590011.528

• log(2.5);

.9162907319

Now the approximations become worse as the degree of the Taylor polynomial increases. Indeed, evaluating P2(x) at x = 2.5 is about the best we can do!
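The two computer sessions above can be reproduced in a few lines of Python. This is a sketch with an invented function name, not the original session; it evaluates the same sums at x = 1.5 and x = 2.5 and prints them beside the exact logarithm.

```python
import math


def ln_taylor(n, x):
    """n-th Taylor polynomial of ln x about x = 1: sum of (-1)^(k+1) (x-1)^k / k."""
    return sum((-1) ** (k + 1) * (x - 1) ** k / k for k in range(1, n + 1))


for x in (1.5, 2.5):
    print(f"x = {x}, ln x = {math.log(x):.10f}")
    for n in (1, 2, 3, 4, 5, 10, 15, 20, 50):
        print(f"  P{n} = {ln_taylor(n, x):.10g}")
```

At x = 1.5 the printed values settle onto ln 1.5, while at x = 2.5 they swing back and forth with growing size, just as in the sessions above.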

**S:** That's pretty lousy! I mean, what good are these polys if ...

**P:** Patience. All will become clear.

CONVERGENCE of TAYLOR SERIES

We look carefully at the infinite series we obtained above for evaluating _ln_ 1.5, namely $(.5) - \frac{(.5)^2}{2} + \frac{(.5)^3}{3} - \frac{(.5)^4}{4} + \dots$ and test it for convergence using the RATIO TEST. A typical term has absolute value $|a_n| = \frac{(.5)^n}{n}$ so the ratio of absolute values of successive terms is $\frac{|a_{n+1}|}{|a_n|} = \frac{(.5)^{n+1}/(n+1)}{(.5)^n/n} = (.5)\frac{n}{n+1} \to .5$ and this is less than 1, so the infinite series converges absolutely. In fact, $(.5) - \frac{(.5)^2}{2} + \frac{(.5)^3}{3} - \frac{(.5)^4}{4} + \dots$ converges to _ln_ 1.5 = .4054651081 (to 10 digits). On the other hand, if we consider the series obtained above (presumably for evaluating _ln_ 2.5), namely $(1.5) - \frac{(1.5)^2}{2} + \frac{(1.5)^3}{3} - \frac{(1.5)^4}{4} + \dots$ we get for the ratio $\frac{|a_{n+1}|}{|a_n|} = \frac{(1.5)^{n+1}/(n+1)}{(1.5)^n/n} = (1.5)\frac{n}{n+1} \to 1.5$ (as $n \to \infty$) which is greater than 1, so the infinite series diverges (as can be seen from the computer calculations). It's not surprising, then, that the Taylor polynomials do NOT provide better and better approximations at x = 2.5, and, in fact, the polynomial values diverge.

S: Hold on ... I've got a great idea. Let's try x = 1.6, 1.7, and so on until we get to 2.5, then we'll see where it fails to converge. Good idea, eh?

P: I have a better idea. Let's just pretend we've substituted one of your values into the Taylor series, then use the RATIO test to see if it converges.

S: But how can you tell ...

P: Patience.

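We substitute x = z into $(x-1) - \frac{(x-1)^2}{2} + \frac{(x-1)^3}{3} - \frac{(x-1)^4}{4} + \cdots$ and take the ratio of absolute values to get $\frac{|a_{n+1}|}{|a_n|} = \frac{|z-1|^{n+1}}{n+1} \cdot \frac{n}{|z-1|^n} = \frac{n}{n+1}\,|z-1| \to |z-1|$ and we need this to be less than 1 for convergence, that is | z - 1 | < 1 or

-1 < z - 1 < 1 or 0 < z < 2. Conclusion? The Taylor series $(x-1) - \frac{(x-1)^2}{2} + \frac{(x-1)^3}{3} - \frac{(x-1)^4}{4} + \cdots = \sum_{n=1}^{\infty} (-1)^{n+1} \frac{(x-1)^n}{n}$ converges when x = z and 0 < z < 2 and diverges if z < 0 or z > 2. Or, to put it more simply:

$\sum_{n=1}^{\infty} (-1)^{n+1} \frac{(x-1)^n}{n}$ converges if 0 < x < 2 and diverges if x < 0 or x > 2.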

S: How do you know that? I mean, the bit about divergence?

P: If the limit of the ratio is greater than 1, the series will diverge. See? We substitute x = z and get the limiting value of the ratio to be | z - 1 | ... then we ask: "when is this greater than 1?" It's when | z - 1 | > 1 and we have to solve this inequality. Let's see you do it.

S: Huh? Me? Uh ... well, let's see. I want | z - 1 | > 1 so that means ... uh ... I give up.

P: I'll give you a hint. Pretend you're asking: "when is | u | > 1?" Maybe it's the z - 1 that scares you ... so what values of u will make | u | > 1?

S: That's easy. Either u > 1 or u < -1. Good, eh?

P: Now get back to the problem at hand.

S: Oh, yeah, we want | z - 1 | > 1 so that means either z - 1 > 1 or z - 1 < -1, hence either z > 2 or z < 0. Hey! That's what you got!

P: Pay attention and we'll do some more, but first let me tell you what you can expect:

A series $a_0 + a_1 (x - a) + a_2 (x - a)^2 + a_3 (x - a)^3 + \cdots = \sum_{n=0}^{\infty} a_n (x - a)^n$

will always converge for all values of x in some interval about x = a.

Depending upon the coefficients $a_0, a_1, a_2, \ldots$ any of the following possibilities may occur:

• The series converges _only_ for x = a and for no other values of x (so the interval about x = a has zero width).

• The series converges for all values of x in an interval centred on x = a, of the form a - R < x < a + R or perhaps a - R ≤ x < a + R or perhaps a - R < x ≤ a + R or perhaps a - R ≤ x ≤ a + R.

• The series converges for _every_ real number x (so the interval about x = a has infinite width).

If the series converges only for values of x lying in some interval such as a - R < x < a + R, then the number R is called the RADIUS OF CONVERGENCE of the series. For example, $\sum_{n=1}^{\infty} (-1)^{n+1} \frac{(x-1)^n}{n}$ converges in an interval about x = 1, namely 0 < x < 2, and the _radius of convergence_ is R = 1 (since the interval 0 < x < 2 can be described by 1 - R < x < 1 + R with R = 1). We can also check that, when x = 0, the series becomes $\sum_{n=1}^{\infty} (-1)^{n+1} \frac{(-1)^n}{n} = -\sum_{n=1}^{\infty} \frac{1}{n}$ which is the negative of the divergent harmonic series ... so the series diverges at x = 0. However, the series does, in fact, converge at x = 2. To see this, we substitute x = 2 and get $\sum_{n=1}^{\infty} (-1)^{n+1} \frac{1}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + - \cdots$ which is the convergent alternating harmonic series. Finally, when x > 2 or x < 0 the RATIO test tells us that the limiting value of the ratio exceeds 1, so the series definitely diverges. We conclude that the only values of x for which the series converges are those lying in 0 < x ≤ 2. The radius of convergence, however, is still R = 1.

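S: Hey! If $\ln x = \sum_{n=1}^{\infty} (-1)^{n+1} \frac{(x-1)^n}{n} = (x-1) - \frac{(x-1)^2}{2} + \frac{(x-1)^3}{3} - \frac{(x-1)^4}{4} + \cdots$ and if this series converges at x = 2, then it must mean that $\ln 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + - \cdots$, right? I mean, the alternating harmonic series actually adds up to the number ln 2.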

P: You got it.
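The behaviour at the two endpoints can be checked numerically. The following is only a sketch in Python (the function name `series_partial_sum` is ours, chosen for illustration): at x = 2 the partial sums should creep toward ln 2, while at x = 0 they should keep drifting toward -∞.

```python
import math


def series_partial_sum(x: float, n_terms: int) -> float:
    """Partial sum of sum_{k>=1} (-1)^(k+1) (x-1)^k / k with n_terms terms."""
    return sum((-1) ** (k + 1) * (x - 1) ** k / k for k in range(1, n_terms + 1))


print(f"ln 2 = {math.log(2):.6f}")
for n in (10, 100, 1000, 10000):
    at_two = series_partial_sum(2.0, n)    # alternating harmonic series
    at_zero = series_partial_sum(0.0, n)   # minus the harmonic series
    print(f"n = {n:>5}: x = 2 gives {at_two:.6f}, x = 0 gives {at_zero:.6f}")
```

Note how slowly the sums at x = 2 approach ln 2: convergence at an endpoint of the interval is typically sluggish.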

**Note:** The Taylor series about x = 0 has another name: it's also called the MACLAURIN SERIES.

**Example:** The Maclaurin series (or Taylor series about x = 0) for each of the following functions is given. Determine its radius of convergence:

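(a) f(x) = $e^x$ has Taylor series $1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{n=0}^{\infty} \frac{x^n}{n!}$

(b) f(x) = _ln_ (1 + x) has Taylor series $x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + - \cdots = \sum_{n=1}^{\infty} (-1)^{n+1} \frac{x^n}{n}$

(c) f(x) = sin x has Taylor series $x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + - \cdots = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!}$

(d) f(x) = cos x has Taylor series $1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + - \cdots = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}$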

Solution:

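(a) $\left|\frac{a_{n+1}}{a_n}\right| = \left|\frac{x^{n+1}}{(n+1)!} \cdot \frac{n!}{x^n}\right| = \frac{1}{n+1}\,|x| \to 0$ as n → ∞ for every value of x, so the series converges (absolutely) for every value of x. In fact, $e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$ for all x.

(b) $\left|\frac{a_{n+1}}{a_n}\right| = \left|\frac{x^{n+1}}{n+1} \cdot \frac{n}{x^n}\right| = \frac{n}{n+1}\,|x| \to |x|$, and | x | < 1 when - 1 < x < 1, and the series converges for these values of x.

In fact, as long as -1 < x ≤ 1, $\ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + - \cdots$ (which also gives $\ln 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + - \cdots$)

(c) The ratio of successive terms (in absolute value) is $\left|\frac{x^{2n+3}}{(2n+3)!} \cdot \frac{(2n+1)!}{x^{2n+1}}\right| = \frac{x^2}{(2n+3)(2n+2)} \to 0$ as n → ∞ so this series converges for every x. In fact, $\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + - \cdots$ for all x.

(d) The ratio of successive terms (in absolute value) is $\left|\frac{x^{2n+2}}{(2n+2)!} \cdot \frac{(2n)!}{x^{2n}}\right| = \frac{x^2}{(2n+2)(2n+1)} \to 0$ as n → ∞ so this series converges for every x. In fact, $\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + - \cdots$ for all x.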

S: Hold on! For (a) and (b) you talk about $\left|\frac{a_{n+1}}{a_n}\right|$, but for (c) and (d) you just say "the ratio of successive terms". Something fishy goin' on here, right?

P: Well, okay, I just wanted to avoid confusing you. You see, it's not really important, when you write the ratio as $\left|\frac{a_{n+1}}{a_n}\right|$, whether the expressions you substitute for $a_{n+1}$ and $a_n$ are really the (n+1)st and nth terms. They may be the (n+2)nd and (n+1)st. See? You need only take the ratio of successive terms, then find the limit as n→∞. In particular, for the sine series, I really don't want to waste time finding out whether $\frac{x^{2n+1}}{(2n+1)!}$ is the (n+1)st term or the nth term ... I just want to write down two successive terms and take the limit of their ratio. See? In fact, for the $e^x$ series, namely $1 + x + \frac{x^2}{2!} + \cdots$ the term $\frac{x^n}{n!}$ is actually the (n+1)st term. See?

S: It isn't really that hard to find out what the nth term is. I can show you how if you'd like. Just count 'em. For the sine series: $x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$ the 1st term has the x, the 2nd has the $x^3$, the 3rd has the $x^5$ ... uh, that's confusing, isn't it? I mean, what's the 100th term, for example?

P: Do you really want to know?

S: Will it be on the final exam?

**Example:** Prove that the series $\sum \frac{n!}{x^n}$ diverges for every value of x different from 0 (... the terms aren't numbers when x = 0!)

**Solution:** The ratio of successive terms (in absolute value) is $\left|\frac{(n+1)!}{x^{n+1}} \cdot \frac{x^n}{n!}\right| = (n+1)\,\frac{1}{|x|} \to \infty$ regardless of the value of x, so the ratio test says the series will diverge no matter _what_ x is substituted (for x ≠ 0, of course).

S: Wait a minute! You said that a series will always converge in some interval about x = a, didn't you?

P: I was talking about a series of the form $a_0 + a_1 (x - a) + a_2 (x - a)^2 + a_3 (x - a)^3 + \cdots = \sum_{n=0}^{\infty} a_n (x - a)^n$ and the series $\sum \frac{n!}{x^n}$ doesn't have that form. In fact, a series of the form $a_0 + a_1 (x - a) + a_2 (x - a)^2 + a_3 (x - a)^3 + \cdots$ is called a POWER SERIES. It's pretty obvious that a power series will always converge at x = a. After all, every term is zero except the first ... so it definitely converges. See? The big question is: "How far can x move from 'a' and still give a convergent series?" Remember, the terms get bigger when (x - a) gets bigger, and to converge, the terms have to get small enough fast enough, as n increases. Sometimes even the tiniest deviation from x = a makes the series diverge. Sometimes x can move arbitrarily far from x = a and the series will still converge. In that case, the coefficients themselves get so small so fast that even large values of (x - a) won't hinder this convergence.

S: You mean the numbers $a_1, a_2, a_3, \ldots$ can get so small that even if x - a = 1,000,000,000 the series $a_0 + a_1 (x - a) + a_2 (x - a)^2 + a_3 (x - a)^3 + \cdots$ will still converge? That's hard to believe.

P: You haven't been paying attention. We've already seen that in earlier examples ... but I'll do another.

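**Example:** Determine the Taylor series for sin x about $x = \frac{\pi}{2}$ and calculate its radius of convergence.

**Solution:** We construct a table of derivatives, evaluate each at $x = \frac{\pi}{2}$, then substitute into

$f(a) + f'(a)\,(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \cdots$ with $a = \frac{\pi}{2}$. That is, we substitute for **( )** in:

**( )** + **( )** $\left(x - \frac{\pi}{2}\right)$ + **( )** $\frac{\left(x - \frac{\pi}{2}\right)^2}{2!}$ + **( )** $\frac{\left(x - \frac{\pi}{2}\right)^3}{3!}$ + **( )** $\frac{\left(x - \frac{\pi}{2}\right)^4}{4!}$ + ...

(It's important to note that every Taylor series "about $x = \frac{\pi}{2}$" will have this form. It's only the derivatives of the function that change from one series to another!)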

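| n | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | etc. |
|---|---|---|---|---|---|---|---|---|---|
| $f^{(n)}(x)$ | sin x | cos x | -sin x | -cos x | sin x | cos x | -sin x | -cos x | etc. |
| $f^{(n)}\left(\frac{\pi}{2}\right)$ | 1 | 0 | -1 | 0 | 1 | 0 | -1 | 0 | etc. |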

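These clearly repeat the pattern: 1 0 -1 0 1 0 -1 0 and so on, so we just substitute and get:

**(1)** + **(0)** $\left(x - \frac{\pi}{2}\right)$ + **(-1)** $\frac{\left(x - \frac{\pi}{2}\right)^2}{2!}$ + **(0)** $\frac{\left(x - \frac{\pi}{2}\right)^3}{3!}$ + **(1)** $\frac{\left(x - \frac{\pi}{2}\right)^4}{4!}$ + ... or

$1 - \frac{\left(x - \frac{\pi}{2}\right)^2}{2!} + \frac{\left(x - \frac{\pi}{2}\right)^4}{4!} - \frac{\left(x - \frac{\pi}{2}\right)^6}{6!} + \cdots$ and if we take the ratio of successive terms (in absolute value) we get:

$\frac{\left|x - \frac{\pi}{2}\right|^{2n+2}}{(2n+2)!} \cdot \frac{(2n)!}{\left|x - \frac{\pi}{2}\right|^{2n}} = \frac{1}{(2n+2)(2n+1)} \left(x - \frac{\pi}{2}\right)^2 \to 0$ as n→∞ _regardless_ of the value of x; the series converges for all x.

In fact, $\sin x = 1 - \frac{\left(x - \frac{\pi}{2}\right)^2}{2!} + \frac{\left(x - \frac{\pi}{2}\right)^4}{4!} - \frac{\left(x - \frac{\pi}{2}\right)^6}{6!} + \cdots$ for all x. You may recognize this series. Compare it to the series previously obtained for cos x, namely $1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$ so we conclude that $\sin x = \cos\left(x - \frac{\pi}{2}\right)$ which is a familiar trig identity!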

S: Okay, so it converges for all x ... but you asked for the radius of convergence. What is it?

P: It's R = ∞. Sounds funny, eh? If a series converges for all x we say it has an infinite radius of convergence.

S: That's not so funny as the name: "radius" of convergence. I don't see any circles around, so how come "radius"?

P: Well, for the general power series $a_0 + a_1 (x - a) + a_2 (x - a)^2 + a_3 (x - a)^3 + \cdots$ which might be some Taylor series about x = a, if the radius of convergence is R, then we imagine x = a as being the centre of a circle of radius R. All values of x inside this circle will yield a convergent series. See?

S: A picture is worth a thousand words. Remember?

P: Okay, here's the picture ===>>>

Happy?

S: Yeah ... happy.

LECTURE 9

MORE ON SERIES

Estimating the Sum of an Alternating Taylor Series

We've already seen that an alternating series $a_1 - a_2 + a_3 - a_4 + - \cdots$ converges if the terms $a_n$ _decrease to zero_. If this is the case then every successive pair of partial sums provides an upper and lower estimate of the infinite sum. By this I mean that, if $S = a_1 - a_2 + a_3 - a_4 + - \cdots$ is the sum of this convergent series, then $S \le a_1$ and $S \ge a_1 - a_2$ and $S \le a_1 - a_2 + a_3$ and so on. To remind ourselves of this we sketch, again, the partial sums of such a series:

Note that the partial sums converge to some limiting value, the SUM of the infinite series ... and we're calling this S. Note, too, that every time we subtract a term the partial sum is _less_ than S and every time we add a term the partial sum is _greater_ than S. That means we can stop summing terms whenever we get tired and look at the last two partial sums we've obtained. The sum of the _infinite_ series lies between these two numbers. In fact, we can restate this by saying **the error is less than the first neglected term** , meaning that the last partial sum we computed is nearer S than the very next term we've neglected. In the plot above, had we stopped summing after 5 terms, then $a_1 - a_2 + a_3 - a_4 + a_5$ is within $a_6$ of the infinite sum. This is particularly useful when we use a Taylor series to calculate the value of a function --- if we're lucky, the Taylor series will be alternating!

S: Can you really prove that just by looking at the graph? I mean, is that a proof that $S \le a_1$ and $S \ge a_1 - a_2$ and $S \le a_1 - a_2 + a_3$ and so on? What if I had a lousy graph? What if ...

P: Okay, that's a good point. Now watch this ... the magic of brackets. I write

$S = a_1 - (a_2 - a_3) - (a_4 - a_5) - (a_6 - a_7) - \cdots$ See? I've just inserted brackets and the numbers inside each bracket are positive (because the terms are getting smaller so $a_2 \ge a_3$ and $a_4 \ge a_5$ etc.) so S is $a_1$ minus a bunch of stuff. See? That makes $S \le a_1$. In the same way I can write $S = a_1 - a_2 + (a_3 - a_4) + (a_5 - a_6) + \cdots$ so S is $a_1 - a_2$ plus a bunch of stuff. See? That makes $S \ge a_1 - a_2$. Nice, eh? And I can do this any number of times and show that S always lies between two successive partial sums. That's the magic of brackets.
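The bracketing argument says the sum is always trapped between the last two partial sums. Here is a small Python sketch of that idea (the function `alternating_bounds` and the choice of the alternating harmonic series as a test case are ours, for illustration only):

```python
import math
from typing import Callable, Tuple


def alternating_bounds(term: Callable[[int], float], n_terms: int) -> Tuple[float, float]:
    """Return (lower, upper) from the last two partial sums of a1 - a2 + a3 - ...

    `term(k)` must return the positive number a_k, and the a_k are assumed
    to decrease to zero (otherwise the bounds are not guaranteed).
    """
    previous = 0.0
    partial = 0.0
    for k in range(1, n_terms + 1):
        previous = partial
        partial += (-1) ** (k + 1) * term(k)
    return min(previous, partial), max(previous, partial)


# Alternating harmonic series 1 - 1/2 + 1/3 - ..., whose sum is ln 2.
lower, upper = alternating_bounds(lambda k: 1.0 / k, 6)
print(f"{lower:.6f} <= S <= {upper:.6f}   (ln 2 = {math.log(2):.6f})")
```

The width of the interval is exactly the last term added, which is another way of saying that the error is less than the first neglected term.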

**Example:** Calculate sin 47˚ with an error less than $10^{-5}$ = .00001 using an appropriate Taylor series.

**Solution:** Since the Taylor series for sin x, namely $\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$ converges for all x, we can use it to compute sin 47˚. To this end we substitute x = 47˚ in RADIANS (!!!), that is, $x = \frac{47\pi}{180}$ and get

$\sin 47^\circ = \frac{47\pi}{180} - \frac{1}{3!}\left(\frac{47\pi}{180}\right)^3 + \frac{1}{5!}\left(\frac{47\pi}{180}\right)^5 - + \cdots$ and since the series is alternating, we'll keep summing terms until we reach a term with magnitude less than $10^{-5}$ ... then we'll stop. This gives:

sin 47˚ ≈ .8203047484 - .0919971612 + .0030952439 - .0000495902 ≈ .73135

where we've had to sum four terms until we got to $\frac{1}{9!}\left(\frac{47\pi}{180}\right)^9$ ≈ .00000046, a term with a magnitude less than .00001 and that means ...

S: That means we can stop.

P: Right, but hold on. We can do even better.

The Taylor series about x = 0 works here only because the factorials in the denominators shrink the terms: 47˚ is a fair distance from 0, and at $x = \frac{47\pi}{180}$ = 47˚ the powers of x (about .82) hardly decrease at all. It would have been better to use the Taylor series about, say, $x = \frac{\pi}{4}$ = 45˚ since then the powers of $x - \frac{\pi}{4}$ are tiny, and the various derivatives of sin x (which we'll need to determine the series) are easily computed at 45˚. So we use

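$f(a) + f'(a)\,(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \cdots$ with $a = \frac{\pi}{4}$. That is, we substitute for **( )** in:

**( )** + **( )** $\left(x - \frac{\pi}{4}\right)$ + **( )** $\frac{\left(x - \frac{\pi}{4}\right)^2}{2!}$ + **( )** $\frac{\left(x - \frac{\pi}{4}\right)^3}{3!}$ + **( )** $\frac{\left(x - \frac{\pi}{4}\right)^4}{4!}$ + ...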

We'll need to compute the derivatives at π/4:

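| n | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | etc. |
|---|---|---|---|---|---|---|---|---|---|
| $f^{(n)}(x)$ | sin x | cos x | -sin x | -cos x | sin x | cos x | -sin x | -cos x | etc. |
| $f^{(n)}\left(\frac{\pi}{4}\right)$ | $\frac{1}{\sqrt{2}}$ | $\frac{1}{\sqrt{2}}$ | $-\frac{1}{\sqrt{2}}$ | $-\frac{1}{\sqrt{2}}$ | $\frac{1}{\sqrt{2}}$ | $\frac{1}{\sqrt{2}}$ | $-\frac{1}{\sqrt{2}}$ | $-\frac{1}{\sqrt{2}}$ | etc. |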

The pattern is clear, so we just substitute and get:

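$\frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}}\left(x - \frac{\pi}{4}\right) - \frac{1}{\sqrt{2}}\frac{\left(x - \frac{\pi}{4}\right)^2}{2!} - \frac{1}{\sqrt{2}}\frac{\left(x - \frac{\pi}{4}\right)^3}{3!} + \frac{1}{\sqrt{2}}\frac{\left(x - \frac{\pi}{4}\right)^4}{4!} + \frac{1}{\sqrt{2}}\frac{\left(x - \frac{\pi}{4}\right)^5}{5!} - - + + \cdots$

Now we substitute x = 47˚ (in RADIANS!!!) or, better, substitute $x - \frac{\pi}{4}$ = 2˚ (in RADIANS!!!) = $\frac{2\pi}{180} = \frac{\pi}{90}$. Now $\frac{\pi}{90}$ is small, about .035, so the terms will get small quickly and we won't have to sum too many before we reach one that's less than $10^{-5}$.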

S: Ha! Gotcha! Your series ain't alternating!

P: Uh ... yes, you're quite right ... uh, well, we'll just have to add two terms then subtract two terms then add two and so on ... so we can consider it alternating if we take pairs of terms, right?

S: Huh?

P: If I have a series $a_1 + a_2 - a_3 - a_4 + a_5 + a_6 - a_7 - a_8 + + - - \cdots$ then I can rewrite it as

$(a_1 + a_2) - (a_3 + a_4) + (a_5 + a_6) - (a_7 + a_8) + - \cdots$ and voila! I have an alternating series! Nice, eh? That's the magic of brackets!

S: Sounds like cheating to me.

P: The only thing I have to watch for is the error. I stop summing my sine series when two terms add to less than $10^{-5}$. See? I have an alternating series where each term is the sum of two terms of the original series ... so I keep summing until ...

S: Yeah, yeah, I see.

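Substituting $x - \frac{\pi}{4} = \frac{\pi}{90}$ gives the "alternating" series:

$\left\{\frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}}\left(\frac{\pi}{90}\right)\right\} - \left\{\frac{1}{\sqrt{2}}\frac{1}{2!}\left(\frac{\pi}{90}\right)^2 + \frac{1}{\sqrt{2}}\frac{1}{3!}\left(\frac{\pi}{90}\right)^3\right\} + \left\{\frac{1}{\sqrt{2}}\frac{1}{4!}\left(\frac{\pi}{90}\right)^4 + \frac{1}{\sqrt{2}}\frac{1}{5!}\left(\frac{\pi}{90}\right)^5\right\} - + \cdots$

= {.7071067812+.02468268300} - {.0004307940867+.000005012516806} + {.0000000437+.0000000003} ...

= 0.7317894642 - 0.0004358066 + 0.0000000440 - + ...

≈ 0.73135 which is sin 47˚ with an error less than $10^{-5}$.

Note how quickly the terms decrease because of the increasing powers of $\frac{\pi}{90}$ ... and the n! helps too, and we only needed 4 terms. We can neglect the terms which add only 0.0000000440, and that's less than the specified error.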
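To see the individual terms of both expansions, here is a rough Python sketch (the function names are ours and purely illustrative). It lists the terms of the series for sin x about x = 0 and about x = π/4, evaluated at 47˚, so their sizes can be compared directly.

```python
import math


def sin_terms_about_zero(x: float, n_terms: int) -> list:
    """Signed terms (-1)^n x^(2n+1) / (2n+1)! of the series for sin x about 0."""
    return [(-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
            for n in range(n_terms)]


def sin_terms_about_quarter_pi(h: float, n_terms: int) -> list:
    """Signed terms f^(n)(pi/4) h^n / n! for f = sin, where h = x - pi/4.

    The derivatives at pi/4 follow the sign pattern +, +, -, -, repeated,
    each with magnitude 1/sqrt(2).
    """
    c = 1.0 / math.sqrt(2.0)
    signs = (1, 1, -1, -1)
    return [signs[n % 4] * c * h ** n / math.factorial(n) for n in range(n_terms)]


x = math.radians(47)   # 47 degrees in radians
h = math.radians(2)    # 47 degrees minus 45 degrees, in radians

print("about 0:   ", [f"{t:.3e}" for t in sin_terms_about_zero(x, 5)])
print("about pi/4:", [f"{t:.3e}" for t in sin_terms_about_quarter_pi(h, 6)])
print("4-term sums:", sum(sin_terms_about_zero(x, 4)),
      sum(sin_terms_about_quarter_pi(h, 4)), "exact:", math.sin(x))
```

Both four-term sums should agree with sin 47˚ to better than .00001, with the expansion about π/4 leaving the smaller remainder.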

S: I have a theorem for you. Whenever you've got an n! in the denominator, the series will converge. How's that?

P: Not good. Remember that a Taylor series has the form $f(a) + f'(a)\,(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \cdots$ so there's an n! in each denominator ... yet Taylor series don't always converge for all values of x. Sometimes, the derivatives of f(x) get larger even faster than the n! in the denominator, and as (x-a) gets larger, the terms might get too large ... or at least not "small enough fast enough". That poor n! has a lot to do if it wants to drag the terms to zero fast enough. Sometimes it loses the battle. That's the case with ln (1+x), remember? If x = 2 (for example), the Taylor series about x = 0 diverges. However, n! often succeeds. Look at the series for $e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \cdots$ This series will converge even if x = 1000 or 1,000,000 or $10^{100}$. That's because n! wins against powers of x, no matter how large x gets.

S: You mean n! is really big, right? I mean, when n gets large, then n! is much larger than $1{,}000{,}000^n$. I'm not sure I really believe that. I mean, $1{,}000{,}000^{100}$ is pretty big ... much bigger than 100!

P: Do you want to know how big 100! is?

S: Oh oh ... I can see this coming. Don't tell me. It's huge, right?

P: Much too large for my $4.95 calculator, so I'll ask Maple.

• evalf(100!);

.9332621544*10^158

S: But that's not bigger than $1{,}000{,}000^{100}$ ... I mean, $1{,}000{,}000^{100} = (10^6)^{100} = 10^{600}$ and that is much bigger, just like I said.

P: What that means is, when you substitute x = 1,000,000 into $e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \cdots$ you have to take a lot of terms before they start to get small enough fast enough; but they will, believe me.

S: I hate to make a fuss about this ... but I don't believe you. I mean ...

P: Okay, let's look at $1{,}000{,}000^n$ versus n!, as n increases. First, notice that $1{,}000{,}000^n$ is a product of n factors, and each one is identical, namely 1,000,000. Now look at n! which is also a product of n factors ...

S: Yeah, but they're (1)(2)(3) ... and that means they're small!

P: Patience ... n! can wait around until its factors are MUCH larger. Remember, $1{,}000{,}000^n$ just has factors of 1,000,000 but n! eventually has factors of $1{,}000{,}000^{1{,}000{,}000}$ and even larger. Do you really think that $1{,}000{,}000^n$ can keep up? Not likely. In fact, let's compare $1{,}000{,}000^n$ with n! by taking their ratio. After all, that's precisely what we're doing when we investigate the size of terms $a_n = \frac{x^n}{n!}$, for x = 1,000,000. Now these terms eventually decrease, meaning that n! is increasing more rapidly than $1{,}000{,}000^n$. If we look again at what the RATIO test is telling us, we consider $\frac{a_{n+1}}{a_n} = \frac{x^{n+1}}{(n+1)!} \cdot \frac{n!}{x^n} = \frac{x}{n+1}$ which is less than 1 when n + 1 > x. In other words, if x = 1,000,000, the terms $a_n$ will start to decrease after the 1,000,000th term because n+1 > x. See? Thereafter, the n! really goes to town. After 2,000,000 terms they're decreasing by at least a factor $\frac{x}{n+1} \approx \frac{1{,}000{,}000}{2{,}000{,}000} = \frac{1}{2}$ and after 10,000,000 terms they're decreasing by at least a factor $\frac{1{,}000{,}000}{10{,}000{,}000} = \frac{1}{10}$ and after ...

S: Okay, I get the idea. So n! gets big ... if we wait long enough. But I know something even bigger.

P: What?

S: $e^n$ is bigger, because $e^x$ grows exponentially and you always said that ...

P: You can't be serious? $e^n$ isn't even as large as $1{,}000{,}000^n$ because e isn't as large as 1,000,000.

S: But what about all that explosive exponential growth stuff?

P: Remember that $1{,}000{,}000^x$ is also an exponential function, just like $e^x$ ... and it grows faster ... but not so fast as n! because ...

S: Okay, tell me something that grows faster than n!

P: $e^{n!}$

S: Aha! And $1{,}000{,}000^{n!}$ grows even faster than that!

P: Smart fellow. Now where were we?
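Before moving on, the race between $1{,}000{,}000^n$ and n! can be checked numerically. The numbers are far too big to compute directly, so this Python sketch compares their logarithms instead, using `math.lgamma` (since ln n! = lgamma(n+1)). The function names are ours, chosen only for illustration.

```python
import math


def log_ratio(n: int, x: float = 1_000_000.0) -> float:
    """ln(x^n / n!), computed with logarithms so nothing overflows."""
    return n * math.log(x) - math.lgamma(n + 1)


def first_n_where_factorial_wins(x: float = 1_000_000.0) -> int:
    """Smallest n >= 1 with n! > x^n, found by doubling and then bisection.

    This relies on ln(x^n / n!) changing sign only once as n increases.
    """
    hi = 1
    while log_ratio(hi, x) >= 0:
        hi *= 2
    lo = hi // 2
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if log_ratio(mid, x) >= 0:
            lo = mid
        else:
            hi = mid
    return hi


print("ln(x^n / n!) at n = 100:", log_ratio(100))
print("n! first exceeds 1,000,000^n at n =", first_n_where_factorial_wins())
```

By Stirling's approximation the crossover should land near e × 1,000,000, a little under 2.72 million: long after the terms start decreasing at n = 1,000,000, but it does happen.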

In estimating the sum of an infinite series, pray that it's alternating with terms which decrease to zero. Then we can stop summing at any time and the error will be less than the first neglected term. Indeed, the sum of the infinite series will lie between the last two partial sums.

**Example** : Estimate $1 - \frac{1}{2} + \frac{1}{4} - \frac{1}{8} + - + \cdots$

**Solution:** We add a few terms and get: $S_1 = 1$, $S_2 = 1 - \frac{1}{2} = .5$ (so we now know that the infinite series adds to something between .5 and 1) and $S_3 = 1 - \frac{1}{2} + \frac{1}{4} = .75$ (so we know the infinite series adds to something between .5 and .75) and $S_4 = 1 - \frac{1}{2} + \frac{1}{4} - \frac{1}{8} = 0.625$ (so we now know that the infinite series adds to something between .625 and .75) and ...

S: Hold on! That's a geometric series and I know exactly what it adds up to ... it's ... uh, $\frac{a}{1-r} = \frac{1}{1 - \left(-\frac{1}{2}\right)} = \frac{2}{3}$.

P: And $\frac{2}{3}$ does indeed lie between .625 and .75, so aren't you impressed?

S: I'd be more impressed if you didn't have to rely on the series being alternating. I mean, how many times are we going to run across an alternating series? I mean, is Mother Nature so accommodating ...

P: Good question. Let's see what we can do.

Estimating the Sum of "Other" Taylor Series

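Consider the Taylor series for $e^x$, namely $1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{n=0}^{\infty} \frac{x^n}{n!}$. If we use it to compute the number "e", by setting x = 1, we'd get the series $1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \cdots$ and now we don't have an alternating series. However, we can still estimate the sum of the infinite series as follows:

We consider, again, what the RATIO test is telling us. If $a_n = \frac{1}{n!}$, then $\frac{a_{n+1}}{a_n} = \frac{1}{n+1} \le \frac{1}{10}$ when n≥9. That means, after $1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{9!}$, the terms decrease by at least a factor $\frac{1}{10}$ so the entire balance of the infinite series, namely $\frac{1}{10!} + \frac{1}{11!} + \cdots$ is less than $\frac{1}{10!}\left(1 + \frac{1}{10} + \frac{1}{10^2} + \cdots\right)$ which is a convergent geometric series with $a = \frac{1}{10!}$ and $r = \frac{1}{10}$ so it sums to $\frac{a}{1-r} = \frac{1/10!}{1 - 1/10} = \frac{1}{9 \cdot 9!}$. Hence we can see that the sum of the infinite series lies between $1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{9!}$ and $1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{9!} + \frac{1}{9 \cdot 9!}$.

S: Hey! That's practically the same as saying the error is less than the first neglected term. Right? I mean, if I stop adding after $1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{9!}$, then the infinite series adds up to something between this and $1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{9!} + \frac{1}{9 \cdot 9!}$ so the error is less than $\frac{1}{9 \cdot 9!}$ which is ... uh, almost the first neglected term ... which is $\frac{1}{10!}$.

P: The first neglected term is $\frac{1}{10!} = \frac{1}{10 \cdot 9!}$ which is only slightly less than $\frac{1}{9 \cdot 9!}$, so you're quite right. In fact, if you sum a series $a_1 + a_2 + a_3 + \cdots$ and stop with the term $a_{99}$ for example, and if the remaining terms decrease by at least a factor r (according to the RATIO test), then all the remaining terms add up to something less than $a_{100}(1 + r + r^2 + \cdots) = \frac{a_{100}}{1-r}$ which compares favorably with $a_{100}$, the first neglected term ... if r is small. So there you have it! A method of estimating the sum of an infinite series (which satisfies the RATIO test), even if every term is positive. Nice, eh?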
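The geometric-series bound on the tail is easy to try out. This is a minimal Python sketch under the same assumptions as the argument above (the function name `e_bounds` and its parameter `last_n` are ours): sum the series for e up to 1/last_n!, then bound everything that was left out by a geometric series with ratio 1/(last_n + 1).

```python
import math
from typing import Tuple


def e_bounds(last_n: int = 9) -> Tuple[float, float]:
    """Lower and upper bounds on e from 1 + 1/1! + ... + 1/last_n!.

    The neglected tail starts with 1/(last_n + 1)! and each later term is
    smaller by at least the factor r = 1/(last_n + 1), so the tail is less
    than first_neglected / (1 - r).
    """
    partial = sum(1.0 / math.factorial(k) for k in range(last_n + 1))
    first_neglected = 1.0 / math.factorial(last_n + 1)
    r = 1.0 / (last_n + 1)
    tail_bound = first_neglected / (1.0 - r)
    return partial, partial + tail_bound


low, high = e_bounds(9)
print(f"{low:.10f} < e < {high:.10f}   (math.e = {math.e:.10f})")
```

With last_n = 9 the gap between the two bounds is 1/(9·9!), matching the estimate in the text.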

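**Example:** Estimate the value of $\sqrt{47}$ by using five terms of a Taylor series for f(x) = $\sqrt{x}$, about x = 49.

**Solution:** We construct a table of derivatives for f(x) = $\sqrt{x}$, evaluated at x = 49 (a place near x = 47 where we actually _know_ the value of the function and its derivatives!):

| n | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| $f^{(n)}(x)$ | $x^{1/2}$ | $\frac{1}{2}x^{-1/2}$ | $-\frac{1}{4}x^{-3/2}$ | $\frac{3}{8}x^{-5/2}$ | $-\frac{15}{16}x^{-7/2}$ |
| $f^{(n)}(49)$ | 7 | $\frac{1}{14}$ | $-\frac{1}{1372}$ | $\frac{3}{134456}$ | $-\frac{15}{13176688}$ |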

Now we substitute these derivative values into:

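$f(a) + f'(a)\,(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \cdots$ with a = 49. That is, we substitute for **( )** in:

**( )** + **( )** (x - 49) + **( )** $\frac{(x - 49)^2}{2!}$ + **( )** $\frac{(x - 49)^3}{3!}$ + **( )** $\frac{(x - 49)^4}{4!}$ + ... and this gives:

$\sqrt{x} \approx 7 + \frac{1}{14}(x - 49) - \frac{1}{1372}\frac{(x - 49)^2}{2!} + \frac{3}{134456}\frac{(x - 49)^3}{3!} - \frac{15}{13176688}\frac{(x - 49)^4}{4!}$ (where we use only five terms) and since we want to estimate $\sqrt{47}$ we substitute x = 47 and get:

$\sqrt{47} \approx 7 - \frac{2}{14} - \frac{2^2}{2!\,(1372)} - \frac{3 \cdot 2^3}{3!\,(134456)} - \frac{15 \cdot 2^4}{4!\,(13176688)}$. Every term after the first is negative (!#$?*& so we don't have the simplicity of an alternating series for estimating our error), but fortunately we only wanted the five-term estimate and that's

$\sqrt{47}$ ≈ 7 - 0.14285714 - 0.00145773 - 0.00002975 - 0.00000076 = 6.85565462 and although we have no error estimate (without going through the Ritual of the Ratio) we have confidence that this estimate is accurate to perhaps six or seven places of decimals (since the terms are decreasing rapidly).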

S: So, what does Maple say?

P: I'll ask Maple for a bunch of digits, then to evalf the sqrt of 47:

• Digits:=75;

Digits := 75

• evalf(sqrt(47));

6.85565460040104412493587144908484896046064346100132627548510818567851711514

S: Not bad.
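The five-term estimate can also be reproduced in a few lines of Python. This is just a sketch (the function name `sqrt_taylor_about_49` is ours); the derivative values are the ones tabulated for f(x) = √x at x = 49.

```python
import math


def sqrt_taylor_about_49(x: float) -> float:
    """Five-term Taylor estimate of sqrt(x) about x = 49."""
    h = x - 49.0
    # f(49), f'(49), f''(49), f'''(49), f''''(49) for f(x) = sqrt(x)
    derivatives = [7.0, 1.0 / 14.0, -1.0 / 1372.0, 3.0 / 134456.0, -15.0 / 13176688.0]
    return sum(d * h ** n / math.factorial(n) for n, d in enumerate(derivatives))


estimate = sqrt_taylor_about_49(47.0)
print(f"estimate = {estimate:.8f}, sqrt(47) = {math.sqrt(47.0):.8f}, "
      f"difference = {estimate - math.sqrt(47.0):.2e}")
```

The difference should show up in about the eighth decimal place, in line with the "six or seven places" guessed above.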

LECTURE 10

CURVES and PARAMETRIC EQUATIONS

Way back when, we defined a function:

_f(x) is a function if, for each value of x in some "domain", there is a single, unique value of f(x) defined_.

What this means is that the graph of y = f(x) must satisfy the "vertical line test". That is, every vertical line x = c (where c is in the domain) intersects the graph precisely once.

Sad.

How, then, do we describe a curve ===>>

In fact, how do we describe, in terms of "functions", a circle or an ellipse which clearly don't satisfy the vertical line test?

That's what we'll do now:

We describe the location of a point (x,y) via equations such as x = cos t, y = sin (2t). That means, for each value of "t" (called the "parameter"), there is a single, unique value of x and y (because cos t and sin (2t) are functions, hence provide unique values for each t). The diagram indicates the path of the point (x,y) as t increases from t = 0 through intermediate values until, finally, t = 2π and we've completed the curve. Thereafter, for t > 2π, the curve repeats itself.

For this particular curve, described by so-called PARAMETRIC EQUATIONS: x = cos t, y = sin (2t), we can see that -1 ≤ x ≤ 1, -1 ≤ y ≤ 1 so the entire curve lies in a square centred at the origin. We also see that x goes from 1 to 0 to -1 to 0 to 1 and, at the same time, y oscillates between these same values, but twice as rapidly (because y = sin (2t) oscillates twice as rapidly as x = cos t), so that when x goes through one complete cycle, y goes through _two_ cycles. Further, we can easily pick out the maximum and minimum values of x and y, and the values of t where they occur. This gives us enough information to sketch the graph (although the diagram, above, was actually generated by a computer ... just so you could see the precise behaviour).

In general, parametric equations for a curve in the x-y plane are given by: x = f(t), y = g(t) where "f" and "g" are functions of the parameter "t". It's convenient to use the letter "t" as the parameter and to think of it as being the _time_. Then x = f(t), y = g(t) describes the location of a moving point at any time t. It's something like giving the latitude and longitude of an automobile moving along some race track ... or any curve.
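To make "the location of a moving point at any time t" concrete, here is a small Python sketch (the helper `sample_curve` is a made-up name, not anything from the text). It steps the parameter t through a domain and records the point (f(t), g(t)) each time, here for x = cos t, y = sin (2t).

```python
import math
from typing import Callable, List, Tuple


def sample_curve(f: Callable[[float], float],
                 g: Callable[[float], float],
                 t_start: float,
                 t_end: float,
                 n_points: int = 200) -> List[Tuple[float, float]]:
    """Return n_points points (f(t), g(t)) for equally spaced t in [t_start, t_end]."""
    step = (t_end - t_start) / (n_points - 1)
    return [(f(t_start + i * step), g(t_start + i * step)) for i in range(n_points)]


points = sample_curve(math.cos, lambda t: math.sin(2 * t), 0.0, 2 * math.pi)
for x, y in points[::25]:
    print(f"({x:+.3f}, {y:+.3f})")
```

Joining the points in order traces the curve, and every point should fall inside the square -1 ≤ x ≤ 1, -1 ≤ y ≤ 1.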

It's important to be able to construct parametric equations for some simple curves, and we'll do that now:

**Example:** Verify that x = a cos t, y = a sin t are parametric equations for the circle $x^2 + y^2 = a^2$. Then construct parametric equations for the ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$.

**Solution:** If x = a cos t, y = a sin t, then $x^2 + y^2 = (a\cos t)^2 + (a\sin t)^2 = a^2(\cos^2 t + \sin^2 t) = a^2$, as required. For the ellipse we want to find functions x(t) and y(t) so that $\left(\frac{x}{a}\right)^2 + \left(\frac{y}{b}\right)^2 = 1$, but this would be the case if $\frac{x}{a} = \cos t$ and $\frac{y}{b} = \sin t$ and that gives parametric equations for an ellipse: x = a cos t, y = b sin t.

S: Are they the only parametric equations for a circle? And how would I know that x = a cos t, y = a sin t is a circle, if you didn't tell me? And what's the procedure for finding parametric equations for some curve y = f(x), and how would I ...

P: Okay, listen. First, if you're given parametric equations like x = a cos t, y = a sin t and you want to find the equation for this curve as a relation between x and y (without any t's), then you must eliminate the parameter "t". In this case it's pretty easy: $x^2 + y^2 = a^2$: a circle. For x = a cos t, y = b sin t, you'd eliminate "t" by noting that $\cos^2 t + \sin^2 t = 1$ so you'd write $\left(\frac{x}{a}\right)^2 + \left(\frac{y}{b}\right)^2 = \cos^2 t + \sin^2 t = 1$: an ellipse. On the other hand, if you already have y = f(x) and want parametric equations, you can always write x = t, y = f(t). See? If you eliminate t between x = t and y = f(t) it'd give y = f(x). Nice, eh?

S: So for a circle, like maybe $x^2 + y^2 = a^2$, I'd just write x = t and y = ... uh, y = what?

P: If you tried to solve for y, from $t^2 + y^2 = a^2$, you'd get $y = \pm\sqrt{a^2 - t^2}$ and you wouldn't have a function of t because there are two y-values, so that's no good. In fact, what I said was, if you had y = f(x), a function of x, and you wanted parametric equations, then you could let x = t and y = f(t). But you have to start with y = f(x), not something like $x^2 + y^2 = a^2$.

S: Okay, suppose I take $y = \sqrt{a^2 - x^2}$ instead. In case you didn't notice, that's the upper half of the circle, and it is a function. Now I'd let x = t and get $y = \sqrt{a^2 - t^2}$ and, according to you, they'd be parametric equations for this circle, right? But they aren't the same as x = a cos t, y = a sin t! How do you like that!

P: First off, if you let "t" take on any value, then x = a cos t, y = a sin t gives a point which runs around the entire circle, not just the upper half. However, if you restrict t to lie in, say, 0 ≤ t ≤ π, then x = a cos t, y = a sin t do give you parametric equations for the upper half of the circle, and yes, they are different from the parametric equations x = t and $y = \sqrt{a^2 - t^2}$ ... and that's the answer to your first question.

Graph of x = cos t, y = sin t, 0 ≤ t ≤ π

S: Huh?

P: You asked if they were the only parametric equations for a circle, and the answer is NO. There are lots of parametric equations for a curve given by, say, y = f(x): x = t, y = f(t) is just one of them. You could also let $x = t^3$ and $y = f(t^3)$ since eliminating "t" still gives y = f(x).

S: Hey! Let me do one. I'd let ...

P: Wait, let me give you a particular curve. Try to find parametric equations for the parabola $y = x^2$.

S: Okay, I'd let x = t so $y = t^2$. How's that?

P: Great! Give me another.

S: Okay, I could also let $x = t^2$ and then $y = x^2 = t^4$. I mean, $x = t^2$, $y = t^4$ are also parametric equations for $y = x^2$ and I could also let x = ...

P: One minute. When you let $x = t^2$ you're also inadvertently restricting x to be non-negative. After all, regardless of the value of the parameter "t", $x = t^2$ will never give you a negative value of x. In fact, if you let t run from -∞ to +∞, where will the point (x,y) move?

S: I haven't the foggiest. Uh, wait ... the point will move along the parabola $y = x^2$, right?

P: Yes, but it can never get to the left-half of this parabola because that requires a negative x which you can't get from $x = t^2$.

S: A picture is worth ...

P: Okay, here's the picture ===>>>

See? As t runs from -∞ to +∞, the point (x,y) runs down the parabola to the origin (which is reached when t = 0, because then $x = t^2$ and $y = t^4$ are both 0), then runs back along the same path again, back into the first quadrant. That's why I picked $x = t^3$, $y = f(t^3)$ as parametric equations for y = f(x) because now we'd get the whole curve, because as t runs from -∞ to +∞, so does x.

Graph of $x = t^2$, $y = t^4$

S: Okay, let me try x = sin t and $y = x^2 = \sin^2 t$. That's a parabola too, right?

P: Yes, but not all of it because x = sin t can only lie between -1 and +1, so you'd get only that piece of the parabola.

Before you ask, here's a picture ====>>

See? As t varies from -∞ to +∞, x = sin t just moves back and forth from -1 to 1 to -1 to 1 and so on, while $y = \sin^2 t$ moves back and forth from 0 to 1.

Graph of x = sin t, $y = \sin^2 t$

The curve y = f(x) has parametric equations x = t, y = f(t)

When I was young and foolish, I wrote a computer program which would plot the graph of y = f(x). The program asked the user to type in f(x) and the domain of x, and then it plotted y = f(x). The program could not plot circles because there was no way my program would accept $y = \pm\sqrt{a^2 - x^2}$ as the function ... smart program! (It's _not_ a function.) Then I ate my smart pills and changed the program to plot curves given by _parametric_ equations. The user was asked to type in _two_ functions, f(t) and g(t), and the domain of t, and the program plotted x = f(t), y = g(t). NOW I could plot circles (as well as other wonderful curves) ... smart program!

If you wanted to plot y = sin x, you typed in the two functions t and sin t and the program plotted x = t, y = sin t. In general, for y = f(x), you gave the program x = t and y = f(t).

But there was a bonus in this modified program. If you wanted to plot r = f(θ), a POLAR curve, you just remembered that x = r cos θ, y = r sin θ, hence x = f(θ) cos θ and y = f(θ) sin θ, so you just typed in the two functions f(t) cos t and f(t) sin t and the program plotted x = f(t) cos t, y = f(t) sin t. Of course, it didn't matter that the name of the parameter was "t" instead of "θ".

The POLAR curve r = f(θ) has parametric equations x = f(θ) cos θ, y = f(θ) sin θ
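The polar-to-parametric trick is a one-liner in code. The Python sketch below is not the plotting program described above, just an illustration of the same idea; the function name `polar_to_parametric` and the sample curve r = 1 + cos θ are our own choices.

```python
import math
from typing import Callable, Tuple

RealFunction = Callable[[float], float]


def polar_to_parametric(f: RealFunction) -> Tuple[RealFunction, RealFunction]:
    """Given r = f(theta), return the functions x(t) = f(t) cos t and y(t) = f(t) sin t."""
    def x_of_t(t: float) -> float:
        return f(t) * math.cos(t)

    def y_of_t(t: float) -> float:
        return f(t) * math.sin(t)

    return x_of_t, y_of_t


# Hypothetical example: the polar curve r = 1 + cos(theta).
x_of_t, y_of_t = polar_to_parametric(lambda theta: 1.0 + math.cos(theta))
for k in range(5):
    t = k * math.pi / 2
    print(f"t = {t:.3f}: (x, y) = ({x_of_t(t):+.3f}, {y_of_t(t):+.3f})")
```

Any routine that can plot x = f(t), y = g(t) can then plot the polar curve by feeding it these two functions.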

Sketching curves given by parametric equations is sometimes more difficult than curves described by y = f(x) --- and sometimes less difficult. We'll do some examples:

Plotting Parametric Curves

**Example:** Sketch $x = t^2 - 1$, $y = t^3$.

**Solution:** We could make a table of values, like so:

| t | -2 | -1 | 0 | 1 | 2 | 3 | etc. |
|---|---|---|---|---|---|---|---|
| x | 3 | 0 | -1 | 0 | 3 | 8 | etc. |
| y | -8 | -1 | 0 | 1 | 8 | 27 | etc. |

and then plot points, one-by-one (perhaps keeping track of what points go with what t-value by writing the t-value beside the point).

We'd get something like this ===>>>

We might also make tiny sketches of $x = t^2 - 1$ and $y = t^3$ like so:

Then we'd note that as t goes from -∞ to +∞, x begins at +∞ and decreases to -1 (at t = 0) then increases again to +∞. In the meantime, y starts at -∞, increases to 0 (at t = 0) and continues to +∞. The point (x,y) then begins in the 4th quadrant at (+∞,-∞) then moves up and to the left (x decreasing, y increasing) until it hits (-1,0) at t=0, then x increases again as does y, and the curve heads into the 1st quadrant, heading for (+∞,+∞). That's enough to get a sketch of the curve. In fact, it isn't even necessary to plot $x = t^2 - 1$, $y = t^3$. If you can visualize what they look like, that's enough.
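A table of values for this curve takes only a few lines to generate. Here is a minimal Python sketch (the function name `curve_point` is ours, for illustration):

```python
from typing import Tuple


def curve_point(t: float) -> Tuple[float, float]:
    """Point (x, y) on the curve x = t^2 - 1, y = t^3."""
    return t * t - 1, t ** 3


for t in (-2, -1, 0, 1, 2, 3):
    x, y = curve_point(t)
    print(f"t = {t:>2}   x = {x:>2}   y = {y:>3}")
```

Note that y is negative whenever t is negative, which is why the curve starts out in the 4th quadrant.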

Now let's get my computer program to provide an accurate plot (... well, the accuracy is more a function of what my printer can provide):

Surprise! The curve has a cusp at t = 0!! Something we might not have expected, and probably wouldn't have caught had we been satisfied with "sketches" based upon a few points, or even arguments like "x goes West and y goes North, then x goes East and y continues North". What we really need is ...

S: Wait! Don't tell me! We need to ... uh, we need ... uh, what do we need?

P: If you had to sketch y = f(x) ... forget about parametric equations for the moment ... and you wanted to catch any "cusps" where the curve had a sharp point (not a "corner", but a "point") then you'd look for places where $\frac{dy}{dx} = \infty$, but y wasn't. Right?

S: Huh?

P: Let's look at a typical cusp ===>>

See? The derivative becomes infinite, but y is finite. On the other hand, if y becomes infinite as well, you wouldn't have a cusp, you'd have a vertical asymptote.

S: What about if the cusp is horizontal. I mean, what if the point is pointing East or West?

P: Then y = f(x) wouldn't be a function, would it? I just want to consider what happens when a cusp occurs, for a "function" y = f(x). See? To identify a cusp we have to consider the derivative and that brings us to the next step in parametric equations: derivatives!

S: But can't we do some more plotting, or sketching, or whatever?

P: Derivatives will help, just as they did for y = f(x).

Slope of the Tangent Line to a Parametric Curve

Consider (again) $x = t^2 - 1$, $y = t^3$. Note that $\frac{dx}{dt} = 2t$ tells us when x is increasing (when 2t > 0) or decreasing (when 2t < 0). Also, $\frac{dy}{dt} = 3t^2$ is never negative so y never decreases. I say "when x is increasing" because I like to think of the parameter "t" as time. In fact, if x is measured in metres and t in seconds, $\frac{dx}{dt}$ is measured in metres/second, a velocity, and it's the velocity with which the point (x,y) moves left-right (or East-West). On the other hand, $\frac{dy}{dt}$ is the up-down (or North-South) velocity. That's nice. For those of you who are familiar with **vectors**, the velocity vector **V** of the moving point (x,y) has two components, an x-component $\frac{dx}{dt}$, and a y-component $\frac{dy}{dt}$, and we can denote this by writing **V** = $\left[\frac{dx}{dt}, \frac{dy}{dt}\right]$. Since the point moves in the direction of the curve itself (else it wouldn't follow the curve!), the velocity vector must point in the direction of the curve itself. That means that the vector **V** is tangent to the curve at every instant of time "t", hence must point along the tangent line. That means that we can find the slope of this tangent line just by finding the slope of **V** and that means ...

S: Whoa! A picture is worth ...

P: Here's the picture ===>>>

Notice that the vector **V** has a slope of $\frac{dy/dt}{dx/dt} = \frac{dy}{dx}$ and that's how we can find the slope of the tangent line to a curve given parametrically.

The slope of the tangent line to x = f(t), y = g(t) is $\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{g'(t)}{f'(t)}$

**Example:** Determine the slope of the tangent to each of the following at the indicated value of the parameter. In each case, sketch the curve and the velocity vector at the indicated point.

(a) x = cos t, y = sin t at $t = \frac{\pi}{2}$ (b) $x = t^2 - 1$, $y = t^3$ at t = 2.

Solution:

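(a) $\frac{dy}{dx} = \frac{\cos t}{-\sin t} = -\cot t = 0$ at $t = \frac{\pi}{2}$.

Note that, at $t = \frac{\pi}{2}$, $\frac{dx}{dt} = -\sin t = -1$ meaning the point is moving left and $\frac{dy}{dt} = \cos t = 0$ meaning the point is NOT moving up or down. At the point (x,y) = (cos t, sin t) = (0,1), when $t = \frac{\pi}{2}$, the velocity vector is **V** = $\left[\frac{dx}{dt}, \frac{dy}{dt}\right]$ = [-1, 0] as shown.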

(b) $\frac{dy}{dx} = \frac{3t^2}{2t} = \frac{3}{2}t = 3$ at t = 2.

When t = 2, we're at the point $(2^2 - 1, 2^3) = (3, 8)$ and the velocity vector has components $\frac{dx}{dt} = 2t = 4$ (to the right) and $\frac{dy}{dt} = 3t^2 = 12$ (upward) ... as shown, i.e. **V** = [4, 12] when t = 2.
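Both answers can be checked numerically. The sketch below estimates the velocity components with central differences (a choice made for this illustration, with a hypothetical step size `h`), then divides them to get the slope.

```python
import math

def velocity(f, g, t, h=1e-6):
    """Central-difference estimate of V = [dx/dt, dy/dt] at parameter value t."""
    dxdt = (f(t + h) - f(t - h)) / (2 * h)
    dydt = (g(t + h) - g(t - h)) / (2 * h)
    return dxdt, dydt

def slope(f, g, t, h=1e-6):
    """Slope dy/dx = (dy/dt)/(dx/dt); not usable where dx/dt = 0."""
    dxdt, dydt = velocity(f, g, t, h)
    return dydt / dxdt

# (a) the circle at t = pi/2: V is close to [-1, 0], slope close to 0
print(velocity(math.cos, math.sin, math.pi / 2))

# (b) x = t^2 - 1, y = t^3 at t = 2: V is close to [4, 12], slope close to 3
print(velocity(lambda t: t**2 - 1, lambda t: t**3, 2.0))
print(slope(lambda t: t**2 - 1, lambda t: t**3, 2.0))
```

Keeping `velocity` separate from `slope` is deliberate: the two components carry the direction of travel, which the ratio alone throws away.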

S: Hey, what about that cusp we were talking about?

P: Oh, yes ... let's discuss that. For $x = t^2 - 1$, $y = t^3$, we have the slope $\frac{dy}{dx} = \frac{3}{2}t$ as obtained above. That means that, at t = 0, the slope is zero so the point (x,y) is travelling horizontally.

S: Hold on! Since $\frac{dx}{dt} = 2t = 0$, it isn't travelling horizontally at all. In fact, $\frac{dy}{dt} = 3t^2 = 0$ too, so it isn't even travelling up or down. In fact, the point isn't even moving when t = 0! That's amazing ... isn't it?

P: You surprise me ... that's quite clever of you, and you're quite right, the point comes to a momentary halt at t = 0 before it does an about face and continues again in precisely the opposite direction, heading eventually into the first quadrant.

S: The opposite direction?

P: Sure, look at $\frac{dx}{dt} = 2t$. It's negative for t < 0 so the point is moving left, then it's 0 when the point stops moving at t = 0, then it's positive when t > 0 so the point is now moving right. In the meantime, although $\frac{dy}{dt} = 3t^2$ is never negative it is zero momentarily, but then y continues to increase.

S: Hey! If $\frac{dx}{dt} = 0$ and $\frac{dy}{dt} = 0$ then how come the ratio isn't $\frac{0}{0}$? I mean, when you evaluated $\frac{dy}{dx}$ you didn't get $\frac{0}{0}$, you got 0 ... so how come?

P: Another good comment. In fact, I really need to find the limit of $\frac{dy/dt}{dx/dt}$ as $t \to 0$ and that is 0. You see, although $\frac{dx}{dt}$ and $\frac{dy}{dt}$ are both approaching zero, $\frac{dy}{dt}$ is approaching much faster because it's $3t^2$ and $\frac{dx}{dt}$ is only 2t and, when t is small, $3t^2$ is much smaller than 2t. See?

S: No.

P: Then forget it ... but remember this: you get much more information by considering $\frac{dx}{dt}$ and $\frac{dy}{dt}$ separately than by simply dividing to get $\frac{dy}{dx}$. You see, if both $\frac{dx}{dt}$ and $\frac{dy}{dt}$ were negative, their ratio would be positive and you'd get a positive slope alright ... but the point is actually moving down and to the left along the curve, not up and to the right. See?

S: No.

P: Then forget it ... and I take back what I said about your being clever.

**Example:** Calculate each of the following:

(a) The equation of the tangent line to $x = e^t$, y = cos t at the point (1,1).

(b) $\frac{dy}{dx}$ at x = 1 if $x = t^3$, y = arctan t.

Solution:

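(a) If $x = e^t = 1$ then t = 0 and we check to see that y = cos 0 = 1 so t = 0 does place us at the point (1,1). Now $\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{-\sin t}{e^t} = 0$ when t = 0. Hence the tangent line is: $\frac{y - 1}{x - 1} = 0$ or y = 1.

(b) If $x = t^3 = 1$, then t = 1 and y = arctan 1 = $\frac{\pi}{4}$ so we're at the point $(1, \frac{\pi}{4})$. Further, $\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{1/(1+t^2)}{3t^2} = \frac{1}{6}$ so the tangent line (using the _point-slope_ form) is $\frac{y - \pi/4}{x - 1} = \frac{1}{6}$ or $y = \frac{x}{6} - \frac{1}{6} + \frac{\pi}{4}$.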

LECTURE 11

MORE ON PARAMETRIC REPRESENTATION OF CURVES

the Tangent and Position Vectors

For a curve described parametrically, x = f(t), y = g(t), the tangent vector is **V** = $\left[\frac{dx}{dt}, \frac{dy}{dt}\right] = [f'(t), g'(t)]$. If we think of "t" as being time and "x" and "y" as being distances, then **V** can be interpreted as the velocity vector for a point moving along this curve. In fact, the speed with which the point (x,y) moves can now be obtained: it's the magnitude (or length) of the vector **V**, namely $\sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}$. Now that we're into vectors, we can also describe the location of the point (x,y) as a "position vector": **R** = $[x, y]$ and can think of **R** as a vector which begins at (0,0) and extends to the point (x,y). The length of **R** is, of course, $\sqrt{x^2 + y^2}$. We recognize $\sqrt{x^2 + y^2}$ as the distance from (x,y) to the origin (what else?).

Let's make all this more prominent:

For a point (x,y) moving along the curve: x = f(t), y = g(t), we have:

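**R** = $[x, y]$ = the position vector, **V** = $\left[\frac{dx}{dt}, \frac{dy}{dt}\right]$ = velocity vector, $\sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}$ = the speed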

In the diagram we illustrate a point located at (x(t),y(t)), at time t, and its position vector **R** = $[x, y]$ and velocity vector **V** = $\left[\frac{dx}{dt}, \frac{dy}{dt}\right]$ which points along the tangent line, in the direction of motion of the point. If, by the phrase "the derivative of a vector **R**" we mean another vector whose components are the derivatives of the components of **R**, then we can write: $\mathbf{V} = \frac{d\mathbf{R}}{dt}$ which is very nice indeed. In fact, we can actually differentiate **V** to get the acceleration, $\mathbf{A} = \frac{d\mathbf{V}}{dt}$. In fact, we can do all this in 3 dimensions too, defining a point (x,y,z) in 3-dimensional Cartesian (or rectangular) coordinates which moves along a curve given parametrically by x = f(t), y = g(t), z = h(t), then introducing the point's position vector **R** = $[x, y, z]$ and its velocity vector **V** = $\left[\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right]$ and even its acceleration vector $\mathbf{A} = \frac{d\mathbf{V}}{dt} = \frac{d^2\mathbf{R}}{dt^2} = \left[\frac{d^2x}{dt^2}, \frac{d^2y}{dt^2}, \frac{d^2z}{dt^2}\right]$ and its distance from the origin: the length of **R**, namely $\sqrt{x^2 + y^2 + z^2}$, and its speed, the length of **V**, namely $\sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2}$ and so on.

In fact, it's now clear that the expression for the "slope of the tangent line" we obtained earlier for curves in the x-y plane, namely $\frac{dy}{dx} = \frac{dy/dt}{dx/dt}$, is really not something that easily generalizes to 3 dimensions ... and that's one reason for keeping your eye on each component $\frac{dx}{dt}$ and $\frac{dy}{dt}$ rather than on their ratio. Indeed, it's not even clear what we would mean by the phrase "slope of the tangent line" for curves in 3 dimensions. (We'll discuss this very problem in later lectures.) What is clear, is that these expressions are just as easily written for 10 dimensions! A point with rectangular coordinates $(x_1, x_2, \ldots, x_{10})$ has position vector **R** = $[x_1, x_2, \ldots, x_{10}]$ and if $x_1, x_2, \ldots, x_{10}$ change with time "t" ... describing a curve in 10-dimensional space, given "parametrically" by, say $x_1 = f_1(t)$, $x_2 = f_2(t)$, ..., $x_{10} = f_{10}(t)$, then it has velocity **V** = $\left[\frac{dx_1}{dt}, \frac{dx_2}{dt}, \ldots, \frac{dx_{10}}{dt}\right]$ which is a vector which points in a direction tangent to the curve, and the speed would be the length of **V**, namely $\sqrt{\left(\frac{dx_1}{dt}\right)^2 + \cdots + \left(\frac{dx_{10}}{dt}\right)^2}$ ... well, you get the idea, so let's get back to curves in the x-y plane.

S: Do you remember way back when ... in the previous course ... you were talking about a car moving along a highway and you said you couldn't complete the problem until you had parametric equations to work with ... remember?

P: Yes, vaguely ... let's do that now. It's in your notes. Tell me about it.

S: A car moves south-west along a highway described by the curve $y = x^2$ (where the x-axis points east-west and the y-axis north-south). Its headlights illuminate a fence which lies along the x-axis. Investigate the speed with which the point of light moves along the fence. And you said:

The headlight beam is tangent to the curve $y = x^2$ and will strike the fence (i.e. will intersect the x-axis) at the x-intercept of the tangent line. So we'll pick some point of the curve then we'll find the equation of the tangent line at this point, then we'll find the x-intercept, then we'll find $\frac{d}{dt}$ of this x-intercept. Let the point on the curve be $(x_1, y_1)$ where $y_1 = x_1^2$. Then the tangent line equation is $\frac{y - y_1}{x - x_1}$ = slope of tangent line = $2x_1$ (since $\frac{dy}{dx} = 2x$ if $y = x^2$). The tangent line intersects the x-axis when y = 0 so we solve for $x = x_1 - \frac{y_1}{2x_1}$. Plug in $y_1 = x_1^2$ and get the x-intercept as $x = \frac{x_1}{2}$ and if we take $\frac{d}{dt}$ of this we get $\frac{dx}{dt} = \frac{1}{2}\frac{dx_1}{dt}$ so the speed with which the light travels along the fence is always half the speed with which the car moves west. Then I said "Suppose the car is moving at 100 km/hour? Then how fast is the light moving?" So? How fast?

P: Okay, we need to parameterize the equation $y = x^2$. Although I had earlier used $x_1$ and $y_1$ to represent the point on the curve, I hate to get things cluttered with those subscripts, so now we'll use x and y for the point on the curve. Okay, for parametric equations we could say x = t, $y = t^2$ but if x and y are measured in km and t in hours, then $\frac{dx}{dt}$ = 1 km/hour and that doesn't change and there's nothing we can do about that. We clearly need a parametric description, x = f(t), y = g(t), so that the speed is constant at 100 km/hour. And what's the expression for the speed?

S: Uh ... $\sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}$.

P: Right! So we need to find functions x(t) and y(t) ... can I call them that? Thanks. I hate to use f(t) and g(t). I'd much prefer ...

S: Don't digress.

P: Okay, we need to find two functions, x(t) and y(t), and these two functions must be such that

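(1) $\sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}$ = 100 km/hour, and

(2) the point (x(t),y(t)) is on the curve $y = x^2$.

To do this, we use (2): $y = x^2$, so that $\frac{dy}{dt} = \frac{dy}{dx}\frac{dx}{dt} = 2x\frac{dx}{dt}$ so we substitute into (1) to get

$\sqrt{\left(\frac{dx}{dt}\right)^2 + 4x^2\left(\frac{dx}{dt}\right)^2} = 100$ or (factoring out the $\left(\frac{dx}{dt}\right)^2$ and taking its root) $\sqrt{1 + 4x^2}\,\frac{dx}{dt} = 100$ so that the function x(t) must satisfy this differential equation. After solving it, we can find y(t) since it's just the square of x(t). Do you remember how to solve such a DE?

S: No problem. It's separable so I just collect everything involving x together with the dx and stick everything else on the other side ... uh, that gives me $\sqrt{1 + 4x^2}\,dx = 100\,dt$ so then I integrate and get $\int\sqrt{1 + 4x^2}\,dx = 100t + C$ and ... uh, can I do that one?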

P: No, but I'll let you look it up in a table of integrals. Here's one.

= + a arcsin = + arcsin

= ± _ln_ = - _ln_ | x +

S: Okay ... uh, it's not there! What kind of table ... ?

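P: Here it is: $\int\sqrt{x^2 \pm a^2}\,dx = \frac{x}{2}\sqrt{x^2 \pm a^2} \pm \frac{a^2}{2}\ln\left|x + \sqrt{x^2 \pm a^2}\right|$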

S: Are you kidding? That's not the same integral! I mean ...

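P: Pay attention. Your integral is $\int\sqrt{1 + 4x^2}\,dx$ and you can easily make it look like the tabulated integral by letting u = 2x so $du = \frac{du}{dx}\,dx = 2\,dx$ then you'd get $\int\sqrt{1 + 4x^2}\,dx = \frac{1}{2}\int\sqrt{u^2 + 1}\,du = \frac{1}{2}\left[\frac{u}{2}\sqrt{u^2 + 1} + \frac{1}{2}\ln\left|u + \sqrt{u^2 + 1}\right|\right]$ where I used the integral in the tables with a = 1 and the + sign and, of course, calling the variable "u" instead of "x", so I'd get $\int\sqrt{1 + 4x^2}\,dx = \frac{x}{2}\sqrt{1 + 4x^2} + \frac{1}{4}\ln\left|2x + \sqrt{1 + 4x^2}\right|$ replacing u by 2x. Okay, carry on.

S: Huh? Oh, yeah ... I just solved the DE and it's $\frac{x}{2}\sqrt{1 + 4x^2} + \frac{1}{4}\ln\left|2x + \sqrt{1 + 4x^2}\right| = 100t + C$. Wow! Is that the curve? I mean, wow!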

P: No, that's not the curve. The curve is $y = x^2$. What you've found is how x must change with time in order that the speed, along this parabolic highway, is constant at 100 km/hour. Since we only asked for how rapidly the headlight beam is moving along the fence, and since we know it moves at $\frac{1}{2}\frac{dx}{dt}$ and we know that $\sqrt{1 + 4x^2}\,\frac{dx}{dt} = 100$, then the speed along the fence is $\frac{50}{\sqrt{1 + 4x^2}}$ and we have the speed of the headlight.

S: We do?

P: Sure! It changes with time, of course, but if you want to know the speed when x = 1 km, it's $\frac{50}{\sqrt{5}}$ km/hour, and so on. If you know the position of the car, meaning you know x, then you know the speed of the beam along the fence. See? If you want to know the speed at some particular time, you'd plug t into $\frac{x}{2}\sqrt{1 + 4x^2} + \frac{1}{4}\ln\left|2x + \sqrt{1 + 4x^2}\right| = 100t + C$ and solve for x, but then you'd first have to know the value of C and that means you'd have to give me one piece of information about the location of the car at some time, say at t = 0, so I could find C. See?

S: I wish I hadn't brought the subject up.

P: Do you see anything unusual about the speed of the beam: $\frac{50}{\sqrt{1 + 4x^2}}$?

S: When x = 0 the car has reached the fence and the beam speed is ... uh, 50 which is half the car speed, just like it should be. Good, eh?

P: Yes, good, but do you see anything else ... unusual ... about this beam speed?

S: Nope.

P: It's positive, meaning ...

S: It's moving to the right, right?

P: Right! And since that's not possible, we must ... uh, you must have made a mistake. Where's your mistake?

S: Huh?

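P: You had $\sqrt{\left(\frac{dx}{dt}\right)^2 + 4x^2\left(\frac{dx}{dt}\right)^2} = 100$ then you factored the $\left(\frac{dx}{dt}\right)^2$, took the root and got $\sqrt{1 + 4x^2}\,\frac{dx}{dt} = 100$ when you should have taken the root of $\left(\frac{dx}{dt}\right)^2$ as $-\frac{dx}{dt}$ giving $-\sqrt{1 + 4x^2}\,\frac{dx}{dt} = 100$ meaning $\frac{dx}{dt} < 0$ so the car was moving West. See? Then you'd have $\frac{1}{2}\frac{dx}{dt} < 0$ too. See?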

S: I don't remember doing that. Hey! YOU did that!

P: Hmmm ... let's go on.
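For the skeptical, here is a small numerical check, offered only as a sketch. It assumes the antiderivative is $F(x) = \frac{x}{2}\sqrt{1+4x^2} + \frac{1}{4}\ln|2x + \sqrt{1+4x^2}|$ as worked out in the dialogue, and it reports the beam speed as a magnitude (the sign, as just noted, depends on the car heading West). The names `F` and `beam_speed` are invented for this illustration.

```python
import math

def F(x):
    """Antiderivative of sqrt(1 + 4x^2), from the table with u = 2x and a = 1."""
    root = math.sqrt(1 + 4 * x * x)
    return x * root / 2 + math.log(abs(2 * x + root)) / 4

def beam_speed(x, car_speed=100.0):
    """Magnitude of the beam's speed along the fence when the car is at (x, x^2)."""
    return (car_speed / 2) / math.sqrt(1 + 4 * x * x)

# F'(x) should reproduce the integrand sqrt(1 + 4x^2); test it at x = 1
h = 1e-5
derivative_at_1 = (F(1 + h) - F(1 - h)) / (2 * h)
print(derivative_at_1, math.sqrt(5))

# beam speed at x = 0 (half the car speed) and at x = 1 km
print(beam_speed(0.0), beam_speed(1.0))
```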

Length of a Curve

P: Now that we've returned to the car on the highway, we can comment that the speedometer of the car measures ... what?

S: The speed, and that's $\sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}$.

P: And the odometer? What does that measure?

S: The mileage. I mean, how far you've gone ... the distance.

P: And are the speed and distance travelled ... are they related?

S: Uh ... if s is the distance, then $v = \frac{ds}{dt}$, right?

P: Right! The distance travelled is s, shown on the odometer. The speed is v, shown on the speedometer, and $v = \frac{ds}{dt}$ so the speedometer reading is actually the derivative of the odometer reading.

S: Wow! My car does that?

P: Sure, every car has taken this course and ...

S: Don't digress.

P: Okay, pay attention:

If "s" measures the distance travelled along some curve, given parametrically by x = x(t), y = y(t), and v is the speed, namely the length of the vector **V** = $\left[\frac{dx}{dt}, \frac{dy}{dt}\right]$, then $v = \frac{ds}{dt} = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}$. This actually allows us to compute the distance travelled if x(t) and y(t) are known (meaning the position, at any time t), since we'd know $\frac{ds}{dt}$ and we could then integrate to find s.

**Example:** A point moves according to x(t) = 10 cos t, y(t) =10 sin t, where "x" and "y" are distances, in km, and "t" is the time, in hours. How far does it move during the time interval from t = 0 to t = 5?

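**Solution:** $\frac{ds}{dt} = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2} = \sqrt{100\sin^2 t + 100\cos^2 t} = \sqrt{100} = 10$ km/hour which gives a simple DE for s: solving $\frac{ds}{dt} = 10$ yields: s(t) = 10t + C. But s = 0 when t = 0 so 0 = 0 + C, hence C = 0 and s = 10 t km, so after 5 hours, the point moves 50 km ... OR, we could have written, simply, $s = \int_0^5 \frac{ds}{dt}\,dt = \int_0^5 10\,dt = 50$ km.

**Note:** Since $\frac{ds}{dt} = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}$, we can find the length of the curve, from t = a to t = b, from

$s = \int_a^b \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt$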
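When the integral can't be done by hand, the same formula can be handed to a computer. The sketch below uses the midpoint rule (an arbitrary choice made here, not something these notes prescribe) and a made-up function name `arc_length`; it is tried out on the circular motion from the example above.

```python
import math

def arc_length(dxdt, dydt, a, b, n=1000):
    """Approximate the integral of sqrt((dx/dt)^2 + (dy/dt)^2) from a to b (midpoint rule)."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h          # midpoint of the i-th subinterval
        total += math.hypot(dxdt(t), dydt(t))
    return total * h

# x = 10 cos t, y = 10 sin t, so dx/dt = -10 sin t and dy/dt = 10 cos t
s = arc_length(lambda t: -10 * math.sin(t), lambda t: 10 * math.cos(t), 0.0, 5.0)
print(s)   # compare with the 50 km found above
```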

**Example:** A point moves according to x(t) = t, y(t) = , where "x" and "y" are in metres, and "t" is in seconds. How far does it move during the first second?

**Solution:** s = = =

= = = = - = metres.

S: Hey! That's pretty tricky! I mean, you rearrange and get a perfect square and take the square root and ...

P: Calculus instructors stay awake nights thinking of such examples. Pay attention.

LECTURE 12

SOME APPLICATIONS

Polar Curves, revisited

We saw that the equation of a polar curve, $r = f(\theta)$, can be put into parametric form by writing

$x = f(\theta)\cos\theta$, $y = f(\theta)\sin\theta$ where the parameter is in fact the angle $\theta$. Because of this we can use parametric techniques for finding $\frac{dy}{dx}$, the slope of the tangent line to a polar curve and the length "s" of a polar curve.

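Recall the magic formulas for a parametric curve given by: x = x(t), y = y(t):

$\frac{dy}{dx} = \frac{dy/dt}{dx/dt}$ and $s = \int_a^b \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt$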

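For the slope of the tangent to our POLAR curve $x = f(\theta)\cos\theta$, $y = f(\theta)\sin\theta$ (where the parameter is called $\theta$, not t) we use:

$\frac{dy}{dx} = \frac{dy/d\theta}{dx/d\theta} = \frac{f(\theta)\cos\theta + f'(\theta)\sin\theta}{-f(\theta)\sin\theta + f'(\theta)\cos\theta}$ which can also be written $\frac{dy}{dx} = \frac{r\cos\theta + \frac{dr}{d\theta}\sin\theta}{-r\sin\theta + \frac{dr}{d\theta}\cos\theta}$

(putting $f(\theta) = r$ and $f'(\theta) = \frac{dr}{d\theta}$).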

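For the length of a POLAR curve we first compute $\left(\frac{dx}{d\theta}\right)^2 + \left(\frac{dy}{d\theta}\right)^2$. Putting

$\frac{dx}{d\theta} = f(\theta)(-\sin\theta) + f'(\theta)\cos\theta$ and $\frac{dy}{d\theta} = f(\theta)\cos\theta + f'(\theta)\sin\theta$ and squaring and adding we note (surprise!) that there is some cancellation and we can use $\cos^2\theta + \sin^2\theta = 1$ and we get, simply $\left(\frac{dx}{d\theta}\right)^2 + \left(\frac{dy}{d\theta}\right)^2 = f(\theta)^2 + f'(\theta)^2$ so the length of the polar curve, from $\theta = \alpha$ to $\theta = \beta$ is given by:

$s = \int_\alpha^\beta \sqrt{f(\theta)^2 + f'(\theta)^2}\,d\theta$ or $s = \int_\alpha^\beta \sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\,d\theta$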

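**Example:** For the polar curve $r = \sin(2\theta)$, determine the slope at $\theta = \frac{\pi}{4}$. Also, express as a definite integral the length of the curve from $\theta = 0$ to $\theta = \frac{\pi}{2}$.

**Solution:** We have $r = \sin(2\theta)$ so $x = r\cos\theta = \sin(2\theta)\cos\theta$ and $y = r\sin\theta = \sin(2\theta)\sin\theta$, hence

$\frac{dx}{d\theta} = -\sin(2\theta)\sin\theta + 2\cos(2\theta)\cos\theta$ and $\frac{dy}{d\theta} = \sin(2\theta)\cos\theta + 2\cos(2\theta)\sin\theta$ hence the slope is

$\frac{dy}{dx} = \frac{dy/d\theta}{dx/d\theta} = \frac{\sin(2\theta)\cos\theta + 2\cos(2\theta)\sin\theta}{-\sin(2\theta)\sin\theta + 2\cos(2\theta)\cos\theta} = \frac{\cos(\pi/4)}{-\sin(\pi/4)} = -1$ when $\theta = \frac{\pi}{4}$.

Further, $s = \int_0^{\pi/2}\sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\,d\theta = \int_0^{\pi/2}\sqrt{\sin^2(2\theta) + 4\cos^2(2\theta)}\,d\theta$ is the required length of the curve (also called the "arc length").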

S: So why don't you evaluate the integral?

P: I can't ... at least not in terms of known functions.

S: So what would you do if you really wanted an answer, I mean, an actual number?

P: I'd plot a graph of $\sqrt{\sin^2(2\theta) + 4\cos^2(2\theta)}$ and calculate the area under the curve from $\theta = 0$ to $\theta = \frac{\pi}{2}$. For that I could use rectangles ... a Riemann SUM, remember those? Or I could use other methods which are more efficient. People have spent a lifetime finding clever ways of evaluating areas or definite integrals, approximately, to any desired degree of accuracy.

S: For example?

P: Well, suppose I wanted the area under y = f(x) from x = a to x = b. I could pick a bunch of points from "a" to "b", evaluate f(x) at each, join all the points on the curve and find the area under this approximation to the curve.

S: Picture?

P: Okay, here's a picture ===>>>

I've divided the interval into 7 subintervals; I'll let h be the width of each. I then calculate the y-values for the 8 points on the curve; I'll call them $y_1, y_2, \ldots, y_8$. The area under the zig-zag approximation is the sum of a bunch of trapezoids and I have a formula for that, namely: (width) (average height) so the trapezoids would have a total area of

$h\,\frac{y_1 + y_2}{2} + h\,\frac{y_2 + y_3}{2} + h\,\frac{y_3 + y_4}{2} + \cdots + h\,\frac{y_7 + y_8}{2}$ or simply $h\left(\frac{y_1}{2} + y_2 + y_3 + \cdots + y_7 + \frac{y_8}{2}\right)$

and it's clear that I could get better and better approximations by choosing more and more points. See? It's called the TRAPEZOIDAL RULE.

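S: Let's see you do it for $\int_0^{\pi/2}\sqrt{\sin^2(2\theta) + 4\cos^2(2\theta)}\,d\theta$

P: For n subintervals, if I call the sum of the trapezoidal areas A(n), then $\int_0^{\pi/2}\sqrt{\sin^2(2\theta) + 4\cos^2(2\theta)}\,d\theta \approx A(n)$ gives the following: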

A(3) = 2.432509699

A(5) = 2.422619019

A(7) = 2.422145164

A(10) = 2.422111352

A(20) = 2.422112055

A(30) = 2.422112055

See? Even a half-dozen trapezoids will give 3 decimal places. Good, eh?
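The trapezoidal sums are easy to reproduce. The following is a sketch only: it assumes the integrand is $\sqrt{\sin^2(2\theta) + 4\cos^2(2\theta)}$ and that the upper limit is $\theta = \pi/2$, which is the choice consistent with the tabulated values. The function names are invented for this illustration.

```python
import math

def trapezoid(f, a, b, n):
    """Trapezoidal rule with n equal subintervals: h*(y_first/2 + interior y's + y_last/2)."""
    h = (b - a) / n
    total = (f(a) + f(b)) / 2
    for i in range(1, n):
        total += f(a + i * h)
    return h * total

def integrand(theta):
    """sqrt(r^2 + (dr/dtheta)^2) for r = sin(2 theta)."""
    return math.sqrt(math.sin(2 * theta) ** 2 + 4 * math.cos(2 * theta) ** 2)

def A(n):
    return trapezoid(integrand, 0.0, math.pi / 2, n)

for n in (3, 5, 7, 10, 20, 30):
    print(n, A(n))   # compare with the table of A(n) values above
```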

the CYCLOID

Although we introduced parametric equations in order to describe a curve which doesn't satisfy the vertical line test (such as a circle) it is often the case that curves are more easily defined in terms of parametric equations than in the form y = f(x), even in cases where the curve does satisfy the vertical line test (meaning that the curve does represent a function). An example of this is the CYCLOID.

Consider a flashlight attached to the rim of a bicycle wheel. Turn out the lights, roll the wheel and watch the moving light trace out a curve: the cycloid. We'll find the equation of this curve:

We begin with the flashlight located at the bottom of the wheel which we take to be our origin (x=0, y=0), and we roll the wheel (assumed to have a radius R) along the positive x-axis. After having turned through an angle $\theta$ the flashlight is located at some point P(x,y) as shown and we can read off the value of x and y from the diagram, in terms of $\theta$.

$x = OA - PQ = \text{arclength} - R\sin(\pi - \theta) = R\theta - R\sin\theta = R(\theta - \sin\theta)$

$y = R + R\cos(\pi - \theta) = R - R\cos\theta = R(1 - \cos\theta)$

Here we've used $\sin(\pi - \theta) = \sin\pi\cos\theta - \cos\pi\sin\theta = \sin\theta$ and $\cos(\pi - \theta) = \cos\pi\cos\theta + \sin\pi\sin\theta = -\cos\theta$ as well as the fact that the distance OA is the same as the length of the arc of the circle from A to P which, of course, is $R\theta$ ($\theta$ in RADIANS!!!)

S: How did you know that? I mean, the length of the arc ...

P: Well, just trust me.

S: Are you kidding?

P: Okay, it's because I've assumed the wheel rolls without slipping on the ground ... uh, the x-axis. You see, if it slips then it could rotate without the wheel even moving to the right ... it just slips as though the x-axis were a sheet of ice, and the angle $\theta$ could be very large because the wheel is rotating but it isn't going anywhere so ...

S: Are you kidding? That's a proof?

P: Well, no, but if it rolls without slipping then ... uh, the arclength is the same as OA.

S: A snow job, that's what you're giving me. I'm just supposed to swallow that and ...

P: Look, suppose we wrapped a tape around the circumference of the wheel ... sticky tape ... so it stuck to the ground and unwound from the wheel as it rotated. Then, after rolling to the position shown in the diagram the ground from O to A would have this tape and the arc of the wheel from A to P wouldn't. See? It's the same length of tape. See?

S: No. Besides, what good are cycloids, or are they just playthings for mathematicians to illustrate some parametric stuff.

P: Have I told you the story of the Brachistochrone?

S: Please don't.

P: In 1696, Johann Bernoulli posed the following problem to the mathematicians of the world. Two points P and Q are joined by a wire and a bead is allowed to slide along the wire from P to Q without friction. (Imagine the bead as having a small hole in it so it can slide without falling off.) Now the problem: what is the shape of the wire if the time taken is to be a minimum? Newton solved the problem in a day or so, and so did Leibniz, l'Hôpital, Johann himself and his older brother Jakob. The shape is called the "brachistochrone". Guess what it is?

S: Golly gee ... I'd say a CYCLOID.

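P: Here's something interesting: The parametric equations for the CYCLOID, $x = R(\theta - \sin\theta)$, $y = R(1 - \cos\theta)$ can, in fact, be solved to obtain a relation directly between x and y since $\cos\theta = 1 - \frac{y}{R}$ means $\theta = \arccos\left(1 - \frac{y}{R}\right)$ ... if only we had studied the inverse cosine, which we haven't ... so $x = R(\theta - \sin\theta) = R\left(\arccos\left(1 - \frac{y}{R}\right) - \sin\left(\arccos\left(1 - \frac{y}{R}\right)\right)\right)$ but we can simplify $\sin\left(\arccos\left(1 - \frac{y}{R}\right)\right)$ which is just $\sin\theta$ because if $\cos\theta = 1 - \frac{y}{R}$ then $\sin\theta = \sqrt{1 - \cos^2\theta} = \sqrt{1 - \left(1 - \frac{y}{R}\right)^2}$ (taking the positive root, which is right for the first half of the arch, $0 \le \theta \le \pi$) so we'd get $x = R\left(\arccos\left(1 - \frac{y}{R}\right) - \sqrt{1 - \left(1 - \frac{y}{R}\right)^2}\right)$ and we could check a few points like putting y = 0 and getting x = R (arccos (1) - 0) = R (0) = 0 which certainly checks and we could put y = 2R (the top of the cycloid) and get x = R (arccos (-1) - 0) = R (π - 0) = Rπ which also checks (it's the length of the circular arc = half the circumference of the circle) and we could also check it dimensionally because if x and R are measured in metres then the right-side should be, too, and it is, because $\frac{y}{R}$ is dimensionless, and so is the angle $\arccos\left(1 - \frac{y}{R}\right)$. See?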

S: zzzzz
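While S sleeps, the x-y relation can be checked against the parametric equations numerically. This is a sketch under the assumption that the relation reads $x = R\left(\arccos(1 - y/R) - \sqrt{1 - (1 - y/R)^2}\right)$ and is applied only on the first half of the arch ($0 \le \theta \le \pi$). The names `cycloid` and `x_from_y` are made up here.

```python
import math

def cycloid(theta, R=1.0):
    """Point on the cycloid after the wheel has turned through theta radians."""
    return R * (theta - math.sin(theta)), R * (1 - math.cos(theta))

def x_from_y(y, R=1.0):
    """x in terms of y, valid on the first half-arch (0 <= theta <= pi)."""
    u = max(-1.0, min(1.0, 1 - y / R))   # clamp against rounding just outside [-1, 1]
    return R * (math.acos(u) - math.sqrt(1 - u * u))

# compare the two descriptions at 51 angles between 0 and pi, for a wheel of radius 2
worst = 0.0
for k in range(51):
    theta = math.pi * k / 50
    x, y = cycloid(theta, R=2.0)
    worst = max(worst, abs(x_from_y(y, R=2.0) - x))
print(worst)   # should be tiny (rounding error only)
```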

the Straight Line

It may seem strange that we choose to find parametric equations for a straight line, but we've already seen that parametric equations for _curves_ are just as easy in 3 dimensions or 10 dimensions as in 2 dimensions, so we should be able to deduce parametric equations for 3-D lines by studying, carefully, those for 2-D lines.

We begin with the equation $\frac{y - y_0}{x - x_0} = m$, the _point-slope_ form of a straight line (in the 2-dimensional x-y plane). To get parametric equations we could put x = t and $y = y_0 + m(x - x_0) = y_0 + m(t - x_0)$ and be finished with it. However, it's not so easy to see what this would become in, say, 3 dimensions and one of the nice things about parametric equations is that they don't put any undue strain on x, like "you're the independent variable, remember that!" In fact, writing $x = R(\theta - \sin\theta)$, $y = R(1 - \cos\theta)$ it's clear that x and y share the spotlight ... one is not more or less important than the other ... the equations have a nice symmetry without leaning too heavily on one or the other variable. It's precisely this equality of expression which makes it easy to add more dimensions, so we search for another parametric representation of a straight line which preserves this equality and lack of prejudice, trying to avoid anything which identifies either x or y as being "special". In fact, it's this symmetry and lack of bias that makes $x^2 + y^2 = a^2$ such a nice equation for a circle, as opposed to $y = \pm\sqrt{a^2 - x^2}$.

For our straight line we can keep the point $(x_0, y_0)$ since it doesn't put special emphasis on either $x_0$ or $y_0$, but we have to forget about the slope "m" since it's the derivative of y with respect to x, or $\frac{dy}{dx}$ and why should _the change in y_ be in the attic and _the change in x_ in the basement?

Nevertheless, we do want something which indicates the direction of our line, so we choose two angles like α and β in the diagram. They are the angles that the line makes with the positive x- and y-axes.

Now pick any point (x,y) on the line through $(x_0, y_0)$ and let "s" be its distance in the (α,β) direction. The number "s" will be our parameter! That's nice because as the parameter varies and our point moves back and forth along the line, the parameter value actually gives some useful information; it's the distance from $(x_0, y_0)$ to (x,y).

We can identify two right-angled triangles and read off the relationship between x, y, α, β and s:

$x - x_0 = s\cos\alpha$ gives $x = x_0 + s\cos\alpha$

$y - y_0 = s\cos\beta$ gives $y = y_0 + s\cos\beta$

and we have "parametric equations" for our straight line through $(x_0, y_0)$ in a direction determined by angles (α,β):

The line through $(x_0, y_0)$ which makes angles α and β with the positive x- and y-directions is

$x = x_0 + s\cos\alpha$, $y = y_0 + s\cos\beta$

We make several observations:

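• $\alpha + \beta = \frac{\pi}{2}$ seems pretty obvious, so $\cos\beta = \sin\alpha$.

• The slope of the line is $\frac{dy}{dx} = \frac{dy/ds}{dx/ds} = \frac{\cos\beta}{\cos\alpha} = \frac{\sin\alpha}{\cos\alpha}$ so the slope is $\tan\alpha$ ... which is no surprise!

• A line in 3 dimensions, through $(x_0, y_0, z_0)$, making angles α, β and γ with the positive x-, y- and z-axes is: $x = x_0 + s\cos\alpha$, $y = y_0 + s\cos\beta$, $z = z_0 + s\cos\gamma$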

Example:

Obtain parametric equations for the line through (1,4) which makes an angle of $\frac{\pi}{3}$ with the positive x-axis.

Solution:

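Since $\alpha = \frac{\pi}{3}$, then $\beta = \frac{\pi}{2} - \alpha = \frac{\pi}{6}$ so the equations are: $x = 1 + s\cos\frac{\pi}{3} = 1 + \frac{1}{2}s$, $y = 4 + s\cos\frac{\pi}{6} = 4 + \frac{\sqrt{3}}{2}s$.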

Example:

From the point (1,4), how far is it to the curve $y = x^2$ in a direction given by $\alpha = \frac{\pi}{3}$?

**Solution:** Parametric equations are $x = 1 + \frac{1}{2}s$, $y = 4 + \frac{\sqrt{3}}{2}s$ and since "s" measures distance from (1,4) in the (α,β) direction, we just substitute into $y = x^2$ to find "s": $4 + \frac{\sqrt{3}}{2}s = \left(1 + \frac{1}{2}s\right)^2$ which simplifies to

$s^2 + (4 - 2\sqrt{3})s - 12 = 0$ so $s = \frac{-(4 - 2\sqrt{3}) \pm \sqrt{(4 - 2\sqrt{3})^2 + 48}}{2}$ using the magic formula for roots of a quadratic equation. Hence s = - 3.74 and s = 3.21

S: Hey! How can a distance be negative?

P: A picture is worth ... you know. Here's the picture === >>>

See? The line through (1,4) which makes an angle of $\frac{\pi}{3}$ with the +ve x-axis actually intersects the parabola $y = x^2$ twice! Once in the α-β direction and once in the opposite direction ... and you know what that means ...

S: I do?

P: It means that "s" is negative. A positive s-value means the distance in the α-β direction is positive but a negative s-value means ...

S: Yeah, opposite to the α-β direction.
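Both intersections can be found numerically. The sketch below assumes the direction angle is $\alpha = \pi/3$ (the value consistent with the quadratic $s^2 + (4 - 2\sqrt{3})s - 12 = 0$) and solves for s before the equation is tidied up, so the coefficients differ by a factor of 4 but the roots are the same.

```python
import math

alpha = math.pi / 3
beta = math.pi / 2 - alpha
a, b = math.cos(alpha), math.cos(beta)   # x = 1 + a*s, y = 4 + b*s

# Substituting into y = x^2:  4 + b*s = (1 + a*s)^2,
# i.e.  a^2 s^2 + (2a - b) s - 3 = 0.
A, B, C = a * a, 2 * a - b, -3.0
disc = math.sqrt(B * B - 4 * A * C)
roots = sorted([(-B - disc) / (2 * A), (-B + disc) / (2 * A)])

for s in roots:
    x, y = 1 + a * s, 4 + b * s
    # signed distance s, the point reached, and a check that it lies on y = x^2
    print(round(s, 2), (round(x, 3), round(y, 3)), abs(y - x * x) < 1e-9)
```

The negative root is the intersection reached by walking backwards from (1,4), exactly as in the picture.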

Parametric equations for curves often arise in a quite natural way:

**Example:** Determine where the line y = mx intersects the curve $\frac{x^3}{y^2} - y = 1$.

Solution:

Substituting y = mx we get $\frac{x}{m^2} - mx = 1$ which can be solved for $x = \frac{m^2}{1 - m^3}$, hence $y = mx = \frac{m^3}{1 - m^3}$ and we have the point of intersection for any value of m (except m = 1). However, if we stand back and stare at what we have, we see that $x = \frac{m^2}{1 - m^3}$, $y = \frac{m^3}{1 - m^3}$ always lies on the given curve $\frac{x^3}{y^2} - y = 1$ hence provides (surprise!) parametric equations for that terrible curve. Normally, to obtain parametric equations for this curve we'd put

x = t and try to solve $\frac{t^3}{y^2} - y = 1$ for y in terms of t ... good luck!

S: So what happens when m = 1?

P: Well, let's see what happens when m is close to 1: x and y are both gargantuan. In fact, for m = .9999 say, then both x and y are large and positive so the point of intersection of y = .9999x and $\frac{x^3}{y^2} - y = 1$ lies far off in the first quadrant. When m = 1.0001, then x and y are far off in the third quadrant. It seems clear that for m = 1 the line y = x doesn't intersect $\frac{x^3}{y^2} - y = 1$ at all.

S: So what does $\frac{x^3}{y^2} - y = 1$ look like? I mean, the graph ... what's it like?

P: Pay attention.

Imagine the problem you'd have if you wanted to plot this graph directly from $\frac{x^3}{y^2} - y = 1$. You'd pick a bunch of x-values and for each you'd have to solve for y ... not an easy task. However, we now have parametric equations, so we can easily sketch $x(m) = \frac{m^2}{1 - m^3}$ and $y(m) = \frac{m^3}{1 - m^3}$ as functions of m and determine what happens to the point (x(m),y(m)) as m goes from -∞ to +∞. First we'll do $x = \frac{m^2}{1 - m^3}$, Quick&Dirty.

Note that, for m very small, $x \approx \frac{m^2}{1} = m^2$ (where we ignored the "$m^3$" compared to the "1") so we sketch that parabola. For m very large, $x \approx \frac{m^2}{-m^3} = -\frac{1}{m}$ so we sketch that hyperbola. We also note the vertical asymptote at m = 1 and the horizontal asymptote x = 0 (which, of course, is consistent with $x \approx -\frac{1}{m}$ for large m).

That's enough ... so we complete our sketch of x versus m. Now we do the same sort of thing with $y = \frac{m^3}{1 - m^3}$ noticing that, for small values of m, $y \approx \frac{m^3}{1} = m^3$ so we sketch that cubic. Then, for m very large, $y \approx \frac{m^3}{-m^3} = -1$ so we sketch the line y = -1. We also note the vertical asymptote at m = 1 and the horizontal asymptote y = -1 (which, of course, checks with y ≈ -1 for large m). Both the Quick&Dirty approximations and the final sketch are shown below.

Now we're ready to sketch the curve represented by $x = \frac{m^2}{1 - m^3}$ and $y = \frac{m^3}{1 - m^3}$. We let m begin at -∞ and increase to +∞ and note the following behavior of x and y:

• x begins at 0, increases for a while, then decreases to 0 (when m = 0), then increases to +∞ (as $m \to 1^-$)

• y begins at -1, increases continuously to 0 (when m = 0) and continues to increase to +∞ (as $m \to 1^-$)

So far, we've got the part of the curve for -∞ < m < 1. We continue ...

• x begins at -∞ (when $m = 1^+$) and increases continuously to 0 (as $m \to +\infty$)

• y begins at -∞ (when $m = 1^+$) and increases continuously to -1 (as $m \to +\infty$)

Nothing to it!
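Here is a sketch that generates points on this curve from the slope m. It assumes the parametric forms $x = \frac{m^2}{1 - m^3}$, $y = \frac{m^3}{1 - m^3}$ (the ones consistent with the Quick&Dirty approximations above) and that the curve can be written $\frac{x^3}{y^2} - y = 1$; both are reconstructions, so treat them as assumptions. The function name `point_on_curve` is invented for this illustration.

```python
def point_on_curve(m):
    """Intersection of the line y = m x with the curve, for any m other than 1."""
    d = 1 - m**3
    return m**2 / d, m**3 / d

# each point should lie on the line y = m x and satisfy x^3 / y^2 - y = 1
for m in (-3.0, -1.0, -0.5, 0.5, 0.9, 1.1, 2.0, 5.0):
    x, y = point_on_curve(m)
    print(m, (x, y), abs(y - m * x) < 1e-12, abs(x**3 / y**2 - y - 1) < 1e-9)

# just below m = 1 the point is far out in the first quadrant,
# just above m = 1 it is far out in the third quadrant
print(point_on_curve(0.9999), point_on_curve(1.0001))
```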

S: Easy for you to say.

P: You must admit it's easier than trying to plot $\frac{x^3}{y^2} - y = 1$.

S: But if you gave me $\frac{x^3}{y^2} - y = 1$, how would I know to invent parametric equations by taking lines y = mx and finding the point of intersection ... and all that jazz?

P: You wouldn't ... but then I'm only trying to demonstrate that: (1) parametric equations arise under the most unlikely circumstances and (2) you should be happy when they do and (3) it's often easier to plot curves from the parametric equations and (4) ...

S: Okay, I'm impressed, but I have a question. Even if you gave me $x = \frac{m^2}{1 - m^3}$ and $y = \frac{m^3}{1 - m^3}$ how would I know that "m" was really the slope of some line?

P: Why do you have to know that? When you see x = a cos t, y = a sin t (where "t" is the parameter), you can eliminate t and get $x^2 + y^2 = a^2$ so it's a circle (although it's nice if you recognized the circle directly from the parametric equations) and you don't really need to know what the significance of "t" is. In fact, you could think of t as being the time (as I often do ... that's why I like to call it "t"), or you could even notice that it can be interpreted as the angle θ in POLAR coordinates so you might prefer to write x = a cos θ, y = a sin θ and see that it's nobody else but the POLAR curve r = a, a circle of course. In fact, I often wonder what's the significance of the parameter when I run across somebody's parametric equation. On the other hand, sometimes it's quite useful to know "Who's t?".

S: For example?

P: Suppose I gave you the line $x = 1 + \frac{1}{\sqrt{2}}\,t$, $y = 4 + \frac{1}{\sqrt{2}}\,t$. You might note right away that it's a straight line (I hope you would), but what's the significance of the parameter "t"?

S: I haven't the foggiest ... uh, wait, it's some distance, right?

P: Right. In fact, you can see from these parametric equations that $(x-1)^2 + (y-4)^2 = \left(\frac{t}{\sqrt{2}}\right)^2 + \left(\frac{t}{\sqrt{2}}\right)^2 = t^2$ so "t" gives the distance from (1,4) to (x,y).

S: I have another question. What if I gave you x = 1 + 2t and y = 4 + 3t. Then is "t" still the distance?

P: Good question ... in fact, very good. Let's check it out: $(x-1)^2 + (y-4)^2 = (2t)^2 + (3t)^2 = 13t^2$ so the distance is actually $\sqrt{(x-1)^2 + (y-4)^2} = \sqrt{13}\,t$, so "t" is only proportional to the distance. In other words, if you really wanted to write the parametric equations in terms of the distance from (1,4), you'd let the distance be "s" and since $s = \sqrt{13}\,t$ you could change the equations to $x = 1 + 2\frac{s}{\sqrt{13}}$ and $y = 4 + 3\frac{s}{\sqrt{13}}$, replacing "t" by $\frac{s}{\sqrt{13}}$ and you'd be happy.

S: Sure, sure ... but I have another question. If you give me $x = 1 + 2\frac{s}{\sqrt{13}}$, $y = 4 + 3\frac{s}{\sqrt{13}}$ then I can see that when s = 0 (or t = 0) the point is at (1,4) so the line passes through that point. But in what direction?

P: Aah, you should be able to answer that. Try it!

S: Who, me? Uh ... I'd try to find the angle it makes with the positive x-axis, right? But I haven't the foggiest idea ...

P: Pay attention. You could find the slope of the line and that's $\frac{dy}{dx} = \frac{dy/dt}{dx/dt}$ or even $\frac{dy/ds}{dx/ds}$ since it doesn't matter what parametric equations we use. In any case, the slope is $\frac{dy}{dx} = \frac{3}{2}$ and that's tan θ so θ = arctan(3/2). Or you could remember the "standard parametric equations" $x = x_0 + s\cos\theta$, $y = y_0 + s\sin\theta$ where "s" is the distance, then you'd see from $x = 1 + 2\frac{s}{\sqrt{13}}$ that $\cos\theta = \frac{2}{\sqrt{13}}$.

S: Well, I'd still like to know the significance of a parameter if you give me parametric equations.

P: Remember that parametric equations in a calculus course are usually provided, free of charge. In real life you'd have to generate them yourself and you'd know what the significance of the parameter was.
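The claim that "t" is only proportional to the distance for x = 1 + 2t, y = 4 + 3t, while s = √13 t measures the distance itself, can be checked numerically. The following is a minimal sketch; the function names are invented for this illustration and are not part of the lecture.

```python
import math

ROOT13 = math.sqrt(13)

def point_from_t(t):
    """Point on the line x = 1 + 2t, y = 4 + 3t."""
    return 1 + 2 * t, 4 + 3 * t

def point_from_s(s):
    """Same line, re-parametrized with t replaced by s / sqrt(13)."""
    return 1 + 2 * s / ROOT13, 4 + 3 * s / ROOT13

def distance_from_start(point):
    """Distance from (1, 4) to the given point."""
    x, y = point
    return math.hypot(x - 1, y - 4)

for value in (0.5, 1.0, 3.0):
    d_t = distance_from_start(point_from_t(value))  # equals sqrt(13) * value
    d_s = distance_from_start(point_from_s(value))  # equals value
    print(f"parameter {value}: distance using t = {d_t:.6f}, using s = {d_s:.6f}")

# Direction of the line: tan(theta) = 3/2, so cos(theta) should equal 2/sqrt(13).
theta = math.atan2(3, 2)
print(math.cos(theta), 2 / ROOT13)
```

With the "s" form the parameter and the distance agree, which is exactly what makes it the "standard" form.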

The curve $x^{2/3} + y^{2/3} = 1$ is called an Astroid (or 4-cusped hypocycloid). Let's find parametric equations for this curve. First note that it's much like the circle $x^2 + y^2 = 1$ except that the power is 2/3, not 2. But we can take a clue from this since x = cos t, y = sin t are parametric equations for the circle because $x^2 + y^2 = (\cos t)^2 + (\sin t)^2 = 1$. Now we just modify x = cos t, y = sin t so the sum $x^{2/3} + y^{2/3}$ is "1" and that means we should take

$x = (\cos t)^3$, $y = (\sin t)^3$ because then $x^{2/3} + y^{2/3} = (\cos t)^2 + (\sin t)^2 = 1$. Hence $x = \cos^3 t$, $y = \sin^3 t$ gives parametric equations for an Astroid.

**Example** : For the Astroid $x = \cos^3 t$, $y = \sin^3 t$, determine:

(1) The equation of the tangent line when t = A, putting the equation into a form "symmetrical in x and y"

(2) The x- and y-intercepts of this tangent line.

**Solution:** We have the slope: $\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{3\sin^2 t\,\cos t}{-3\cos^2 t\,\sin t} = -\tan t$ so that, at t = A, the tangent line is

$\frac{y - y(A)}{x - x(A)} = -\tan A$ or $y = y(A) - \tan(A)\,(x - x(A)) = \sin^3 A - \tan(A)\,(x - \cos^3 A)$ which is a real mess, so we try for something "symmetrical in x and y". We put $\tan A = \frac{\sin A}{\cos A}$, multiply by cos A, rearrange and get:

$y\cos A + x\sin A = \cos A\,\sin^3 A + \sin A\,\cos^3 A = \sin A\cos A\,(\sin^2 A + \cos^2 A) = \sin A\cos A$ which gives a nice symmetrical equation: $y\cos A + x\sin A = \sin A\cos A$, but we can now divide by sin A cos A and get

$\frac{x}{\cos A} + \frac{y}{\sin A} = 1$ which is magnificent! In fact, a straight line in the form $\frac{x}{a} + \frac{y}{b} = 1$ displays, for all to see, the x- and y-intercepts: they're X = a and Y = b. For our tangent line, the intercepts are X = cos A and Y = sin A.

P: See anything interesting about these intercepts?

S: Nope.

P: $X^2 + Y^2 = 1$. Does that say anything to you?

S: Nope.

P: It says that the distance between the intercepts is always equal to 1.

S: So?

P: That's wonderful! Don't you see? It doesn't even matter where you take your tangent line ... what value you give to t = A ... the piece of the tangent line between intercepts is always the same length. Don't you see? The distance is completely independent of A. Pick a different point on the curve and you get the same distance!

S: That's something only a mathie can get excited about. How about a picture?

P: Here's a picture ===>>>

A typical point is shown, for some value of A.

The tangent line is drawn. It intersects the axes at cos A and sin A and these change with A, of course, but the distance between them doesn't!
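For anyone who would rather see numbers than a picture, here is a rough numerical sketch of the same fact. It estimates the slope of the Astroid's tangent line by finite differences (an assumption of this sketch, not the method used above), finds the two intercepts, and measures the distance between them. The function name is invented for illustration.

```python
import math

def tangent_intercepts(A, h=1e-6):
    """Intercepts of the tangent line to x = cos^3 t, y = sin^3 t at t = A.

    The slope dy/dx = (dy/dt)/(dx/dt) is estimated with central differences.
    A must avoid multiples of pi/2, where the curve has its cusps.
    """
    x0, y0 = math.cos(A) ** 3, math.sin(A) ** 3
    dx_dt = (math.cos(A + h) ** 3 - math.cos(A - h) ** 3) / (2 * h)
    dy_dt = (math.sin(A + h) ** 3 - math.sin(A - h) ** 3) / (2 * h)
    slope = dy_dt / dx_dt
    x_intercept = x0 - y0 / slope   # set y = 0 in y - y0 = slope (x - x0)
    y_intercept = y0 - slope * x0   # set x = 0 in the same equation
    return x_intercept, y_intercept

for A in (0.3, 0.8, 1.2):
    X, Y = tangent_intercepts(A)
    print(f"A = {A}: X = {X:.6f}, Y = {Y:.6f}, distance = {math.hypot(X, Y):.6f}")
```

The intercepts change with A (they should track cos A and sin A), but the last column should stay at 1 to within the accuracy of the finite differences.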

In fact you can try this the next time you're at the beach. Take a stick and draw an x-axis and a y-axis in the sand. Now move the stick so that one end is always on your y-axis and the other is always on your x-axis. The stick will push away the sand and what do you think you'll have ... carved right there in the sand?

S: Picture, please?

P: Okay, the picture ...

But look how nice it was to have parametric equations for the Astroid. You can try it yourself, to prove that the distance between intercepts is a constant independent of where the tangent line is drawn, but try it with $x^{2/3} + y^{2/3} = 1$. Go ahead!

S: You gotta be kiddin'.

P: Okay, then answer this question: what about the distance between intercepts for the curve $x^{2/3} + y^{2/3} = a^{2/3}$?

S: It's an Astroid too, right? Uh ... I give up. Oh, wait, when the right side was 1 the distance was 1 so that means when the right side is $a^{2/3}$ the distance must be $a^{2/3}$. Good, eh?

P: Terrible! You haven't learned a thing about dimensions! If x and y are measured in metres and $x^{2/3} + y^{2/3} = a^{2/3}$, then "a" is also in metres, so the distance can't possibly be $a^{2/3}$ which isn't in metres ...

S: I got it! The distance is "a". Am I right?

P: Yes. In fact $x^{2/3} + y^{2/3} = a^{2/3}$ can also be written $\left(\frac{x}{a}\right)^{2/3} + \left(\frac{y}{a}\right)^{2/3} = 1$ so it's just like the last problem except that the variables are $\frac{x}{a}$ and $\frac{y}{a}$ so everything is just scaled by a factor "a" and that means ...

S: Yeah, I get it.

LECTURE 13

FUNCTIONS OF TWO VARIABLES

To date we've been studying calculus-in-the-plane, meaning all our functions (y = f(x) or x = f(t), y = g(t)) can be represented by curves lying in a **two** dimensional x-y plane and points are identified by **two** numbers (x, y) or (r, θ). But it's often the case (perhaps more often than not) that functions which occur in real life (meaning outside of calculus courses) depend upon more than a single variable.

• The temperature T at a point depends upon the location of the point and this location will, in general, be given by three coordinates (x,y,z) in some 3-dimensional space (maybe x = latitude, y = longitude, z = elevation): T = f(x,y,z). Perhaps the temperature changes with time as well, so T = f(x,y,z,t) where (x,y,z) gives the location of the point and t is the time when T is measured.

• The speed v of the water molecules in a stream depends upon where in the stream the speed is measured ... and perhaps the time when the measurement is taken: v = f(x,y,z,t).

• In manufacturing, the cost per item may depend upon how many are manufactured: C = f(N), meaning that it will cost $C per item, if N items are manufactured. However, the cost may also depend upon the time of year because of scarcity of materials or labour, so we'd have C = f(N,t) where t gives the time. (For example, t = 17 may mean 17 weeks from January 1.)

• The growth of a plant, H (measured perhaps in _cm/day_), may depend upon I, the intensity of light provided (in _candlepower_), h, the amount of humus in the soil (_g/cm³_), and N, the daily amount of nitrogen supplied (_g/day_): H = f(I,h,N).

To address this problem ... a calculus for many dimensions ... we first begin with a function of just two variables which we write z = f(x,y). Since a picture is worth a thousand words we need a way to picture the behaviour of this relationship, and to make things more concrete we consider a particular real-world problem.

LEVEL CURVES

Suppose x and y were distances (in kilometres) east-west and north-south from some fixed point. That is, x = 5, y = -7 means we're 5 km east and 7 km south of the fixed point. Suppose, too, that the elevation (above sea level) at (x,y) was given by: $z = x^2 + y^2 - 2x - 4y + 100$ metres. We could get some idea of the elevation at various points (x,y) by asking "Where are all the points with elevation = 96 metres?" Clearly this requires that

$x^2 + y^2 - 2x - 4y + 100 = 96$ or $x^2 + y^2 - 2x - 4y = -4$ and this defines a curve in a two-dimensional x-y plane. It can be rewritten (completing the squares) as: $(x-1)^2 + (y-2)^2 = 1$ so the points at elevation of 96 metres are located on this circle. Indeed, were we to ask "Where are all the points with elevation = H metres?" we'd get the circle

$(x-1)^2 + (y-2)^2 = H - 95$, hence we could plot a variety of such curves for varying values of H.

The curves shown are called LEVEL CURVES (for obvious reasons) and we've all seen such "topographical maps" ... although the level curves would rarely be circles! Note that this plot of level curves gives a good deal of information concerning the function of two variables $z = f(x,y) = x^2 + y^2 - 2x - 4y + 100$ and we didn't even have to move out of our comfortable 2-dimensional x-y plane. For example, the minimum elevation of 95 metres occurs at x = 1, y = 2 and the elevation increases with the square of the distance from (1,2) ... so the point (1,2) is sitting in a hole of sorts! Note that z might also be the temperature at the point (x,y), so these level curves would give the places where the temperature was 96˚ or 97˚ or 98˚ (z is presumably measured in degrees Fahrenheit). In this case the level curves might also be called "isothermal" curves. Maybe the curves describe the locations where the air pressure is equal; then they'd be called "isobars".

In general, for a given function of two variables, z = f(x,y), the LEVEL CURVES are given by f(x,y) = C for various values of the constant C.
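For the elevation function used above, every level curve is a circle centred at (1,2) whose radius squared is H - 95. A small sketch of that bookkeeping follows; the helper names are made up for this illustration, and the check simply plugs a point of each circle back into the elevation formula.

```python
import math

def elevation(x, y):
    """Elevation in metres: z = x^2 + y^2 - 2x - 4y + 100."""
    return x**2 + y**2 - 2*x - 4*y + 100

def level_circle(H):
    """Centre and radius of the level curve elevation = H.

    Completing the squares gives (x-1)^2 + (y-2)^2 = H - 95,
    so there is no level curve when H is below 95.
    """
    r_squared = H - 95
    if r_squared < 0:
        return None
    return (1.0, 2.0), math.sqrt(r_squared)

for H in (96, 99, 104):
    (cx, cy), r = level_circle(H)
    # Pick an arbitrary point on the circle and confirm its elevation is H.
    angle = 0.7
    x, y = cx + r * math.cos(angle), cy + r * math.sin(angle)
    print(f"H = {H}: radius = {r:.4f}, elevation at a point on the circle = {elevation(x, y):.6f}")
```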

**Example:** Sketch the level curves for each of the following:

(a) $z = x^{2/3} + y^{2/3}$ (b) $z = x^2 + 2y^2$ (c) z = y -

Solution:

In the sketches, we've indicated how the level curves change as C increases. For example,

z = y - = C gives a set of hyperbolas which are shifted upward as C increases. The Astroid (as we've already seen) simply grows larger as C increases. So does the ellipse.

**Example:** For z = xy, determine the rate of change of y with respect to x on the level curve through the point x = 1, y = 2.

**Solution:** Level curves are xy = C and differentiating implicitly gives $\frac{d}{dx}(xy) = 0$ or $x\frac{dy}{dx} + y = 0$ so $\frac{dy}{dx} = -\frac{y}{x} = -2$ when x = 1, y = 2.

(Note: the level curve through this point is xy = (1)(2) = 2.)

**Example:** You are standing on the side of a mountain whose elevation is given by $z = 95 - x^2 - y^2 + 2x + 4y$ metres, where x = 0, y = 0 is your location, so z = 95 is your elevation. Sketch the level curves in your neighbourhood and determine in what direction you should climb so as to increase your elevation most rapidly.

**Solution:** The level curves $95 - x^2 - y^2 + 2x + 4y = C$ can be written $(x-1)^2 + (y-2)^2 = 100 - C$ so that the maximum elevation is C = 100 and it occurs at x = 1, y = 2, the "top" of the mountain, and (0,0) is on the side of the mountain. Before we consider where to climb, we sketch these curves (remembering that we are at the origin).

We want not only to climb in a direction of increasing C, but in the direction in which C increases most rapidly ... and a moment's thought (!) indicates that this means perpendicular to the level curve through (0,0), where we're located. Hence, we must find the slope of the level curve at (0,0) and move perpendicular to _that_ direction. To find the level curve through (0,0) ... meaning the value of C ... put x = 0, y = 0 in $(x-1)^2 + (y-2)^2 = 100 - C$ giving 5 = 100 - C so C = 95 (as we've already noted), hence the curve is:

$(x-1)^2 + (y-2)^2 = 5$ and to obtain $\frac{dy}{dx}$ we simply differentiate to get

$2(x-1) + 2(y-2)\frac{dy}{dx} = 0$, then put x = 0, y = 0 and get $-2 - 4\frac{dy}{dx} = 0$, then solve for $\frac{dy}{dx} = -\frac{1}{2}$. The negative reciprocal of this slope is the direction we want, so we head in a direction with slope = 2.

S: That's the stupidest thing I ever heard. I'd head directly toward the top of the mountain and that means directly toward (1,2) and that means in a direction with slope ... uh, the slope from (0,0) to (1,2) is $\frac{2-0}{1-0} = 2$ and as you can plainly see it's the same and I didn't need any derivatives and ...

P: Okay, hold on. I picked a simple problem just so I could illustrate the technique. But what if the level curves were given by, say, $(x-1)^2 + 2(y-2)^2 = 100 - C$? Then what?

S: The place where the elevation is a maximum ... I mean, the top of my mountain ... is at x = 1, y = 2 and that's where I'd head ... same direction, same slope, namely 2.

P: Wrong! The curve through you ... I mean through (0,0), where you're standing, has $(0-1)^2 + 2(0-2)^2 = 100 - C$ so C = 91 so it's $(x-1)^2 + 2(y-2)^2 = 9$ and differentiating implicitly gives $2(x-1) + 4(y-2)\frac{dy}{dx} = 0$ and at (0,0) we'd get $-2 - 8\frac{dy}{dx} = 0$ so you should head in a direction with slope perpendicular to $\frac{dy}{dx} = -\frac{1}{4}$, hence your best bet is to climb with slope = 4, not 2.
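Both mountains can be handled by one small calculation. The sketch below assumes level curves of the form $(x-1)^2 + k(y-2)^2 = \text{constant}$, with k = 1 for the first mountain and k = 2 for the second; the parameter k and the function names are introduced here only for illustration.

```python
def level_curve_slope(x, y, k):
    """dy/dx on the level curve (x-1)^2 + k (y-2)^2 = constant through (x, y).

    Implicit differentiation gives 2(x-1) + 2k(y-2) dy/dx = 0.
    Not defined where y = 2 (vertical tangent).
    """
    return -(x - 1) / (k * (y - 2))

def steepest_slope(x, y, k):
    """Slope of the direction perpendicular to the level curve at (x, y)."""
    return -1 / level_curve_slope(x, y, k)

print(steepest_slope(0, 0, 1))  # circular level curves: same as heading for the top
print(steepest_slope(0, 0, 2))  # elliptical level curves: steeper than the direct route
```

The direct route to (1,2) always has slope 2 from the origin; only for k = 1 does it coincide with the perpendicular direction.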

S: Show me a picture.

P: Look at the picture ===>>>

You're at (0,0) with elevation C = 91 and you want to head directly for the top of the mountain at (1,2) which means to get to the higher elevation C = 92 you travel farther than if you had headed perpendicular to the level curve through (0,0). See?

S: Yeah, but it means I'd have to ... uh, when I get to C = 92 I'd have to do it all over again, I mean find the tangent slope then the perpendicular slope and so on.

P: Right! But only if you wanted to climb the steepest slope at all times. Maybe you're a mountain climber and find it exhilarating and ...

S: Are you kiddin'? It'd take me an hour just to do the calculations. I like my way better. Besides, my way, I travel in a straight line. Your way you'd travel in some weird curve which is ... uh, sort of ...

P: Always perpendicular to the level curves ... just like this ==>>

And you won't believe this but I can determine that path exactly just by taking the perpendicular direction to each level curve and inventing a differential equation $\frac{dy}{dx}$ = something which has this slope and I'd solve the DE and I could find the best path up the ...

S: And I'd be waiting at the top when you got there.

an Orthogonal Trajectory

Example:

If we move always in a direction perpendicular to the level curves $(x-1)^2 + 2(y-2)^2 = $ _constant_, starting at x = 0, y = 0, what is the path? (Since it's orthogonal to the family of curves $(x-1)^2 + 2(y-2)^2 = $ _constant_, it's called an ORTHOGONAL TRAJECTORY.)

Solution:

At any point (x,y) on a level curve the slope of the tangent line is obtained by implicit differentiation: $\frac{d}{dx}\left[(x-1)^2 + 2(y-2)^2\right] = 0$ gives $2(x-1) + 4(y-2)\frac{dy}{dx} = 0$ so $\frac{dy}{dx} = -\frac{x-1}{2(y-2)}$. If we move perpendicular to this direction we'd want to move so our slope is the negative reciprocal, namely $\frac{2(y-2)}{x-1}$. Hence our path would satisfy $\frac{dy}{dx} = \frac{2(y-2)}{x-1}$ which (surprise!) is a separable differential equation. To solve, we separate the variables and integrate, giving: $\int\frac{dy}{y-2} = 2\int\frac{dx}{x-1}$ hence $\ln|y-2| = 2\ln|x-1| + C = \ln(x-1)^2 + C$ so $|y-2| = e^C e^{\ln|x-1|^2} = e^C (x-1)^2$ so $y - 2 = \pm e^C (x-1)^2$ or $y = 2 + K(x-1)^2$ where $K = \pm e^C$. The path is a parabola passing through (1,2) and, to pass through (0,0) as well, we'll need $0 = 2 + K(0-1)^2$ so K = -2. Hence our path will be:

$y = 2 - 2(x-1)^2$ or $y = 4x - 2x^2$ (and we check to see that it does pass through (0,0) and (1,2)).
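As a sanity check on the orthogonal trajectory, the slope of the path $y = 4x - 2x^2$ multiplied by the slope of the level curve through the same point should be -1 everywhere along the way. The following is a minimal sketch with invented function names; it avoids x = 1, where the level-curve slope formula divides by zero.

```python
def path(x):
    """The orthogonal trajectory y = 4x - 2x^2."""
    return 4 * x - 2 * x**2

def path_slope(x):
    """Derivative of the path: dy/dx = 4 - 4x."""
    return 4 - 4 * x

def level_curve_slope(x, y):
    """dy/dx on the level curve (x-1)^2 + 2(y-2)^2 = constant through (x, y)."""
    return -(x - 1) / (2 * (y - 2))

for x in (0.0, 0.25, 0.5, 0.9):
    y = path(x)
    product = path_slope(x) * level_curve_slope(x, y)
    print(f"x = {x}: y = {y:.4f}, product of slopes = {product:.6f}")
```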

S: That doesn't get you to the top any faster. In fact, my way ... which is heading straight for (1,2) ... I get there and I travel less distance. The shortest distance between (0,0) and (1,2) is a straight line! Didn't anybody ever tell you that?

P: Let's do something else.

3 Dimensional Surfaces

Although a graphical representation for z = f(x,y) can be obtained via LEVEL CURVES (which have the advantage of living in a familiar x-y plane), we can also pick x and y within the domain of the function "f", then calculate z, then plot a point (x,y,z) in a 3-dimensional rectangular space ... and we can do that for a host of values (x,y,z) and thereby generate a SURFACE. The coordinates (x,y,z) of every point on this surface would satisfy

z = f(x,y).

Some typical ...

S: Wait! You said the "domain" of "f"?

P: Of course. It's just like functions of a single variable: y = f(x). We pick x from the domain of "f" and y must have a single unique value (else "f" isn't a function) and y lies in the "range" and we call x the independent variable and y the dependent variable. For functions of two variables, say z = f(x,y), x and y are the two independent variables and z is dependent upon them and we choose (x,y) from the domain of "f" and ..

S: Okay, okay ... keep going.

There are some 3-dimensional things we've seen before:

$x = x_0 + s\cos\alpha$, $y = y_0 + s\cos\beta$, $z = z_0 + s\cos\gamma$ are parametric equations for a line through $(x_0, y_0, z_0)$ in a direction which makes angles α, β and γ with the positive x-, y- and z-axes ... and "s" gives the distance from $(x_0, y_0, z_0)$ to (x,y,z). In fact, distance in 2-D between $(x_0, y_0)$ and $(x_1, y_1)$ is given by the familiar expression $\sqrt{(x_1-x_0)^2 + (y_1-y_0)^2}$ and in 3-D the expression is a natural extension:

$\sqrt{(x_1-x_0)^2 + (y_1-y_0)^2 + (z_1-z_0)^2}$ = distance between $(x_0, y_0, z_0)$ and $(x_1, y_1, z_1)$

It may seem that equations of 3-D surfaces are something new, but we already know many such equations. In fact, if we lived in a 1-D world, like the x-axis, then x = 1 would identify a single point P. However, in 2-D, all points (x,y) which satisfy x = 1 would lie on a straight line through P, parallel to the y-axis ... because x = 1 places no restrictions on the y-values so every y-value is possible so x = 1 describes an entire line when we move from 1-D to 2-D. In fact, we get the 2-D line x = 1 by starting with the 1-D point x = 1 and sliding the point parallel to the y-axis, sweeping out the line. In fact, in 2-D, _two_ equations x = 1, y = 2 identify _two_ lines and there is a single point which satisfies this pair of equations, and we denote this point by (1,2) ... which is no surprise.

Now consider the analogous situation in going from 2-D (which is familiar territory) to 3-D.

The 2-D equation x = 1 representing a line places no restrictions on z so we slide this 2-D line (in the x-y plane) parallel to the z-axis and sweep out a 3-D plane: hence all points (x,y,z) which satisfy x = 1 lie on this plane. Similarly, y = 2 and z = 3 are planes, and the _three_ equations x = 1, y = 2, z = 3 identify _three_ planes and there is a single point which satisfies this trio of equations and we denote this point by (1,2,3). Surprise!

S: Does that mean I can just take curves I know, like $x^2 + y^2 = 1$ and $y = x^2$ and so on ... in the x-y plane, I mean ... and just slide them in the z-direction and get a surface?

P: Sure. Try it.

S: Okay, $x^2 + y^2 = 1$ is a circle and I slide it parallel to the z-axis and I get a ... uh, a cylinder.

P: Right! In fact, a right-circular cylinder. In fact, every time you take y = f(x) and slide this 2-D curve parallel to the z-axis you always get what's called a "cylinder".

S: So $y = x^2$ is a parabola in 2 dimensions, but a parabola cylinder in 3 dimensions. Nice.

P: A parabolic cylinder.

S: So I already know thousands ... well, dozens of 3-D surfaces, like $y = x^3$ and y = sin x and $y = e^x$ and ...

P: But they're all cylinders parallel to the z-axis. What about $z = x^2$? What's that?

S: Easy. I just ... uh, switch z and y and ... uh ...

P: First sketch $z = x^2$ in an x-z plane, then introduce a y-axis and slide your parabola along this y-axis. See?

S: But which way does the y-axis go? I mean, up or down? Does it matter? I like up, myself.

P: That's an interesting point. We have a choice and although it makes little difference we should pick one and agree on it and stick to it. In fact, a common convention is to pick the y-axis down as shown in the right half of the picture ===>>

S: That's really stupid, I mean ...

P: Hold on, let me show you the x-y-z coordinate system from several angles and see if you don't agree that it's a good convention.

Which way y?

First, it would be nice to consider a curve in a 2-D x-y plane then introduce a z-axis going up, as shown on the left. I think you'd agree with that. Then you can see that, with this convention, the 3 axes can be drawn in several ways (as shown below):

S: I can't really see what's what. What is the difference? Wouldn't my way be the same ... I mean ...

P: Okay, here's the convention. You rotate the x-axis into the y-axis and a right-handed screw should advance in the positive z-direction. That's how we pick the z-direction. That's the convention.

S: Picture?

P: Okay, here's a picture of the right-handed screw and also a picture of a surface, using that x-y-z coordinate system and you'll note that the surface isn't a cylinder. In fact, you can always recognize a cylinder: one variable is missing.

P: So what's $z = x^2 + y^2$?

S: A cereal bowl?

Revolving 2-D Curves to get 3-D Surfaces

As well as sliding a 2-D curve y = f(x) parallel to the z-axis and generating a 3-D "cylinder", we can also generate many interesting 3-D surfaces by revolving a 2-D curve about either axis. Let's start with the 2-D parabola $y = x^2$ and revolve it about the y-axis, sweeping out a surface. We want to find an equation satisfied by all points (x,y,z) which lie on the surface generated.

In 2-D, we consider the points F and P as shown ===>>

The relation $y = x^2$ really says that the y-value of P is the square of the distance FP (which, after all, is just x ... in this 2-D plane).

Now we revolve the parabola about the y-axis and note that F doesn't move but P sweeps out a whole _circle of points_ and the y-values are all the same on this circle and equal to the square of the radius of the circle: $y = (FP)^2$.

If Q(x,y,z) is a typical point on this _circle of points_, then FQ = FP and since the y-value doesn't change we have $y = (FQ)^2$ but FQ is the distance from Q(x,y,z) to F(0,y,0) which is $\sqrt{(x-0)^2 + (y-y)^2 + (z-0)^2} = \sqrt{x^2 + z^2}$ so we get $y = \left(\sqrt{x^2 + z^2}\right)^2$ or simply $y = x^2 + z^2$ which is the relation we want. In fact, every point (x,y,z) which satisfies $y = x^2 + z^2$ lies on this surface and because it was obtained by revolving a parabola, it's called ...

S: ... a parabolic cylinder!

P: It's not a cylinder! It's called a paraboloid and you've seen it before except it was written $z = x^2 + y^2$ because it was obtained by revolving a parabola in the x-z plane, namely, $z = x^2$, about ... you tell me, which axis?

S: Uh ... I'd say you started with $z = x^2$ and revolved about the z-axis.

P: Good. Let's find some simple rule so we can just write down the equation of the surface of revolution as soon as we're given the 2-D curve.

If we go carefully over the analysis given above for $y = x^2$ we see that there are two important things:

• If we revolve about the y-axis, then y doesn't change ... but x does, and

• the value of $x^2$ gets replaced by $x^2 + z^2$.

That means $y = x^2$ becomes $y = x^2 + z^2$.
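The rule can be tested by actually revolving points. The sketch below takes a point on the parabola $y = x^2$ (sitting in the x-y plane, so z = 0), swings it about the y-axis through an arbitrary angle, and checks that the new point satisfies $y = x^2 + z^2$. The function name and the angle variable phi are assumptions of this illustration.

```python
import math

def revolve_about_y(x0, phi):
    """Rotate the point (x0, x0^2, 0) on y = x^2 about the y-axis by angle phi.

    The y-value is unchanged; the distance |x0| from the y-axis is shared
    between the new x and z coordinates.
    """
    y0 = x0 ** 2
    return x0 * math.cos(phi), y0, x0 * math.sin(phi)

for x0 in (0.5, 1.0, -2.0):
    for phi in (0.0, 1.0, 2.5):
        x, y, z = revolve_about_y(x0, phi)
        print(f"x0 = {x0}, phi = {phi}: y = {y:.4f}, x^2 + z^2 = {x**2 + z**2:.4f}")
```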

S: Hey! Let me do one!

P: Go right ahead.

S: Okay, I revolve $y = x^3$ about the y-axis and I get a 3-D surface whose equation is ... uh, I don't have any $x^2$, so what do I do?

P: Write $y = x^3$ as $y^2 = x^6 = (x^2)^3$.

S: Okay, then the surface is $y^2 = (x^2 + z^2)^3$. And if I revolve y = x I first write it as $y^2 = x^2$ and I get $y^2 = x^2 + z^2$ and if I revolve ...

P: Hold on. What kind of surface do you think $y^2 = x^2 + z^2$ is?

S: Well, I revolved a line y = x about the y-axis so I'd get ... uh, a cone. Right?

P: Good! Let me show you a few more ... but I'll rotate about different axes, but in each case if I rotate a curve in a plane about one axis, that variable doesn't change but the other does. If I revolve about the y axis, y doesn't change ... about the z axis and z doesn't change. Got it?

[Figures: an ellipse revolved about the x-axis; an ellipse revolved about the y-axis; the line z = y revolved about the z-axis; the line y = x revolved about the y-axis]

Note that an ellipse revolved about its major (longer) axis sweeps out a surface much like a hot dog bun. When revolved about its minor (shorter) axis, the surface is more like a hamburger bun.

If we start with the hyperbola $x^2 - y^2 = a^2$ and revolve about the x- or y-axis we get quite different surfaces:

[Figures: the hyperbola $x^2 - y^2 = a^2$ revolved about the x-axis; the hyperbola $x^2 - y^2 = a^2$ revolved about the y-axis]

S: I hope you don't expect me to remember all this. Besides, you ask me to revolve a parabola or a straight line and so I know what the final surface looks like, but what if you just gave me $x^2 - 2y^2 + 3z^2 = 4$. What is it?

P: I replace the 2, 3 and 4 by 1 and get $x^2 - y^2 + z^2 = 1$, so it's a hyperboloid of 1 sheet (only one of the squared terms has a minus sign).

S: Hey! You can't do that!

P: If I ask you to sketch $y = 2x^2$ in the x-y plane, you can just sketch $y = x^2$ then, since y is twice as large (it's $2x^2$ not $x^2$), you just stretch the curve in the y-direction ... but if you're only interested in the general shape you don't even bother to stretch. See? Look at the first graph above, labelled (A). Can you tell if it's $y = x^2$ or $y = 2x^2$? No! Not until I label some points. In (B) it's clear that it's $y = x^2$ whereas in (C) it's $y = 2x^2$. For $2x^2 + 3y^2 = 4$ you can sketch $x^2 + y^2 = 1$ (replacing 2, 3 and 4 by 1) and get a circle then stretch it in each direction to get the ellipse $2x^2 + 3y^2 = 4$. In 3-D you do the same thing: given $x^2 - 2y^2 + 3z^2 = 4$ you imagine $x^2 - y^2 + z^2 = 1$ which is a hyperboloid, then you stretch in each direction and you still get a hyperboloid but now the cross-sections wouldn't be circles but rather ellipses. See?

S: But how much should I stretch it? I mean ...

P: It really isn't important if you just want to know what the surface looks like. You ask for $x^2 - 2y^2 + 3z^2 = 4$ and I sketch $x^2 - y^2 + z^2 = 1$ but if I don't put any tick marks on the axes (indicating the scale) you wouldn't know the difference. See?

S: But if you sketch $x^2 + y^2 = 1$ I'd sure know it wasn't an ellipse!

P: Then I'm careful to make circles look like ellipses. Look again at the hyperboloids above. Can you really tell if the cross-sections are circles or ellipses? No. They could be graphs of $ax^2 - by^2 + cz^2 = d$ (for the 1-sheet type) or $ax^2 - by^2 - cz^2 = d$ (for the 2-sheet type) and "a", "b", "c" and "d" can be any positive numbers. Actually, what's important is the sign of the coefficients, not their size. See?

S: I guess ... but it ain't easy.

Some surfaces, of course, are not surfaces of revolution. For example $z = x^2 - y^2$ can NOT be obtained by revolving some 2-D curve about some line (like a coordinate axis). So what does it look like?

One way to see what the surface is like is to slice it by planes z = 1, z = 2, etc. That gives us a bunch of cross-sections which are 2-dimensional and if we're lucky we can recognize them. For $z = x^2 - y^2$ all cross-sections with planes z = C give $C = x^2 - y^2$ which are hyperbolas (in an x-y plane).

If this isn't sufficient to sketch the graph of $z = x^2 - y^2$ we can also slice by planes x = C so the cross-sections have the form $z = C^2 - y^2$: parabolas opening downward (in a y-z plane). Slicing the surface by planes y = C gives cross-sections $z = x^2 - C^2$: parabolas opening upward (in an x-z plane). Because these sections are hyperbolas and parabolas the surface is called a **HYPERBOLIC PARABOLOID**.

This may seem confusing but ...

S: May seem confusing! It is confusing. Do you know how many cross-sections you've got? And how am I supposed to sketch them all in 3-D? I have a hard time in 2-D, with $x^2 - y^2 = C$!

P: Pay attention.

To sketch $z = x^2 - y^2$ we slice the surface by planes z = C and get, for each such plane, a cross-section $x^2 - y^2 = C$ which we recognize as a hyperbola. We sketch them in the x-y plane and get ==>>

and (surprise!) they're LEVEL CURVES! In fact, if $z = x^2 - y^2$ gives the elevation above sea-level at (x,y) then all points which have elevation z = C lie on the level curve $x^2 - y^2 = C$. We also note the direction of increasing C. Note that the level curve $x^2 - y^2 = C$ with C < 0 opens in the y-direction whereas with C > 0 it opens in the x-direction. Finally, $x^2 - y^2 = 0$ is the pair of lines y = ± x.

Now we tilt the x-y plane back a bit and introduce a z-axis and move these LEVEL CURVES by the amount C (in the +z direction if C > 0 and in the -z direction if C < 0).

This gives a rough sketch of the surface as shown ===>>>

Notice that when C < 0 the level curves are shifted down, below the x-y plane.

Below we show a more accurate, computer-plotted graph of $z = x^2 - y^2$.

In the above graph we've shown the various cross-sections with x = C and y = C, each of which is a parabola. The surface looks much like a saddle and indeed is often called a saddle surface. The origin lies at a point on the surface with peculiar characteristics. If the x-axis is east-west and the y-axis is north-south, then moving either east or west will increase your elevation and moving either north or south will decrease your elevation. Later when we consider how to find maxima and minima of functions of two variables, z = f(x,y), we'll return to this surface. The origin appears to give a minimum if you move in the x-direction, but a maximum if you move in the y-direction. In fact, x = y = 0 gives neither a max nor a min but what is called (surprise!) a SADDLE POINT.
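The saddle behaviour at the origin is easy to confirm by walking a short distance in each compass direction. The following sketch (the step sizes are arbitrary choices for illustration) evaluates $z = x^2 - y^2$ just east, west, north and south of (0,0).

```python
def f(x, y):
    """The saddle surface z = x^2 - y^2."""
    return x**2 - y**2

steps = (-0.2, -0.1, 0.1, 0.2)

# Moving along the x-axis (east-west): elevation rises above f(0,0) = 0.
east_west = [f(h, 0) for h in steps]

# Moving along the y-axis (north-south): elevation drops below f(0,0) = 0.
north_south = [f(0, h) for h in steps]

print("east-west:  ", east_west)
print("north-south:", north_south)
```

Every east-west value is positive and every north-south value is negative, so the origin is neither a maximum nor a minimum.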

LECTURE 14

DERIVATIVES OF FUNCTIONS OF TWO VARIABLES

To generalize the notion of the derivative of a function of a single variable, y = f(x), at the place x = a, we'll repeat the procedure, imagining that we are standing at the point on the curve where x = a, y = f(a). Now we change x by an amount Δx so the change in y is Δy = f(a+Δx) - f(a) and we have $f'(a) = \lim_{\Delta x \to 0}\frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0}\frac{f(a+\Delta x) - f(a)}{\Delta x}$.

We repeat this process in 3-space.

Suppose we are given a function of two variables z = f(x,y) which we imagine as a surface in 3-dimensional space and we are standing at the point on the surface where x = a, y = b so z = f(a,b). Now we change _only_ x by an amount Δx. The corresponding change in z is Δz = f(a+Δx,b) - f(a,b) and we consider $\lim_{\Delta x \to 0}\frac{\Delta z}{\Delta x} = \lim_{\Delta x \to 0}\frac{f(a+\Delta x, b) - f(a,b)}{\Delta x}$.

This is clearly a derivative of sorts although only x has changed. It is called the PARTIAL DERIVATIVE of f(x,y) with respect to x, at the place (x,y) = (a,b) and is denoted NOT by $\frac{df}{dx}$ as one might expect, but by $\frac{\partial f}{\partial x}$ ... the "∂" indicating immediately that there are other variables in "f". To indicate that the derivative is at x = a, y = b we may write $\left.\frac{\partial f}{\partial x}\right|_{(a,b)}$. In a similar manner we can define the PARTIAL DERIVATIVE with respect to y, at (a,b):

$\left.\frac{\partial f}{\partial y}\right|_{(a,b)} = \lim_{\Delta y \to 0}\frac{\Delta z}{\Delta y} = \lim_{\Delta y \to 0}\frac{f(a, b+\Delta y) - f(a,b)}{\Delta y}$

There are other notations for these partial derivatives:

$\frac{\partial f}{\partial x}$ is also denoted by $f_x$ and $\frac{\partial f}{\partial y}$ by $f_y$ or, sometimes $\frac{\partial f}{\partial x}$ is denoted by $f_1$ and $\frac{\partial f}{\partial y}$ by $f_2$ where the subscripts mean "the first variable" and "the second variable", etc. This is convenient if f is a function of 100 variables and you run out of letters in the alphabet so you just call the variables $x_1, x_2, \ldots, x_{100}$ and then the notation $\frac{\partial f}{\partial x_{100}}$ is awkward so you just call it $f_{100}$.

It's clear that finding PARTIAL derivatives is no more difficult than finding ordinary derivatives: we just hold one variable fixed while we find the "ordinary" derivative with respect to the other variable.
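The "hold one variable fixed" recipe can be imitated numerically. The sketch below approximates both partial derivatives with a symmetric difference quotient (a common variant of the one-sided limit in the definition, chosen here for accuracy) and compares them with the formulas obtained by ordinary differentiation. The test function $x^2 e^{-3y}$ and all function names are chosen for this illustration only.

```python
import math

def partial_x(f, a, b, h=1e-6):
    """Approximate the partial derivative with respect to x at (a, b): y stays at b."""
    return (f(a + h, b) - f(a - h, b)) / (2 * h)

def partial_y(f, a, b, h=1e-6):
    """Approximate the partial derivative with respect to y at (a, b): x stays at a."""
    return (f(a, b + h) - f(a, b - h)) / (2 * h)

def f(x, y):
    """Illustrative function of two variables."""
    return x**2 * math.exp(-3 * y)

a, b = 1.5, 0.4
# Treating y as a constant: f_x = 2x e^{-3y}
print(partial_x(f, a, b), 2 * a * math.exp(-3 * b))
# Treating x as a constant: f_y = -3x^2 e^{-3y}
print(partial_y(f, a, b), -3 * a**2 * math.exp(-3 * b))
```

Each printed pair should agree to several decimal places.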

**Example:** Determine each of the indicated partial derivatives:

(a) (b) (c)

Solution:

(a) = $2xe^{-3y}$ (b) = $\cos(xy^2)\,(2xy)$ (c) = $5x^3z^4$

S: That's sneaky. The last one isn't a function of two variables, it's ...

P: But see how simple it is? Find $\frac{\partial}{\partial z}$ of anything and you ignore all the other variables and concentrate on z alone and use the old stuff we've already learned about differentiating functions of one variable and ...

S: Yeah, I get it. Easy. But isn't there a picture?

Graphically speaking, if we hold y at the value "b" then the surface z = f(x,y) intersects the plane y = b in a curve which satisfies z = f(x,b), as shown ===>>>

We are at point P(a,b,f(a,b)).

Now we change x from a to a + x and z changes from f(a,b) to f(a+x,b) and this gives two points on the curve of intersection and we then let x0 and get the limiting value of the rate of change and that's what we're calling .

S: And it's the slope of the tangent line, right?

P: Sort of. But there are jillions of tangent lines to the surface at the point P and we're finding the slope of just one of them and we have to be careful about the use of the word "slope". Since y is fixed and only x changes we can look at just the variation of z and x by standing way back along the y-axis and looking at this curve of intersection, and the z-axis is going up and the x-axis is going right and we get a picture like this ===>>>

Then we're back to a function of a single variable again and we know how to compute the derivative ... we just have to find the equation of the curve we see from this vantage point, and a moment's thought tells us that it's z = f(x,b).

S: So $\frac{\partial f}{\partial x}$ is a slope, just like I said!

P: Yes, it's a slope, but pay attention: If I drew a line in the x-y plane and asked for its slope, you'd have no problem with that. You could take two points $(x_1,y_1)$ and $(x_2,y_2)$ on the line and compute $\frac{y_2 - y_1}{x_2 - x_1}$ to get the slope ... or you could measure the angle the line makes with the positive x-direction and compute the tangent of that angle. No problem. Now I draw a line in 3-D space and ask for the "slope". What do you do?

S: Easy! I take two points on the line and find the slope, just like you did in 2-D. Good, eh?

P: Here are two points ... $(x_1,y_1,z_1)$ and $(x_2,y_2,z_2)$ ... compute the "slope" of the line joining them.

S: Uh ... well, let's see ... I give up.

P: The point is, it's easy if you let only x change or only y change so you can stand back and view the line in 2-D with the z-axis going up and either the x- or y-axis going to the right, then you're familiar with this situation so you know what is meant by the "slope". But what happens if all three variables change? In fact, that's a problem not just for a line but for a surface. If we're at point P on the surface and only x changes then the rate of change of z = f(x,y) with respect to x is $\frac{\partial f}{\partial x}$ ... but what if both x and y change? How rapidly does z change?

S: Will you be covering that?

P: Yes.

S: I was afraid you'd say that.

Now we'll change only y: at P (where x = a, y = b and z = f(a,b)), if x = a is fixed and _only y changes_ then we move along a curve in our surface: the intersection of the surface z = f(x,y) with the plane x = a. Again we can view the relation between z and y from a vantage point a long way along the x-axis, looking back to see the z-axis going up and the y-axis going right and the curve of intersection and the point P and everything is back to 2-D and we have no problem with "what tangent line?" or "what slope?" The situation is shown below.

**Example:** The pressure of a gas P depends upon its volume V and temperature T according to PV = kT where k is a constant. If P = 1, V = 2 and T = 3, how rapidly is the pressure changing when V alone changes?

Note: if P is measured in _pascals_ and V in _metre_³, then $\frac{\partial P}{\partial V}$ is measured in _pascals/metre_³ (Pa/m³).

**Solution:** From $P = \frac{kT}{V}$ we get $\frac{\partial P}{\partial V} = -\frac{kT}{V^2} = -\frac{3k}{4}$ Pa/m³ when V = 2 and T = 3. How to find k? We know that (P,V,T) = (1,2,3) satisfies PV = kT and that gives $k = \frac{PV}{T} = \frac{2}{3}$ so $\frac{\partial P}{\partial V} = -\frac{1}{2}$ Pa/m³.

S: That's not the way I'd do it. Since T isn't changing and neither is k then I'd just write PV = a constant and differentiate right away since I've got only two variables and I've done this kind of thing before so I'd use the product rule and I'd get $\frac{d}{dV}(PV) = 0$ so $P\frac{dV}{dV} + V\frac{dP}{dV} = 0$ or $P + V\frac{dP}{dV} = 0$ and I'd plug in P = 1 and V = 2 and I'd get $\frac{dP}{dV} = -\frac{P}{V} = -\frac{1}{2}$. Good, eh?

P: Not bad ... in fact, very clever. And do you know that you never even used the fact that the temperature was T = 3 and you didn't have to find k either. Does this problem sound familiar? Does it say something to you?

S: It says my solution is better than yours.

P: If PV = kT then, for a constant temperature, the relation between the pressure and volume is PV = C where the constant C = kT depends upon the temperature. See? You're on a LEVEL CURVE (as long as the temperature doesn't change ... so we can call it an isothermal). In fact, for varying temperatures, you get a whole family of such curves ... but you know that, because we've done this problem before when we talked about LEVEL CURVES, except we called the variables x,y.
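The gas example can be checked numerically. The following is only a sketch: the function names `pressure` and `dP_dV` and the step size `h` are illustrative assumptions, and the derivative is estimated with a central difference rather than computed exactly.

```python
def pressure(V, T, k):
    """Pressure from PV = kT, solved for P."""
    return k * T / V


def dP_dV(V, T, k, h=1e-6):
    """Central-difference estimate of the partial derivative of P
    with respect to V, holding T (and k) fixed."""
    return (pressure(V + h, T, k) - pressure(V - h, T, k)) / (2 * h)


# k is fixed by the known state (P, V, T) = (1, 2, 3): k = PV/T
k = 1 * 2 / 3

rate = dP_dV(2, 3, k)
print(rate)  # close to -0.5, which is also -P/V
```

Both routes in the text (finding k first, or differentiating PV = constant) lead to the same number, and the estimate agrees with them.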

**Example:** A rectangular sheet has _length_ = 4 and _width_ = 3. Does the area change more rapidly with _length_ or with _width_?

**Solution:** Write A = W L where W is the _width_ and L the _length_. Then, changing only the _width_ W gives a rate of change $\frac{\partial A}{\partial W} = L$ while changing only the _length_ L gives $\frac{\partial A}{\partial L} = W$. Since W < L, then $\frac{\partial A}{\partial L} < \frac{\partial A}{\partial W}$ and the area changes more rapidly with W.

This is pretty obvious when you think of the changes in area of the rectangular sheet. A small increase in W increases the area (we can call it $\Delta A_W$) more than does a small increase in L (which we call $\Delta A_L$).

**Example:** A square box has sides of length x = 3 _cm_ , and height y = 7 _cm_. If one of x or y is increased by a small amount, which will give the larger rate of increase in volume?

**Solution:** $V = x^2y$ and $\frac{\partial V}{\partial x} = 2xy = 2(3)(7) = 42$ _cm³/cm_ whereas $\frac{\partial V}{\partial y} = x^2 = 9$ _cm³/cm_ so increasing x provides the greater rate of change of V.
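As a sketch of "hold one variable fixed and differentiate with respect to the other", here is a small numerical check of the box example. The helper `partial` is a hypothetical name introduced for illustration; it nudges one coordinate only and uses a central difference.

```python
def volume(x, y):
    """Volume of a box with square base of side x and height y."""
    return x**2 * y


def partial(f, point, index, h=1e-6):
    """Central-difference estimate of the partial derivative of f
    with respect to the variable at position `index`; all other
    variables are held fixed."""
    plus = list(point)
    minus = list(point)
    plus[index] += h
    minus[index] -= h
    return (f(*plus) - f(*minus)) / (2 * h)


dV_dx = partial(volume, (3, 7), 0)  # analytic value 2xy = 42
dV_dy = partial(volume, (3, 7), 1)  # analytic value x**2 = 9
print(dV_dx, dV_dy)
```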

**Example:** The cost per hat of making x hats in a month is $C = 7 - \frac{x}{500}$ _dollars/hat_ (where the cost/hat is less if more are manufactured). The selling price per hat is $S = 9 + \cos\frac{\pi t}{6}$ _dollars/hat_ which varies with the month t, t = 0 being December and S = \$10.00 _/hat_ , an inflated Christmas price when the demand is high, and t = 6 being June where S = \$8.00 _/hat_ because sales of hats are poor in the summer ... and we assume that, at these prices, all x hats are sold each month. Investigate the total profit as a function of x, the number of items made in a month, and t, the month when they are sold. (Note that S is cyclic, or periodic, with a period of 12 months.)

**Solution:** The profit per hat is $S - C = 9 + \cos\frac{\pi t}{6} - \left(7 - \frac{x}{500}\right) = 2 + \cos\frac{\pi t}{6} + \frac{x}{500}$ _dollars/hat_ which had better be positive, so even when S is a minimum (in June, when t = 6) we need $8 - \left(7 - \frac{x}{500}\right) > 0$ which it is, fortunately, and that makes us happy. We really should check everything to see if this "mathematical model" is at all reasonable. So far it's okay ... but if we make 3500 hats in a month the cost per hat drops to zero ... which means our "model" isn't meant for large hat production! The total profit (for the tth month) is:

P(x,t) = ( _number of hats_ ) ( _profit/hat_ ) $= x\left(2 + \cos\frac{\pi t}{6} + \frac{x}{500}\right) = 2x + x\cos\frac{\pi t}{6} + \frac{x^2}{500}$ _dollars_.

Now consider what effect changing either x or t makes on our profit.

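$\frac{\partial P}{\partial x} = 2 + \cos\frac{\pi t}{6} + \frac{x}{250}$ _dollars/hat_ and $\frac{\partial P}{\partial t} = -\frac{\pi}{6}\,x\sin\frac{\pi t}{6}$ _dollars/month_. In December, t = 0 and we get

$\frac{\partial P}{\partial x} = 3 + \frac{x}{250}$ which says the total profit increases at the rate of 3.00 _dollars/hat_ plus an additional $\frac{1}{250}$ of a dollar for each hat produced that month. Also $\frac{\partial P}{\partial t} = 0$ _dollars/month_ so the profit isn't changing with time (in December, at least). In March, t = 3 and $\frac{\partial P}{\partial x} = 2 + \frac{x}{250}$ _dollars/hat_ while $\frac{\partial P}{\partial t} = -\frac{\pi}{6}\,x$ _dollars/month_ and is negative so the profit, P(x,t), is decreasing with time and this decrease is greater if more hats are made!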

S: Aw, c'mon. You make more hats and the profits decrease? I mean ...

P: No, be careful. What's happening is that the rate of change of profit with time, in the month of March, is negative. Not the profit P but $\frac{\partial P}{\partial t}$, the change in profit per month. Now, it's this rate of change which depends upon how many hats are made that month: the more hats the greater the decrease per month. See? When you have functions of two variables the rate of change with respect to one may very well depend upon the other. All this makes sense, of course, because from December to June the total profits, $P(x,t) = 2x + x\cos\frac{\pi t}{6} + \frac{x^2}{500}$, go from $P(x,0) = 3x + \frac{x^2}{500}$ to $P(x,6) = x + \frac{x^2}{500}$ dollars so there is a big drop in profits, namely 2x dollars, so you'd expect the rate at which P(x,t) decreases as t goes from t = 0 to t = 6 to be larger when x is larger ... because this drop is bigger ... and that's what the partial derivatives are telling us. See?

S: Not really.

P: Then let me give you a picture. I'll assume the number of hats/month, x, is constant and see what happens to the profit as t goes from t = 0 to t = 6, then I'll change the number of hats produced per month and give you another graph of P versus t. That'll clear things up.

If we make x = 100 _hats/month_ the profit is $P(100,t) = 220 + 100\cos\frac{\pi t}{6}$ and if we make x = 50 _hats/month_ the total profit is $P(50,t) = 105 + 50\cos\frac{\pi t}{6}$ _dollars_ (for month number "t") and so on. The graphs look like so:

You'll notice that there is a drastic decrease in profits from December to June when you make lots of hats each month, and that's illustrated by the value of $\frac{\partial P}{\partial t}$ which becomes more negative as x increases.

S: I knew a picture would help. Like I always say, a picture is worth a thousand ...

P: Right! Now let's go on.

S: Wait! How about a picture of P(x,t) in 3-D?

P: Okay, here's a rough sketch ===>>>

See how, for each x-value, the curves are cosine functions of t which decrease more rapidly as the x-value increases?

S: Now suppose we're sitting at a particular point on this surface. That means a particular t-month and a particular x-quantity of hats/month. Where do we go so P(x,t) increases most rapidly? I mean ...

P: I know exactly what you mean. You want to maximize your profits so you'd like to know how to modify x and t to accomplish this. Like being on the side of a mountain and asking "Which direction to increase my elevation most rapidly?" Good luck. The variable "t" isn't something you can change, else you'd arrange for it to be Christmas all the time.

S: But isn't "t" called an independent variable and doesn't that mean I can change it at will?

P: No. It means it's independent and does what it pleases. Time marches on ... which reminds me, we should too.

S: I notice that you keep trying to put everything into a 2-dimensional plane ... the x-y plane or maybe the x-z plane or something ... and you rely on what we've done before to find derivatives because you've got a function of just one variable because you're keeping one of your variables constant so ...

P: Yes, yes ... of course. We'd like to reduce the problem to one we've already solved. Did I ever tell you the story of the mathematician and the engineer?

S: Yes, yes ... of course. Anyway, I presume that we can take the derivative of the derivative and get the second derivative, right?

P: Right. Now pay attention:

HIGHER PARTIAL DERIVATIVES

If we hold y fixed and concentrate only on the change in x then $\frac{\partial f}{\partial x}$ is the rate of change of f with respect to x and we can evaluate it at any point P(a,b). If we leave $\frac{\partial f}{\partial x}$ as a function of x and y (without substituting x = a, y = b) then we can differentiate again: $\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right)$ which is also denoted by $\frac{\partial^2 f}{\partial x^2}$. This is the second partial with respect to x, and we can also evaluate it at P(a,b). In a similar manner we can consider $\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2}$, the second partial with respect to y. Each has its geometric significance.

Consider the surface described by z = f(x,y) and the plane y = b. This plane intersects the surface in a curve which, when viewed from a long way along the negative y-axis (so we see the z-axis going up and the x-axis going right), is described by z = f(x,b). Since the slope of this curve is $\frac{\partial f}{\partial x}$, hence is $\frac{\partial f}{\partial x}(a,b)$ when x = a, at the point P(a,b), we can see that $\frac{\partial^2 f}{\partial x^2}$ gives the rate of change of this slope as x changes along the curve, and the sign of $\frac{\partial^2 f}{\partial x^2}$ will indicate whether the curve is concave up or down. If we substitute x = a we'll get $\frac{\partial^2 f}{\partial x^2}(a,b)$, hence an indication of the "concavity" at P(a,b).

Of course, this is not the concavity of the surface z = f(x,y), but only of this particular curve on that surface. Indeed, it's an interesting question to ask: "What does one mean by the concavity at a point on a surface, z = f(x,y)?"

Of course, we can repeat this for another curve: the intersection of the plane x = a with z = f(x,y). Then when viewed from a mile or two up the positive x-axis, looking back at the z-axis going up and the y-axis going right, we see the curve z = f(a,y) and $\frac{\partial f}{\partial y}$ and $\frac{\partial^2 f}{\partial y^2}$ give the slope and its rate of change at any y-value along this curve.

Sometimes this geometrical interpretation of the second partials is useful ... provided you only change one of the variables.
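To see the second partials as "concavity along one slice", here is a rough numerical sketch. It reuses the box-volume function $x^2y$ from the earlier example purely as an illustration; the function names and the step size `h` are assumptions, and the second derivatives are estimated with the standard three-point difference.

```python
def f(x, y):
    """Illustrative function: x**2 * y (the box volume used earlier)."""
    return x**2 * y


def second_partial_x(f, a, b, h=1e-3):
    """Three-point estimate of the second partial with respect to x,
    i.e. the concavity of the slice z = f(x, b) at x = a."""
    return (f(a + h, b) - 2 * f(a, b) + f(a - h, b)) / h**2


def second_partial_y(f, a, b, h=1e-3):
    """Three-point estimate of the second partial with respect to y,
    i.e. the concavity of the slice z = f(a, y) at y = b."""
    return (f(a, b + h) - 2 * f(a, b) + f(a, b - h)) / h**2


fxx = second_partial_x(f, 3, 7)  # analytic value 2y = 14: slice is concave up
fyy = second_partial_y(f, 3, 7)  # analytic value 0: slice is a straight line
print(fxx, fyy)
```

The slice with y fixed is a parabola in x (concave up), while the slice with x fixed is a straight line in y, which is exactly what the two numbers report.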

S: But I'd like to know what happens if both x and y change, together. What's the "slope" and what's the rate of change of z = f(x,y)?

P: I'm glad you asked that question ...

LECTURE 15

DIRECTIONAL DERIVATIVES

Suppose we have z = f(x,y) and we're at the point x = a, y = b so z = f(a,b). Now we change x by $\Delta x$ and y by $\Delta y$, then z becomes $f(a+\Delta x,b+\Delta y)$ hence changes by $\Delta z = f(a+\Delta x,b+\Delta y) - f(a,b)$. What do we mean by the "rate of change of z"? It's not the rate of change with respect to x or y alone, but with respect to ... both, somehow.

In fact, just as when we climbed a mountain whose elevation was given by z = f(x,y), we'd want to know how quickly z changes with change in our x-y position. We'd take the ratio $\frac{\text{change in } z}{\text{change in x-y position}}$ then let the change in x-y position go to zero. What do we mean by "change in x-y position"? We mean the distance between (a,b) and $(a+\Delta x,b+\Delta y)$ and that's $\sqrt{(\Delta x)^2 + (\Delta y)^2}$ so we'd consider the ratio $\frac{\Delta z}{\sqrt{(\Delta x)^2 + (\Delta y)^2}}$ and we'd let $\sqrt{(\Delta x)^2 + (\Delta y)^2} \to 0$.

Although that's what we want, we should find an easier way to compute this limit! In particular, since $\sqrt{(\Delta x)^2 + (\Delta y)^2}$ is a terrible expression to deal with and since it's just the distance in the x-y plane we should give it a name, and because it's small (because $\Delta x$ and $\Delta y$ are small), the name should reflect this, so we'll call it $\Delta s$.

So far we've moved a distance $\Delta s = \sqrt{(\Delta x)^2 + (\Delta y)^2}$ and the change in z is $\Delta z$ and we want $\lim_{\Delta s \to 0} \frac{\Delta z}{\Delta s}$.

Although x and y are both changing, we can still work in the x-y plane by using LEVEL CURVES again.

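In the diagram are some level curves for z = f(x,y), namely f(x,y) = C for various values of C. We're at (a,b) and if we move in the x-direction then the rate of change of z is the limit as $\Delta x \to 0$ of $\frac{\Delta z}{\Delta x} = \frac{f(a+\Delta x,b) - f(a,b)}{\Delta x}$ and that's $\frac{\partial f}{\partial x}$. If we move in the y-direction (by an amount $\Delta y$), the rate of change is $\lim_{\Delta y \to 0} \frac{f(a,b+\Delta y) - f(a,b)}{\Delta y} = \frac{\partial f}{\partial y}$.

Now we move in an arbitrary direction given by the angles $(\theta,\phi)$ where $\theta$ is the angle our direction makes with the positive x-axis and $\phi$ is the angle with the positive y-axis. (Remember that? ... when we discussed parametric equations of lines?)

In the diagram, we've moved from (a,b) a distance $\Delta s = \sqrt{(\Delta x)^2 + (\Delta y)^2}$ and we want to investigate the limiting value of $\frac{\Delta z}{\Delta s} = \frac{f(a+\Delta x,b+\Delta y) - f(a,b)}{\Delta s}$ as $\Delta s \to 0$. To this end we consider the changes in x and y in turn, writing

$$\frac{\Delta z}{\Delta s} = \frac{f(a+\Delta x,b+\Delta y) - f(a+\Delta x,b)}{\Delta s} + \frac{f(a+\Delta x,b) - f(a,b)}{\Delta s}$$

where the first fraction has only y changing and the second only x changing.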

S: Hold on! How did you know to add and subtract like that! I mean ...

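P: I'm familiar with functions of a single variable, so I let the variables change one-at-a-time. I'm at x = a, y = b and I want to get to $x = a+\Delta x$, $y = b+\Delta y$. First I change x, going from (a,b) to $(a+\Delta x,b)$ and find the change in z: $f(a+\Delta x,b) - f(a,b)$. Now I change y, going from $(a+\Delta x,b)$ to $(a+\Delta x,b+\Delta y)$ and there's another change in z: $f(a+\Delta x,b+\Delta y) - f(a+\Delta x,b)$. The total change is $[f(a+\Delta x,b) - f(a,b)] + [f(a+\Delta x,b+\Delta y) - f(a+\Delta x,b)]$ which, of course, is just $f(a+\Delta x,b+\Delta y) - f(a,b)$.

Let's look at just the second fraction. If only we had $\frac{f(a+\Delta x,b) - f(a,b)}{\Delta x}$ we could take the limit and recognize it as $\frac{\partial f}{\partial x}$ ... but we can write the second fraction as $\frac{f(a+\Delta x,b) - f(a,b)}{\Delta x}\,\frac{\Delta x}{\Delta s}$ and note that $\frac{\Delta x}{\Delta s}$ is just $\cos\theta$, so now we can let $\Delta s \to 0$ and get $\lim_{\Delta s \to 0} \frac{f(a+\Delta x,b) - f(a,b)}{\Delta s} = \frac{\partial f}{\partial x}\cos\theta$.

We repeat this procedure for the first fraction, writing:

$\frac{f(a+\Delta x,b+\Delta y) - f(a+\Delta x,b)}{\Delta s} = \frac{f(a+\Delta x,b+\Delta y) - f(a+\Delta x,b)}{\Delta y}\,\frac{\Delta y}{\Delta s} = \frac{f(a+\Delta x,b+\Delta y) - f(a+\Delta x,b)}{\Delta y}\,\sin\theta$ and now let $\Delta y \to 0$ so that $\frac{f(a+\Delta x,b+\Delta y) - f(a+\Delta x,b)}{\Delta y} \to \frac{\partial f}{\partial y}(a+\Delta x,b)$ ... that's the definition of the partial derivative at the point $x = a+\Delta x$, y = b. Now let $\Delta x \to 0$ as well, so $\frac{\partial f}{\partial y}(a+\Delta x,b) \to \frac{\partial f}{\partial y}(a,b)$ (assuming this partial derivative is continuous) and having let both $\Delta x \to 0$ and $\Delta y \to 0$ we also have $\Delta s \to 0$ so we're finished.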

S: Huh? Finished? Finished with what?

P: Don't you see? We've now got the rate of change of z in any direction. You don't have to move in the x-direction or the y-direction. You can pick your direction, like north-by-north-east, and that determines the angle  and hence you can compute the rate of change in this direction.

S: And what is it?

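P: We need a nice notation for it which looks similar to $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ except it has to have distance in the $\theta$-direction, not just the x- or y-direction. Got any good ideas? Remember that it has to have the dimensions $\frac{\text{dimensions of } f}{\text{distance}}$.

S: Yeah, call it $\frac{df}{ds}$.

the DIRECTIONAL DERIVATIVE

$$\frac{df}{ds} = \frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta$$

where "s" is distance measured in a direction which makes an angle $\theta$ with the positive x-axis.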

Some observations about the directional derivative:

• If $\theta = 0$, we have the positive x-direction and the above formula gives $\frac{df}{ds} = \frac{\partial f}{\partial x}$ as we'd expect.

• If $\theta = \frac{\pi}{2}$, we have the positive y-direction and we find that $\frac{df}{ds} = \frac{\partial f}{\partial y}$.

• For other directions, the rate of change is a combination of each of $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$, the latter being weighted more heavily if $\theta$ is closer to $\frac{\pi}{2}$, and so on.

• If we stand fixed at a point x = a, y = b, then $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are fixed numbers and only $\cos\theta$ and $\sin\theta$ change, and $\frac{df}{ds}$ takes the form $A\cos\theta + B\sin\theta$ (where A and B are constants) and presumably we could determine the value of $\theta$ which makes $\frac{df}{ds}$ a maximum or minimum ... and we'll do just that in the next lecture. (It's like standing on the side of a mountain and determining the direction in which the elevation increases most rapidly.)

• Since $\cos\theta$ and $\sin\theta$ are dimensionless (think of them as the ratio of sides of a triangle), the dimensions of $\frac{df}{ds}$, $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are the same: if f is in _hectares_ and s, x and y are in _years_ , then each has the dimensions of _hectares/year_.
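The formula for the directional derivative can be compared with the raw ratio Δz/Δs it came from. The sketch below is an illustration only: the helper names, the step sizes, and the test angle of 0.7 radians are assumptions, and the function is the one from part (a) of the example that follows.

```python
import math


def partial_x(f, a, b, h=1e-6):
    """Central-difference estimate of the partial derivative in x."""
    return (f(a + h, b) - f(a - h, b)) / (2 * h)


def partial_y(f, a, b, h=1e-6):
    """Central-difference estimate of the partial derivative in y."""
    return (f(a, b + h) - f(a, b - h)) / (2 * h)


def directional_derivative(f, a, b, theta):
    """Rate of change of f at (a, b) in the direction making angle
    theta with the positive x-axis, using the formula
    (df/dx) cos(theta) + (df/dy) sin(theta)."""
    return (partial_x(f, a, b) * math.cos(theta)
            + partial_y(f, a, b) * math.sin(theta))


def difference_quotient(f, a, b, theta, ds=1e-6):
    """The ratio (change in z) / (distance moved) for a small step ds."""
    dx = ds * math.cos(theta)
    dy = ds * math.sin(theta)
    return (f(a + dx, b + dy) - f(a, b)) / ds


def f(x, y):
    return x**2 + x * math.sin(y)


theta = 0.7  # an arbitrary direction, in radians
by_formula = directional_derivative(f, 1, math.pi, theta)
by_ratio = difference_quotient(f, 1, math.pi, theta)
print(by_formula, by_ratio)  # the two values should nearly agree
```

Setting `theta` to 0 or to `math.pi / 2` recovers the two ordinary partial derivatives, matching the first two observations above.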

**Example:** Determine the rate of change of the given function at the indicated point, in the given direction.

(a) $f(x,y) = x^2 + x\sin y$ at (1,π) in the direction which makes an angle of $\frac{\pi}{3}$ with the positive x-axis.

(b) $f(x,y) = e^{2x} + xy$ at (0,1) in the direction which makes an angle of - with the positive x-axis.

(c) $f(x,y) = x^3y$ at (0,0) in the negative y-direction.

(d) $f(x,y) = x^3y$ at (0,0) in the negative x-direction.

Solution:

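(a) $\frac{df}{ds} = \frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta$ where $\theta = \frac{\pi}{3}$, and $\frac{\partial f}{\partial x} = 2x + \sin y = 2$ and $\frac{\partial f}{\partial y} = x\cos y = -1$ at (1,π).

Hence $\frac{df}{ds} = (2)\cos\frac{\pi}{3} + (-1)\sin\frac{\pi}{3} = 2\left(\frac{1}{2}\right) - \frac{\sqrt{3}}{2} = 1 - \frac{\sqrt{3}}{2}$.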

(b) $\frac{\partial f}{\partial x} = 2e^{2x} + y = 3$ and $\frac{\partial f}{\partial y} = x = 0$ at (0,1), hence $\frac{df}{ds}$ = (3) cos (- ) + 0 = .

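(c) $\frac{df}{ds} = \frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta$ where $\theta = \frac{3\pi}{2}$ or $-\frac{\pi}{2}$.

At (0,0) we have $\frac{\partial f}{\partial x} = 3x^2y = 0$ and $\frac{\partial f}{\partial y} = x^3 = 0$ so $\frac{df}{ds} = 0$.

(d) $\frac{df}{ds} = \frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta$ where $\theta = \pi$ and at (0,0) we have $\frac{\partial f}{\partial x} = 3x^2y = 0$ and $\frac{\partial f}{\partial y} = x^3 = 0$ so $\frac{df}{ds} = 0$.

It seems clear that $\frac{df}{ds} = 0$ at (0,0) regardless of the direction we move, since $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are both zero at (0,0). The surface $z = x^3y$ must be very flat at the origin.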

S: A picture, please.

P: Okay, but let's predict what $z = x^3y$ will look like. First off we note that it rises from the origin when (x,y) goes into the first quadrant where x>0 and y>0 ...

S: ... and into the third quadrant too ... and it falls into the other quadrants ...

P: What about the level curves. Do they help?

S: Uh, sure, why not. Do some.

P: You do some.

S: Well ... I'd write $z = x^3y$ and let z = C so I'd get LEVEL curves $x^3y = C$ and I'd write $y = \frac{C}{x^3}$ and I'd do a Quick&Dirty sketch ... uh, for small x it looks like ... uh ...

P: It's too simple for Q&D: $y = \frac{C}{x^3}$ has a vertical asymptote at x = 0 with y → ∞ as x → 0 from the right (assuming C>0) and a horizontal asymptote at y = 0 since $\lim_{x \to \infty} y = 0$ and it's an ODD function and that's enough to sketch a few curves. Of course, before sketching $z = x^3y$ we might also note that if we slice this surface with a plane y = A we'd get a cubic cross-section: $z = Ax^3$ and if we slice it with a plane x = B we'd get a linear cross-section: $z = B^3y$ and what would we get if we slice it with a plane z = C?

S: Huh? I haven't the foggiest idea.

P: We'd get the level curves, of course: $x^3y = C$ (where C may be positive or negative). If we just rely on the level curves we might sketch $z = x^3y$ like so ===>>>

S: What about the big picture ... the big computer picture?

P: Okay, here's a computer-plotted picture ... below:

Note, in the diagram, the intersection with the plane y = A (namely $z = Ax^3$), has A < 0 so the curve of intersection ... when viewed with the z-axis going up and the x-axis going right ... looks like $z = -x^3$ (not like $z = x^3$). Note, too, that the entire x-axis lies on this surface since points on the x-axis have coordinates (x,0,0) which clearly satisfies $z = x^3y$ for any x-value. Also, the entire y-axis, containing points (0,y,0), satisfies the equation $z = x^3y$ as well.

The "flatness" of the surface, near the origin, is evident, so it's not surprising that the rate of change of z is zero in ANY direction.
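The flatness at the origin can also be seen by brute force. As a sketch (the step size `ds` and the choice of eight compass directions are arbitrary assumptions), the ratio Δz/Δs is computed for a small step from (0,0) in several directions:

```python
import math


def f(x, y):
    return x**3 * y


def rate_at_origin(theta, ds=1e-3):
    """Ratio (change in z) / (distance moved) for a step of length ds
    from the origin in the direction theta."""
    dx = ds * math.cos(theta)
    dy = ds * math.sin(theta)
    return (f(dx, dy) - f(0.0, 0.0)) / ds


# Eight directions, 45 degrees apart, all the way round.
rates = [rate_at_origin(math.radians(d)) for d in range(0, 360, 45)]
for d, r in zip(range(0, 360, 45), rates):
    print(d, r)  # every ratio is tiny compared with the step size
```

Each ratio is of the order of `ds` cubed, so it shrinks to zero as the step shrinks, whatever the direction.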

**Example:** The density of ants at a location (x,y) is given by D(x,y) = K _ants per metres_ 2 where x and y measure distance (in _metres_ ) from the queen ant who is located at (0,0): x is east-west and y is north-south distance. What is the rate of change of ant density at (0,0), in a north-west direction?

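**Solution:** $\frac{dD}{ds} = \frac{\partial D}{\partial x}\cos\theta + \frac{\partial D}{\partial y}\sin\theta$ where $\theta = \frac{3\pi}{4}$ (north-west, taking east as the positive x-direction and north as the positive y-direction), and $\frac{\partial D}{\partial x} = -K$ and $\frac{\partial D}{\partial y} = 0$ at (0,0). Hence $\frac{dD}{ds} = (-K)\cos\frac{3\pi}{4} + 0 = \frac{K}{\sqrt{2}}$ _ants/metre_³.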

S: Ants per cubic metre? You're kidding?

P: Not at all. If D is measured in ants per metre² and s is in metres, then $\frac{dD}{ds}$ is in $\frac{\text{ants/m}^2}{\text{m}}$ = ants/m³. Of course, we might also say ants/m² per m. See?

S: Sounds like you got so many ants per ... uh, volume ... or something. Anyway, you're talking about ants because you think I'll be impressed with how useful this is ... but I'm not. How on earth would you get D(x,y) = K ? If I'm a biologist can I expect somebody to hand me this function and say "find the rate of ...

P: Pay attention.

**Example:** You are standing at the origin (0,0) and the temperature is given by $T(x,y) = 100e^{-x}\cos y$ ˚C at location (x,y), each of x and y being measured in _kilometres_. In what direction should you move so it gets cooler most quickly?

**Solution:** We interpret "cooler most quickly" to mean that the rate of change of temperature should be as negative as possible, in the optimal direction. If "s" measures distance in the direction $\theta$ then we'd want $\frac{dT}{ds}$ to be as negative as possible. We have that $\frac{dT}{ds} = \frac{\partial T}{\partial x}\cos\theta + \frac{\partial T}{\partial y}\sin\theta = (-100e^{-x}\cos y)\cos\theta + (-100e^{-x}\sin y)\sin\theta$ which, at our location (0,0), is $\frac{dT}{ds} = -100\cos\theta + 0 = -100\cos\theta$. We want $\theta = 0$ so $\frac{dT}{ds} = -100$ _˚C/kilometre_.

S: Sure, sure. Move one km due east and the temperature drops from boiling to freezing ... and this is useful?

P: Pay attention. This is only the rate of change at (0,0), in the x-direction. It changes, you know. In fact, T never reaches 0 on the x-axis because, putting y = 0, $T(x,0) = 100e^{-x}$ ˚C. However, it does get to 0˚C when $y = \frac{\pi}{2}$. Besides, $T(x,y) = 100e^{-x}\cos y$ is only an invention ... so we can practice taking derivatives and interpret rates of change in various directions. Perhaps we can invent a more reasonable temperature variation, and study it more carefully.
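The "which direction is best?" question for this temperature can be answered by simply trying many directions. This is a sketch under stated assumptions: the partial derivatives are the ones worked out by hand above, and the one-degree grid of directions is an arbitrary choice.

```python
import math


def grad_T(x, y):
    """Partial derivatives of T(x, y) = 100 exp(-x) cos(y), by hand."""
    Tx = -100 * math.exp(-x) * math.cos(y)
    Ty = -100 * math.exp(-x) * math.sin(y)
    return Tx, Ty


def rate(theta, x=0.0, y=0.0):
    """dT/ds at (x, y) in the direction theta."""
    Tx, Ty = grad_T(x, y)
    return Tx * math.cos(theta) + Ty * math.sin(theta)


# Try every whole-degree direction from -180 to 179 degrees.
angles = [math.radians(d) for d in range(-180, 180)]
coolest = min(angles, key=rate)

print(math.degrees(coolest), rate(coolest))
```

The scan picks the positive x-direction, with a rate of -100 ˚C/kilometre, in agreement with the solution above.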

**Example:** The temperature of a plate lying in 0 ≤ x ≤ 1, y ≥ 0 is given by $T(x,y) = \sin \pi x \, e^{-\pi y}$. In what direction, from $\left(\frac{1}{4}, \frac{1}{4}\right)$, is the temperature increasing most rapidly?

(The plate is shown in the diagram, together with the point.)

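**Solution:** If "s" measures distance in the $\theta$-direction, then $\frac{dT}{ds} = \frac{\partial T}{\partial x}\cos\theta + \frac{\partial T}{\partial y}\sin\theta$ gives

$\frac{dT}{ds} = (\pi\cos \pi x \, e^{-\pi y})\cos\theta + (-\pi\sin \pi x \, e^{-\pi y})\sin\theta = \frac{\pi}{\sqrt{2}}e^{-\pi/4}\cos\theta - \frac{\pi}{\sqrt{2}}e^{-\pi/4}\sin\theta = \frac{\pi}{\sqrt{2}}e^{-\pi/4}(\cos\theta - \sin\theta)$ at the point $\left(\frac{1}{4}, \frac{1}{4}\right)$. Our problem is to choose $\theta$ so that this rate of change is maximized. Since it's a function of a single variable, $\theta$, and since we can assume $\theta$ lies in the closed interval [-π,π] ... which clearly includes every possible direction ... we can just find the critical points of $(\cos\theta - \sin\theta)$ in this interval and evaluate this function there as well as at the endpoints $\theta = -\pi$ and $\theta = \pi$ and pick the largest of these values. We have:

$\frac{d}{d\theta}(\cos\theta - \sin\theta) = -\sin\theta - \cos\theta = 0$ when $\tan\theta = -1$ so $\theta = -\frac{\pi}{4}$ or $\theta = \frac{3\pi}{4}$ so we evaluate $(\cos\theta - \sin\theta)$ at $\theta = -\pi$, $\pi$, $-\frac{\pi}{4}$ and $\frac{3\pi}{4}$ and get the four values -1, -1, $\frac{2}{\sqrt{2}} = \sqrt{2}$ and $-\sqrt{2}$. The maximum rate of change then occurs in the direction $\theta = -\frac{\pi}{4}$, and $\frac{dT}{ds} = \pi e^{-\pi/4}$ _˚C/metre_ in this direction (assuming we've got the correct units for T, x and y).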
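The same "try every direction" idea gives a check on the plate example. This is a sketch only: it assumes the point is (1/4, 1/4), which is what the factor $e^{-\pi/4}$ and the condition tan θ = -1 in the solution imply, and it scans whole-degree directions.

```python
import math


def grad_T(x, y):
    """Partial derivatives of T(x, y) = sin(pi x) exp(-pi y), by hand."""
    Tx = math.pi * math.cos(math.pi * x) * math.exp(-math.pi * y)
    Ty = -math.pi * math.sin(math.pi * x) * math.exp(-math.pi * y)
    return Tx, Ty


def rate(theta, x=0.25, y=0.25):
    """dT/ds at (x, y) in the direction theta (assumed point: (1/4, 1/4))."""
    Tx, Ty = grad_T(x, y)
    return Tx * math.cos(theta) + Ty * math.sin(theta)


# Whole-degree directions covering the closed interval [-pi, pi].
angles = [math.radians(d) for d in range(-180, 181)]
warmest = max(angles, key=rate)

print(math.degrees(warmest), rate(warmest))
```

The scan lands on -45 degrees, that is θ = -π/4, with a rate of π e^(-π/4), roughly 1.43.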

Note the temperature variation if x stays fixed at $x = \frac{1}{4}$ ... or if y stays fixed at $y = \frac{1}{4}$.

Note, too, the surface in 3-D x-y-T space:

LECTURE 16

the GRADIENT

In earlier problems we calculated the rate of change of some function z = f(x,y) in a particular direction (first in just the x- and y-directions, then in any $\theta$-direction). This leads naturally to the question: "In what direction is $\frac{df}{ds}$ a maximum or a minimum?" This is actually of more than just passing interest:

• For a steady distribution of temperature (meaning it doesn't change with time), "heat" (measured, say, in _calories/metre²/second_ ) flows in the direction in which the temperature decreases most rapidly.

• The electric field (in an electrostatic environment) is in the direction in which the potential decreases most rapidly.

S: I haven't the foggiest idea of what you're talking about!

P: Does it matter? Just listen to the words ... and be impressed.

• The force of gravity acts in the direction of maximum rate of decrease in gravitational potential. (On the earth, it's downward!).

• The "best" climb up a mountain is in the direction of maximum rate of increase of elevation.

• The "best" skiing, down a mountain is in the direction of maximum rate of decrease of elevation (which is directly opposite the direction of maximum rate of _increase_!).

Now, how do we find this direction?

We digress for just a moment to discuss **vectors** (since they will be a convenient way of identifying a direction ... since vectors are good at pointing).

VECTORS

We consider vectors in 3-dimensional space: **V** = [V1, V2, V3] (where we'll use square brackets to distinguish vectors from points). We'll denote by | **V** | = $\sqrt{V_1^2 + V_2^2 + V_3^2}$ the length of the vector **V** ... and, when convenient, we'll write V as the length, dropping the **bold type**. Every vector has both a length and a direction (... except that we'd have some difficulty associating a direction with the "zero vector" [0,0,0] which has zero length). We will find of particular utility those vectors whose length is "1", called "unit vectors". One reason for this is that such vectors are a convenient way of identifying a particular direction in our 3-D space: just construct a unit vector in the appropriate direction and use it to identify that direction. How else would you indicate a direction?

S: How about north-east and south-by-south-west and so on ... or maybe the direction which makes an angle of 30˚ with such-and-such an axis or maybe ...

P: Okay, there are other ways, but you must admit that you do use vectors to indicate a direction. Somebody asks "Where?" and you point. It's not the length of your arm that's important, it's the direction of your arm. Can I go on?

S: Go ahead.

Suppose we were given a direction as "making angles $\alpha$, $\beta$ and $\gamma$ with the positive x-, y- and z-axes". What would be a vector pointing in this direction? We need a picture ===>>>

To find, say, the y-component of the vector **V** = [V1, V2, V3] ... with length V ... we drop a perpendicular from the tip of **V** to the y-axis and stare at the triangle formed. The side of the triangle which lies along the y-axis is the y-component and it's clearly $V\cos\beta$. Similarly the x- and z-components are $V\cos\alpha$ and $V\cos\gamma$ so we have **V** = $[V\cos\alpha, V\cos\beta, V\cos\gamma]$ and if we choose the length V = 1 we get a unit vector in the required direction: **u** = $[\cos\alpha, \cos\beta, \cos\gamma]$.

S: Wait a minute ... it'll only be a unit vector if its length is "1" and that means $\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1$ and maybe I didn't pick my angles that way. I mean, what if I choose angles like $\alpha$ = and $\beta$ = and $\gamma$ = then you get **u** = [cos , cos , cos ] = [, ,] and its length is ... uh, and, unless I'm mistaken, that's not "1". Right?

P: Right, and you've just discovered one of the mysteries of the universe. There's no such direction as one which makes $\alpha$ = and $\beta$ = and $\gamma$ = . In fact, if you pick $\alpha$ = and $\beta$ = then you have little choice; $\gamma$ must be $\frac{\pi}{2}$ or $-\frac{\pi}{2}$ or any angle that makes $\cos\gamma = 0$. See? $\cos^2\alpha + \cos^2\beta = 1$ so $\cos\gamma = 0$. These three angles are related. In fact, the three cosines must satisfy

Direction Cosines satisfy: $\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1$
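A small sketch of this relation: take any non-zero vector, divide each component by the length to get the three direction cosines, and add up their squares. The function name `direction_cosines`, the sample vector [1, 2, 2] and the "three equal angles of π/4" are illustrative assumptions, not values from the discussion above.

```python
import math


def direction_cosines(v):
    """Cosines of the angles a vector makes with the coordinate axes:
    each component divided by the vector's length."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0:
        raise ValueError("the zero vector has no direction")
    return [c / length for c in v]


cosines = direction_cosines([1, 2, 2])   # length 3, so 1/3, 2/3, 2/3
total = sum(c * c for c in cosines)      # 1, up to rounding

# A hypothetical choice of three equal angles of pi/4 fails the test,
# so no direction makes those three angles with the axes.
bad_total = 3 * math.cos(math.pi / 4) ** 2   # 1.5, not 1

print(cosines, total, bad_total)
```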

S: Direction cosines?

P: Uh ... sorry, that's what they're called, but you've seen them before. When we were talking about parametric equations for a line in the x-y plane, we said:

$$x = x_0 + s\cos\alpha, \qquad y = y_0 + s\cos\beta$$

See? Direction cosines! In fact, we also noted that, in 2-dimensions, $\beta = \frac{\pi}{2} - \alpha$ so that $\cos\beta = \sin\alpha$ and that means that $\cos^2\alpha + \cos^2\beta = \cos^2\alpha + \sin^2\alpha = 1$. See? The vector **u** = $[\cos\alpha, \cos\beta]$, in the x-y plane, is a unit vector. In fact, since **V** = $[x - x_0, y - y_0]$ is the vector which goes from $(x_0, y_0)$ to (x,y) then, using the above parametric equations we have **V** = $[s\cos\alpha, s\cos\beta]$ where "s" is the distance between $(x_0,y_0)$ and (x,y) hence it's the length of **V** so if s = 1 we'd have a vector of unit length. See how everything hangs together?

S: And comes back to haunt you. I have a question: what's a gradient?

P: Hmmm. Good question. How did you know about gradients?

S: That's the title of this lecture.

For the function f(x,y), the vector _grad_ f = $\left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]$ is called the gradient vector.

The gradient vector (sometimes denoted by $\nabla f$) has a nice interpretation, but before we say what it is we'll do some examples:

**Example:** Compute the gradient of the given function at the given point:

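(a) $f(x,y) = x^2y$ at (3,7)

(b) $f(x,y) = (x-1)^2+(y-2)^2$ at (0,0)

(c) $f(x,y) = \sin \pi x \, e^{-\pi y}$ at $\left(\frac{1}{4}, \frac{1}{4}\right)$

(d) $f(x,y) = 100e^{-x}\cos y$ at (0,0)

**Solution:** In each case we use _grad_ f = $\left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]$:

(a) _grad_ f = $[2xy, x^2]$ = [42, 9] at (3,7).

(b) _grad_ f = $[2(x-1), 2(y-2)]$ = [-2, -4] at (0,0).

(c) _grad_ f = $[\pi\cos \pi x \, e^{-\pi y}, -\pi\sin \pi x \, e^{-\pi y}]$ = $\left[\frac{\pi}{\sqrt{2}}e^{-\pi/4}, -\frac{\pi}{\sqrt{2}}e^{-\pi/4}\right]$ at $\left(\frac{1}{4}, \frac{1}{4}\right)$.

(d) _grad_ f = $[-100e^{-x}\cos y, -100e^{-x}\sin y]$ = [-100, 0] at (0,0).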

All these come from examples considered earlier:

• In (a) we have the rate of change of volume of a square box of side 3 and height 7 when either the side length or height changes ... all wrapped up in a single vector _grad_ f = [42,9]!

• In (b) we have the rate of change of elevation of a mountain (if we're sitting at (0,0)) when either x or y changes. Note that this vector, _grad_ f = [-2, -4], points in a direction with slope $\frac{-4}{-2} = 2$ and that's the direction (as we saw earlier) which gives the maximum rate of increase of elevation. Is that an accident?

• In (c) we have the rate of change of temperature in a plate, at the point $\left(\frac{1}{4}, \frac{1}{4}\right)$, and we note that _grad_ f = $\left[\frac{\pi}{\sqrt{2}}e^{-\pi/4}, -\frac{\pi}{\sqrt{2}}e^{-\pi/4}\right]$ points "south-east", the direction making an angle $\theta = -\frac{\pi}{4}$ with the positive x-direction and that's the direction (as we saw earlier) which gives the maximum rate of increase of temperature. Is that an accident?

• In (d) we have the rate of change of temperature at (0,0) and we note that _grad_ f = [-100,0] points due west (in the _negative_ x-direction) whereas the temperature decreases most rapidly in the _positive_ x-direction. Is that an accident?

As you might expect, these are not accidents: _grad_ f = $\left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]$, when evaluated at some point x = a, y = b, gives a vector which points in the direction in which f(x,y) increases most rapidly! (Or, to put it differently, _grad_ f points opposite to the direction in which f(x,y) decreases most rapidly ... as in (d), above.)
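One more example, done by machine, for the function in (a). This is a numerical sketch and not a proof: it scans directions in steps of a tenth of a degree (an arbitrary choice), keeps the one with the largest directional derivative, and compares it with the direction in which _grad_ f points.

```python
import math


def grad_f(x, y):
    """Gradient of f(x, y) = x**2 * y, by hand: [2xy, x**2]."""
    return (2 * x * y, x**2)


def rate(theta, x, y):
    """Directional derivative of f at (x, y) in the direction theta."""
    fx, fy = grad_f(x, y)
    return fx * math.cos(theta) + fy * math.sin(theta)


# Scan directions from -180 degrees up to 179.9 degrees.
angles = [math.radians(d / 10) for d in range(-1800, 1800)]
best = max(angles, key=lambda th: rate(th, 3, 7))

fx, fy = grad_f(3, 7)
grad_angle = math.atan2(fy, fx)   # angle of the vector [42, 9]

print(math.degrees(best), math.degrees(grad_angle))
```

The best direction found by the scan agrees with the direction of [42, 9] to within the spacing of the grid.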

**S:** Hold on! Is that some kind of proof? I mean, do a couple of examples prove anything? I mean ...

**P:** No, of course not, but I can prove it. Pay attention. I have to talk a little more about vectors:

A little More About Vectors

Let's talk about 2-D vectors, **U** = [U1, U2] and **V** = [V1, V2] etc. Every such vector has both a length and a direction. The length of **U** , denoted by | **U** |, is $\sqrt{U_1^2 + U_2^2}$. To get the direction of **U** we could take the ratio $\frac{\text{y-component}}{\text{x-component}} = \frac{U_2}{U_1}$ and that would give us the slope of this vector.

We can also add vectors: The SUM of these two vectors is **U** + **V** = [U1+V1, U2+V2]. Geometrically, we SUM them by sliding **V** (while maintaining its length and direction!) so its "tail" touches the "head" of **U**. This is _vector addition_ and the SUM is the vector which goes from the "tail" of **U** to the "head" of **V**. To get its length requires the cosine law (assuming we know the lengths and directions of **U** and **V** so we know two sides and the contained angle for the triangle whose third side is the length of **U** + **V** ).

We can also subtract vectors: the DIFFERENCE is **U** - **V** = [U1-V1, U2-V2]. Geometrically, we SUBTRACT them by (1) reversing the direction of **V** (while maintaining its length!), and that gives - **V** , then (2) finding the SUM of **U** and (- **V** ) ... thereby reducing the problem to one we've already considered!

There is a natural question to ask: _Can we multiply vectors?_ The answer is, of course, yes.

S: Of course? Why "of course"? It's not obvious how ...

P: We can define what we mean by "multiplying" U and V in any way we wish. I could, for example, say that the PRODUCT U V is the vector [U1V2, U2V1] ... i.e. [1st component times 2nd, 2nd component times 1st]. Why not? You can invent a product yourself if you'd like. Who can argue? It's a definition, right?

S: Is that the definition of the product?

P: It's not the one I want to talk about. It does, however, have the nice property that if I multiply any vector by the zero vector, [0,0], the result is the zero vector. However, if I want to add U and V then multiply by W = [W1, W2] and I write this as W (U + V) which is W multiplied by [U1+V1, U2+V2] and I use my invented "multiplication rule", I'd get W (U + V) = [W1 (U2+V2), W2(U1+V1)]. On the other hand, if I multiplied U + V by W, writing it as (U + V) W, I'd get (again using my "invention"): [(U1+V1)W2, (U2+V2) W1]. They're not the same ... but that's okay. What is most bothersome about this invented "multiplication" is that ... can you see what's bothersome?

S: It doesn't bother me.

P: I have no idea how to multiply vectors in 3-dimensional space. You see, this recipe doesn't generalize easily. And there are other things I don't like about it. I have no idea if it's useful, and ...

S: Why don't you just tell me what you do like.

the DOT Product

The DOT product between two vectors **U** = [U1, U2] and **V** = [V1, V2] is defined by U•V = U1V1 \+ U2V2. Let's make a fuss about this:

U•V = U1V1 \+ U2V2 is the DOT product between vectors U= [U1, U2] and V = [V1, V2]

**Example:** Calculate the dot product between the following vectors:

(a) **U** = [2, -1] and **V** = [0, 5]

(b) **U** = [2,5,-1] and **W** = [-5,0,1]

(c) **P** = [5,1,1] and **Q** = [-1,4,1]

Solution:

(a) **U** • **V** = [2, -1]•[0, 5] = (2)(0) + (-1)(5) = -5

(b) **U•W** = [2,5,-1]•[-5,0,1] = (2)(-5) \+ (5)(0) + (-1)(1) = -11

(c) **P•Q** = [5,1,1]•[-1,4,1] = (5)(-1) + (1)(4) + (1)(1) = 0
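The "multiply pair-wise and add" recipe is short enough to check by machine. The following is a minimal Python sketch; the function name `dot` is just an illustrative choice, and part (b) uses **W** = [-5,0,1], the vector that appears in the solution.

```python
def dot(u, v):
    """DOT product: multiply the components pair-wise and add up the products."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(u, v))

print(dot([2, -1], [0, 5]))          # part (a)
print(dot([2, 5, -1], [-5, 0, 1]))   # part (b)
print(dot([5, 1, 1], [-1, 4, 1]))    # part (c)
```

The same function handles 2-D and 3-D vectors, which is the sense in which this product generalizes so easily.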

S: But the dot product isn't even a vector, right? Doesn't it have to be a vector? I mean ...

P: You're right, it's NOT a vector ... it's a SCALAR (i.e. a number) and for that reason it's sometimes called the SCALAR PRODUCT. And since it's a definition we don't have to justify it ... except we'd look pretty foolish if it were useless. Another thing: notice how it generalizes so nicely to 3-D vectors (as in (b) and (c), above)? Just multiply the components, pair-wise, and add up all these products.

S: Why call it the DOT product?

P: Didn't you see the big DOT? Anyway, now I have to show it's important in the sense that this "product" actually occurs in meaningful problems. In particular I want to answer the question: "Does the gradient vector grad f(x,y) give the direction in which f(x,y) increases most rapidly?"

Earlier we mentioned that the length of the sum vector **U** + **V** could be obtained as the length of a side of a triangle, using the cosine law. We'll do that now and something wonderful will happen:

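Suppose $\theta$ is the angle between **U** and **V**, as shown. When we slide **V** so its tail touches the head of **U**, the angle contained between these two sides of the triangle is $\pi - \theta$, so the cosine law gives the length of the sum vector **U** + **V** as $|\mathbf{U}+\mathbf{V}|^2 = |\mathbf{U}|^2 + |\mathbf{V}|^2 - 2\,|\mathbf{U}|\,|\mathbf{V}|\cos(\pi - \theta) = |\mathbf{U}|^2 + |\mathbf{V}|^2 + 2\,|\mathbf{U}|\,|\mathbf{V}|\cos\theta$. Now we insert the lengths of **U**, **V** and **U** + **V** and get:

$(U_1+V_1)^2 + (U_2+V_2)^2 = (U_1^2 + U_2^2) + (V_1^2 + V_2^2) + 2\,|\mathbf{U}|\,|\mathbf{V}|\cos\theta$, then square the terms on the left-side and cancel and get, finally: $U_1V_1 + U_2V_2 = |\mathbf{U}|\,|\mathbf{V}|\cos\theta$. This is quite remarkable. The left-side is just the DOT product, so we have: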

$\mathbf{U}\cdot\mathbf{V} = U_1V_1 + U_2V_2 = |\mathbf{U}|\,|\mathbf{V}|\cos\theta$

where $\theta$ is the angle between vectors U = [U1, U2] and V = [V1, V2]

In words, the DOT product of two vectors is a number which can be obtained _either_ by multiplying their components, pair-wise, and adding OR by multiplying the product of their lengths by the cosine of the angle between them. If you know U1, U2 and V1, V2 then you can compute the DOT product and both lengths, hence the angle between these two vectors.

**Example:** Calculate the angle between the following vectors:

(a) **U** = [2, -1] and **V** = [0, 5]

(b) **U** = [2,5,-1] and **W** = [-5,0,1]

(c) **P** = [5,1,1] and **Q** = [-1,4,1]

Solution:

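(a) **U** • **V** = -5 and | **U** | = $\sqrt{2^2 + (-1)^2} = \sqrt{5}$ and | **V** | = $\sqrt{0^2 + 5^2} = 5$ and **U** • **V** = | **U** | | **V** | $\cos\theta$ so we have $\cos\theta = -\frac{5}{5\sqrt{5}} = -\frac{1}{\sqrt{5}}$ hence $\theta = \arccos\left(-\frac{1}{\sqrt{5}}\right)$, where we choose the angle in $0 \le \theta \le \pi$. In fact, since $\cos\theta \le 0$, then $\frac{\pi}{2} \le \theta \le \pi$.

(b) **U•W** = -11 and | **U** | = $\sqrt{2^2 + 5^2 + (-1)^2} = \sqrt{30}$ and | **W** | = $\sqrt{(-5)^2 + 0^2 + 1^2} = \sqrt{26}$ and **U** • **W** = | **U** | | **W** | $\cos\theta$ so we have $\cos\theta = -\frac{11}{\sqrt{30}\sqrt{26}}$ hence $\theta = \arccos\left(-\frac{11}{\sqrt{780}}\right)$ where, since $\cos\theta \le 0$, then $\frac{\pi}{2} \le \theta \le \pi$.

(c) **P•Q** = [5,1,1]•[-1,4,1] = (5)(-1) + (1)(4) + (1)(1) = 0 and since **P•Q** = | **P** | | **Q** | $\cos\theta$ where $\theta$ is the angle between **P** and **Q**, then $\cos\theta$ must be zero, so $\theta = \frac{\pi}{2}$ and we conclude that **P** and **Q** are perpendicular to each other!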
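For readers who want decimal values for these angles, here is a small Python sketch of the same recipe: divide the DOT product by the product of the lengths and take the arccos. The helper names are illustrative, and the function assumes neither vector has zero length.

```python
import math

def dot(u, v):
    """Multiply components pair-wise and add."""
    return sum(a * b for a, b in zip(u, v))

def length(u):
    """Length of a vector: the square root of u . u."""
    return math.sqrt(dot(u, u))

def angle_between(u, v):
    """Angle (radians, between 0 and pi) from cos(theta) = u.v / (|u| |v|).

    Assumes |u| and |v| are both nonzero.
    """
    c = dot(u, v) / (length(u) * length(v))
    c = max(-1.0, min(1.0, c))  # guard against rounding slightly outside [-1, 1]
    return math.acos(c)

for name, (u, v) in {
    "(a)": ([2, -1], [0, 5]),
    "(b)": ([2, 5, -1], [-5, 0, 1]),
    "(c)": ([5, 1, 1], [-1, 4, 1]),
}.items():
    theta = angle_between(u, v)
    print(name, "theta =", theta, "radians =", math.degrees(theta), "degrees")
```

`math.acos` returns values between 0 and π, which matches the choice of angle made in the solution.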

S: Is that a way to tell if two vectors are perpendicular?

P: Yes. If P•Q = 0 then they must be perpendicular.

S: But what if P = [0, 0]?

P: Oh ... yes, that's ... uh, you're quite right. In that case P•Q = 0 too. Well, if neither P nor Q has zero length (i.e. |P| ≠ 0 and |Q| ≠ 0) then they'll be perpendicular if P•Q = 0.

Whenever we see an expression like A B + C D we can imagine two vectors [A, C] and [B, D] and we can interpret A B + C D as their DOT product and even find the angle between them.

S: Is that useful?

P: Pay attention:

The directional derivative (i.e. the rate of change of f(x,y) in the direction $\alpha$) is $\frac{\partial f}{\partial x}\cos\alpha + \frac{\partial f}{\partial y}\sin\alpha$ and this has the form A B + C D so we can write it as a DOT product between two vectors: $\left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right] \cdot [\cos\alpha, \sin\alpha]$. We recognize the first of these vectors as none other than _grad_ f. And the second? It's just a unit vector in the $\alpha$-direction!

grad f • u is the directional derivative in the direction of the unit vector u

P: What does that say to you, about the direction of maximum increase of f?

S: Huh?

P: Can't you see? You're standing at some place and you calculate $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ at that place. That gives a vector, grad f = $\left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]$ and this vector points in some direction. Now you want to determine how quickly f changes in some other direction, namely the direction in which u = $[\cos\alpha, \sin\alpha]$ points. You just take the DOT product grad f • u and that gives you the rate of change in the $\alpha$-direction. So now you vary this $\alpha$-direction (i.e. the direction of the unit vector u), computing grad f • u for each new $\alpha$-direction. Remember that grad f = $\left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]$ isn't changing. It just points in some fixed direction, waiting patiently for you to rotate u and compute grad f • u again and again until you've found the largest rate of change (i.e. the largest value of grad f • u). So? In what direction should you move to achieve this greatest rate of change in f?

S: I'd say you should move in the direction of grad f?

P: Excellent! Why do you say that?

S: Well ... it's just sitting there, pointing ... and besides, it's the only direction I can think of.

We've been able to generate various expressions for the rate of change of f in a given direction. It's the most recent expression that answers the question: _"Does the gradient vector grad f give the direction in which f increases most rapidly?"_. We string out these expressions and look carefully at the last. It says something exciting:

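$\frac{\partial f}{\partial s} = \frac{\partial f}{\partial x}\cos\alpha + \frac{\partial f}{\partial y}\sin\alpha = \left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right] \cdot [\cos\alpha, \sin\alpha]$

= _grad_ f • **u** = | _grad_ f | | **u** | $\cos\theta$ = $\sqrt{\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}\,\cos\theta$ (since | **u** | = 1)

Clearly we obtain the maximum value of $\frac{\partial f}{\partial s}$ by choosing the maximum value of $\cos\theta$, which means we should move in the direction $\theta = 0$, which means the angle between **u** and _grad_ f should be zero, which means **u** should be in the direction of _grad_ f.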

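S: Hold on! To get a larger rate of change you just make $\sqrt{\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}$ bigger, right?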

P: Wrong. You're standing at some place and you evaluate grad f at that place, meaning you calculate the partial derivatives at that place, and they're just numbers and they give you a fixed gradient vector which points in a fixed direction. I said that already! NOW you vary u and that means you're varying only $\theta$ which means the rate of change is greatest when ...

S: Okay, okay ... I get it. And I assume that once you move a little ways in the direction of grad f then you have to do this all over again, I mean compute another grad f and then move in that direction and so on and so on.

P: Yes, if you insist upon moving in the direction where f increases most rapidly.

S: But we've done that before, right? It sounds familiar. We were on the side of a mountain as I recall ...

**Example:** You are standing on the side of a mountain whose elevation is given by $z = 95 - x^2 - y^2 + 2x + 4y$ metres, where x = 0, y = 0 is your location, so z = 95 is your elevation. Sketch the level curves in your neighbourhood and determine in what direction you should climb so as to increase your elevation most rapidly.

Solution:

The direction of maximum increase of $z = 95 - x^2 - y^2 + 2x + 4y$ is just the direction of the vector _grad_ z = $\left[\frac{\partial z}{\partial x}, \frac{\partial z}{\partial y}\right]$ = [-2x+2, -2y+4] which, at (0,0), is [2,4]. The "slope" of this vector is $\frac{4}{2}$ = 2, just as we obtained earlier when we took the direction to be perpendicular to the LEVEL CURVE through (0,0).

Earlier, we did another problem where the LEVEL CURVES were given by $(x-1)^2 + 2(y-2)^2 = 100 - C$ (so they were ellipses rather than circles, and it wasn't so obvious what direction gave the maximum increase in elevation). Nevertheless, this magic direction is still the direction of _grad_ f where $f(x,y) = 100 - (x-1)^2 - 2(y-2)^2$. Actually, we could take $f(x,y) = -x^2 + 2x - 2y^2 + 8y$ or $f(x,y) = -x^2 + 2x - 2y^2 + 8y + 15$ etc. since adding a constant to f(x,y) doesn't change its partial derivatives and _grad_ f depends only upon these partials! Anyway, _grad_ f = [-2x+2, -4y+8] = [2,8] at (0,0) and this gives the direction of maximum increase: the "slope" of _grad_ f is $\frac{8}{2}$ = 4 (as we obtained earlier when we took the direction to be perpendicular to the LEVEL CURVES).
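The "rotate **u** and compute _grad_ f • **u** again and again" idea can be tried quite literally. The Python sketch below is only a brute-force illustration (the function names and the choice of 3600 trial directions are arbitrary assumptions): it scans equally spaced directions, keeps the one with the largest directional derivative, and compares it with the direction and length of the gradient [2,4] from the mountain example.

```python
import math

def directional_derivative(grad, alpha):
    """Rate of change in the direction alpha: grad f . [cos(alpha), sin(alpha)]."""
    return grad[0] * math.cos(alpha) + grad[1] * math.sin(alpha)

def steepest_direction(grad, steps=3600):
    """Try `steps` equally spaced directions and keep the one with the largest rate."""
    angles = [2 * math.pi * k / steps for k in range(steps)]
    best = max(angles, key=lambda a: directional_derivative(grad, a))
    return best, directional_derivative(grad, best)

grad_z = (2.0, 4.0)  # grad z at (0,0) for z = 95 - x^2 - y^2 + 2x + 4y
alpha, rate = steepest_direction(grad_z)
print("best direction found (radians):", alpha)
print("direction of grad z (radians) :", math.atan2(grad_z[1], grad_z[0]))
print("largest rate found            :", rate)
print("length of grad z              :", math.hypot(*grad_z))
```

Calling `steepest_direction((2.0, 8.0))` does the same for the example with elliptical level curves; up to the resolution of the scan, the best direction has slope 4.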

S: Are you saying that grad f is always perpendicular to the level curves?

P: You got it. At every point P(a,b) the gradient vector is perpendicular to that level curve, f(x,y) = constant, which passes through P, namely f(x,y) = f(a,b). Let's make a note of that:

The gradient vector grad f = $\left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]$ at a point P(a,b)

is perpendicular to f(x,y) = f(a,b), the level curve through P

LECTURE 17

more on the GRADIENT, and the CHAIN RULES

Let's recap:

• Given a function of two variables, f(x,y), the LEVEL CURVES are f(x,y) = C where C is a constant.

• The level curve through a given point P(a,b) has C = f(a,b), so is f(x,y) = f(a,b).

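• The rate of change of f(x,y) in the x-direction is $\frac{\partial f}{\partial x}$, and $\frac{\partial f}{\partial y}$ is the rate of change in the y-direction.

• At each point P(a,b) the gradient vector is defined as: $\nabla f$ = _grad_ f(a,b) = $\left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]$, with both partial derivatives evaluated at (a,b).

• The rate of change of f(x,y) in the direction of the unit vector **u** = $[\cos\alpha, \sin\alpha]$, namely the directional derivative (denoted by $\frac{\partial f}{\partial s}$), is given by either _grad_ f(a,b) • **u** = $\frac{\partial f}{\partial x}\cos\alpha + \frac{\partial f}{\partial y}\sin\alpha$ or, equivalently, by | _grad_ f(a,b) | $\cos\theta$, where $\theta$ is the angle between _grad_ f(a,b) and **u**.

• The direction of _grad_ f(a,b) is perpendicular to the level curve through P(a,b) (i.e. perpendicular to f(x,y) = f(a,b)).

• The maximum rate of change of f(x,y), at the point P(a,b), is in the direction of _grad_ f(a,b) (since this maximum rate of change occurs when $\cos\theta = 1$, or $\theta = 0$, meaning **u** points in the direction of _grad_ f(a,b)).

• The magnitude of this maximum rate of change is | _grad_ f(a,b) | = $\sqrt{\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}$.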

S: Huh? Where did that come from? I mean, we never talked about what the maximum rate was ... just how to move to achieve it.

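P: Since $\frac{\partial f}{\partial s} = \sqrt{\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}\,\cos\theta$ and the maximum occurs when $\theta = 0$, guess what this maximum is?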

S: So grad f not only gives the direction but also the size of this maximum increase, right?

P: Right! Since grad f is a vector it has both length and direction. The length provides the maximum rate of change and the direction provides the direction in which this maximum increase takes place. Nice, eh?

S: Mamma mia! You sure get your money's worth from good ol' grad f, right? When we were standing on the side of the mountain, $z = 95 - x^2 - y^2 + 2x + 4y$ and grad z = [-2x+2, -2y+4] = [2,4] at P(0,0) so the maximum rate of increase was $\sqrt{2^2 + 4^2} = \sqrt{20}$, right?

P: In metres/metre.

S: And when the level curves were ellipses, like $f(x,y) = 100 - (x-1)^2 - 2(y-2)^2$ = constant, then grad f = [2,8] at P(0,0) so the maximum change was $\sqrt{2^2 + 8^2} = \sqrt{68}$.

P: The maximum rate of change was $\sqrt{68}$ metres/metre. Right.

S: I think the gradient vector is probably pretty useful, right?

• If T(x,y,z) is the temperature at P(x,y,z) where x, y and z might be the latitude, longitude and elevation of the point P, the heat flows in the direction of maximum decrease in T(x,y,z), namely in the direction of - _grad_ T. In fact, if **H** is the heat flow vector (measured in, say, _calories/metre²/second_) then **H** = - k _grad_ T where k depends upon the conductivity of the medium in which the heat is flowing.

• If V(x,y,z) is the voltage potential at P, then **E** , the electric field, is in the direction - _grad_ V.

In fact, **E** = - _grad_ V.

• If U(x,y,z) is the gravitational potential at P(x,y,z), then **F**, the force of gravity, acts in the direction - _grad_ U.

In fact, **F** = - _grad_ U.

S: I don't know anything about that potential stuff ...

P: It doesn't matter. Just be impressed with the usefulness of the gradient vector.

**Example:** The distribution of a certain type of plant is given by its density: $P(x,y) = 100 \sin^2 x\, e^{-3y}$ plants/km², where x and y measure distance, in kilometres, from some origin. Compute:

(a) the rate of change of P(x,y) at x = 2, y = 1, in both the x- and y-directions,

(b) the magnitude of the maximum rate of increase of plants at (2,1), and

(c) the direction in which this maximum increase occurs.

Solution:

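(a) _grad_ P = $\left[\frac{\partial P}{\partial x}, \frac{\partial P}{\partial y}\right]$ = $[200 \sin x \cos x\, e^{-3y},\ -300 \sin^2 x\, e^{-3y}]$ = $[200 \sin 2 \cos 2\, e^{-3},\ -300 \sin^2 2\, e^{-3}]$ at (2,1). We have $\frac{\partial P}{\partial x} = 200 \sin 2 \cos 2\, e^{-3}$ and $\frac{\partial P}{\partial y} = -300 \sin^2 2\, e^{-3}$ at (2,1).

(b) The length of _grad_ P is $\sqrt{\left(\frac{\partial P}{\partial x}\right)^2 + \left(\frac{\partial P}{\partial y}\right)^2}$ and this is the maximum rate of increase (measured in _plants/km²/km_).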

(c) The direction of _grad_ P gives the direction in which this maximum increase takes place.

S: But what are those numbers ... and that angle?

P: I'll let Maple do it. I'll define P, then compute Px = $\frac{\partial P}{\partial x}$ and Py = $\frac{\partial P}{\partial y}$ using the Maple commands diff(P,x) and diff(P,y), then I'll substitute x = 2 and y = 1 and evaluate as a floating (decimal) number (using the subs and evalf commands), then I'll take the length of the gradient vector as $\sqrt{Px^2 + Py^2}$ and that'll give me the maximum rate of change, then I'll compute Py/Px and that'll give me the tangent of the angle, then I'll use the arctan function to get the angle, but Maple is pretty smart and knows that the range of arctan is $-\frac{\pi}{2}$ to $\frac{\pi}{2}$ so it'll give me the angle in that range but I notice that my gradient points into the third quadrant so I'll just add π (which Maple calls Pi) and I'll ask Maple to evalf this angle:

______________________________________________________________________________

• P:=100*sin(x)^2*exp(-3*y);

P := 100 sin(x)^2 exp(- 3 y)

_______________________________________________________________________________

• Px:=diff(P,x);

Px := 200 sin(x) exp(- 3 y) cos(x)

_______________________________________________________________________________

• Py:=diff(P,y);

Py := - 300 sin(x)^2 exp(- 3 y)

_____________________________________________________________________________

• Px:=evalf(subs(x=2,y=1,Px));

Px := -3.767897758

_____________________________________________________________________________

• Py:=evalf(subs(x=2,y=1,Py));

Py := -12.34951020

_______________________________________________________________________________

• MaxRate:=sqrt(Px^2+Py^2);

MaxRate := 12.91152414

_______________________________________________________________________________

• TanTheta:= Py/Px;
TanTheta := 3.277559794

_____________________________________________________________________________

• Theta:=arctan(TanTheta);

Theta := 1.274662618

_____________________________________________________________________________

• Theta:=evalf(Theta+Pi);

Theta := 4.416255272

_______________________________________________________________________________

S: That angle isn't in the 3rd quadrant, it's ...

P: That angle IS in the 3rd quadrant. Remember, it's in RADIANS. In degrees it'd be about $4.4 \cdot \frac{180}{\pi}$, or about 252˚.

S: How did Maple know to give it to you in radians?

P: Maple has a PhD in mathematics.
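For readers without Maple, the same numbers can be reproduced with a few lines of Python. This is only a sketch of the calculation above, not a translation of the Maple session: the partial derivatives are typed in by hand from part (a), and `math.atan2` is used in place of "arctan, then add π" because it already takes the quadrant into account.

```python
import math

def grad_P(x, y):
    """Partial derivatives of P(x,y) = 100 sin^2(x) exp(-3y), from part (a)."""
    px = 200 * math.sin(x) * math.cos(x) * math.exp(-3 * y)
    py = -300 * math.sin(x) ** 2 * math.exp(-3 * y)
    return px, py

px, py = grad_P(2.0, 1.0)
max_rate = math.hypot(px, py)                  # length of grad P
theta = math.atan2(py, px) % (2 * math.pi)     # direction of grad P, in 0 <= theta < 2 pi

print("Px      =", px)
print("Py      =", py)
print("MaxRate =", max_rate)
print("Theta   =", theta, "radians, or", math.degrees(theta), "degrees")
```

The printed values should agree with the Maple output to the digits shown there.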

S: Could you plot a few level curves. I'd like to see that grad P really is perpendicular.

P: Be glad to. I'll have to plot $100 \sin^2 x\, e^{-3y} = C$ for various values of the constant C and that means $e^{3y} = \frac{100}{C}\sin^2 x$ and that means $3y = \ln\left(\frac{100}{C}\sin^2 x\right) = \ln\frac{100}{C} + \ln \sin^2 x$ and that means I can plot $y = K + \frac{1}{3}\ln \sin^2 x$ for various values of the constant K. I'd better avoid places where sin x = 0, of course (because of the logarithm), but we want the level curve which passes through x = 2, y = 1 and that means choosing $1 = K + \frac{1}{3}\ln \sin^2 2$ so we'll just choose K-values near this. Here's a plot of a few level curves. Notice that the gradient vector at (2,1) is indeed perpendicular to the level curve through (2,1).

S: And look at how the level curves head off to infinity as x approaches 0 or π ... that's where the sin x = 0 so the ln blows up, right?

P: Yes indeed.

S: But weren't we talking about plants? Are you saying that the density of plants becomes infinite as x approaches π kilometres? Is that reasonable?

P: No, no. Everywhere along the level curve through (2,1) there is a constant density, namely P(2,1). In fact the equation of this level curve is P(x,y) = P(2,1) or $100 \sin^2 x\, e^{-3y} = 100 \sin^2 2\, e^{-3} \approx 4.1$ so that's the density and what the graph shows is that if you want to maintain the same constant density of 4.1 plants/km² you'd have to follow this level curve and that means heading due south as you approach x = π kilometres.

S: So nothing is really infinite, right? But it sure looks that way, right?

P: If that level curve were a picture of a highway taken from an airplane, would you see any infinities? No.

S: One other thing. What happens beyond x = π or maybe x < 0. Are there no plants there?

P: You figure it out.

S: Well ... I'd look at the level curves $100 \sin^2 x\, e^{-3y} = C$ or, as you put it, $y = K + \frac{1}{3}\ln \sin^2 x$, and I'd see that ... uh ... $\sin^2 x$ repeats itself so the curve repeats itself so you've just plotted the level curves in 0 < x < π but there would be others and they'd look the same except they'd be shifted right or left by π kilometres so they'd look something like this ∩ ∩ ∩ ∩ ∩ ... how'm I doin' boss?

P: Have you taken this course before?
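The two claims in this discussion (the density stays constant along the curve $y = K + \frac{1}{3}\ln \sin^2 x$, and _grad_ P at (2,1) is perpendicular to that curve) can be checked numerically. The Python sketch below is one way to do it; the function names and the sample x-values are illustrative choices, and the tangent direction [1, dy/dx] with $dy/dx = \frac{2}{3}\cot x$ comes from differentiating the level curve.

```python
import math

def density(x, y):
    """P(x,y) = 100 sin^2(x) exp(-3y), in plants per square kilometre."""
    return 100 * math.sin(x) ** 2 * math.exp(-3 * y)

def level_curve_y(x, K):
    """y = K + (1/3) ln(sin^2 x); undefined where sin x = 0."""
    return K + math.log(math.sin(x) ** 2) / 3

# Choose K so that the curve passes through (2, 1).
K = 1 - math.log(math.sin(2) ** 2) / 3
target = density(2, 1)

for x in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0):
    y = level_curve_y(x, K)
    print(f"x = {x:3.1f}   y = {y:8.4f}   density = {density(x, y):.6f}")

# Gradient at (2,1) and the tangent direction [1, dy/dx] of the level curve there.
px = 200 * math.sin(2) * math.cos(2) * math.exp(-3)
py = -300 * math.sin(2) ** 2 * math.exp(-3)
slope = (2 / 3) * math.cos(2) / math.sin(2)
print("grad P . tangent =", px * 1 + py * slope)
```

The DOT product in the last line vanishes (up to rounding), which is the perpendicularity test from the previous lecture.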

The CHAIN RULES

It is tempting to repeat everything that we did for functions f(x) of a single variable, generalizing to functions of two (or more) variables. Many of these recipes generalize quite easily (... we've already seen parametric equations of a line in 3-D, and first and second derivatives, and we've even managed to get derivatives in a general direction, a problem which didn't even arise in single-variable calculus!).

Now it's time to generalize the Chain Rule!

Alas, it's not clear what we should consider. Should we consider z = f(x,y) and each of x and y are functions of two _other_ variables? Or maybe z = f(x,y) and each of x and y is a function of just a single _other_ variable. Or perhaps u = f(z) where z is a function of two variables, such as z = f(x,y) !?^%$*

Let's do the last, first:

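**Example:** If u = sin z and $z = x^2 + y^2$ (so that u is _indirectly_ a function of two variables x and y), then compute $\frac{\partial u}{\partial x}$ and $\frac{\partial u}{\partial y}$.

**Solution:** We can clearly just substitute for z and get $u = \sin(x^2 + y^2)$, hence $\frac{\partial u}{\partial x} = \cos(x^2 + y^2)\, 2x$ and $\frac{\partial u}{\partial y} = \cos(x^2 + y^2)\, 2y$ ... and we're finished. However, we'd like to identify the "Rule", so we notice that the factor $\cos(x^2 + y^2)$ is just cos z which is just $\frac{du}{dz}$ and the two pieces 2x and 2y are just $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ respectively. This means $\frac{\partial u}{\partial x} = \frac{du}{dz}\frac{\partial z}{\partial x}$ and $\frac{\partial u}{\partial y} = \frac{du}{dz}\frac{\partial z}{\partial y}$ and that's the "Rule". Notice that we needn't write $\frac{\partial u}{\partial z}$ because u is a function of the single variable z and we need only differentiate using our familiar single-variable calculus.

We make a fuss about this new Chain Rule:

If u = f(z) and z = z(x,y) then $\frac{\partial u}{\partial x} = \frac{du}{dz}\frac{\partial z}{\partial x}$ and $\frac{\partial u}{\partial y} = \frac{du}{dz}\frac{\partial z}{\partial y}$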
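One way to gain some confidence in this rule is to compare it with a direct numerical estimate of the partial derivatives. The Python sketch below does that for u = sin z, $z = x^2 + y^2$; the test point (0.7, -0.4), the step size h and the helper names are arbitrary assumptions, and the central-difference quotient is only an approximation to the derivative.

```python
import math

def z(x, y):
    return x * x + y * y

def u(x, y):
    return math.sin(z(x, y))

def partial_x(f, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative of f with respect to x."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative of f with respect to y."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

x0, y0 = 0.7, -0.4
du_dz = math.cos(z(x0, y0))   # du/dz = cos z
chain_x = du_dz * 2 * x0      # (du/dz)(dz/dx), with dz/dx = 2x
chain_y = du_dz * 2 * y0      # (du/dz)(dz/dy), with dz/dy = 2y

print("du/dx: chain rule", chain_x, " numerical", partial_x(u, x0, y0))
print("du/dy: chain rule", chain_y, " numerical", partial_y(u, x0, y0))
```

The two columns agree to many decimal places, as the Rule predicts.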

Notice how the terms are formed. If u is measured in degrees, z in metres, x in kilograms and y in seconds, then $\frac{\partial u}{\partial x}$ is measured in _degrees/kg_ and so is $\frac{du}{dz}\frac{\partial z}{\partial x}$ because it's (_degrees/metre_) (_metres/kg_). That observation alone is almost enough to generate this Chain Rule. How else would you combine the various derivatives _degrees/metre_, _metres/kg_ and _metres/second_ to form _degrees/kg_?

Another observation: derivatives (even partial derivatives) are "rates of change" and indicate how rapidly one variable changes compared to another: $\frac{du}{dz}$ compares the change in u with the change in z whereas $\frac{\partial z}{\partial x}$ compares the change in z with the change in x. Hence, if $\frac{du}{dz}$ = 6 it means that u changes 6 times more rapidly than z, while $\frac{\partial z}{\partial x}$ = 7 means z changes 7 times more rapidly than x. So what's $\frac{\partial u}{\partial x}$, which tells how rapidly u changes compared to changes in x? Obviously it's (6)(7) = 42.

The Canadian humorist Stephen Leacock knew all about the chain rule when he wrote of A working twice as hard as B who worked twice as hard as C.

This Chain Rule: common sense couched in mathematical jargon.

**Example:** Compute:

(a) if u = ez and z = sin x e-2y

(b) if T = u2 and u = _ln_ x ex+y

(c) if V = tan w and w = T2V + cos(VT)

(d) if P = eu2 and u = x2 \+ y2z2 \+ z sin t

Solution:

(a) = = ez (b) = = 2u

(c) = = sec2w

(d) = = 2u eu2

S: That's pretty easy ... I think, but does one example prove this new chain rule?

P: Pay attention. I'll go through it, but with different variable names:

Suppose that P = f(u) and u depends upon a host of other variables, one of which is z. Let z change by an amount $\Delta z$ so that u will change by some amount, call it $\Delta u$ (which we'll assume isn't zero because we're going to divide by $\Delta u$!), and that'll make P = f(u) change by some amount, call it $\Delta P$. The partial derivative we seek is $\frac{\partial P}{\partial z} = \lim_{\Delta z \to 0} \frac{\Delta P}{\Delta z}$ and we simply write $\frac{\Delta P}{\Delta z} = \frac{\Delta P}{\Delta u}\frac{\Delta u}{\Delta z}$ then take the limit as $\Delta z \to 0$ (and that'll make $\Delta u \to 0$ as well) so we get, in the limit, $\frac{\partial P}{\partial z} = \frac{dP}{du}\frac{\partial u}{\partial z}$ where we use "d" when we're differentiating a function of a single variable and "∂" when we're differentiating a function of many variables. That gives us our Chain Rule.

Of course we've been considering the case where P depends upon a single variable u which in turn depends upon many others. It's like saying the pressure P of a bottle of gas depends upon its temperature u (and nothing else), but the temperature u depends upon the location of the bottle ... for example, x = latitude, y = longitude and z = elevation, because these three variables will determine the temperature. In asking for $\frac{\partial P}{\partial z}$ we are asking for the rate of change of pressure (within the bottle) as the elevation increases (assuming a change in elevation will change the temperature, u).

S: You began with the title "Chain Rules". That's plural. I assume ...

P: Yes, there are others ... lots of them.

Suppose, now, that P depends upon many variables and each of these depends upon a single variable. In a sense this is the opposite of what we've just considered. We write P = f(x,y,z) where x = x(t), y = y(t), z = z(t). It's like saying the pressure of our bottle of gas depends upon its location (x,y,z), but we're in an airplane so the location is changing with time, t. In fact, x = x(t), y = y(t), z = z(t) are parametric equations for the path of the plane!

Now how does the pressure P change with time? We want $\frac{dP}{dt}$.

We change time from t to $t + \Delta t$, causing a change in position from (x,y,z) to $(x+\Delta x, y+\Delta y, z+\Delta z)$, hence a change in pressure from P(x,y,z) to $P(x+\Delta x, y+\Delta y, z+\Delta z)$: call this change $\Delta P$. What we want is $\frac{dP}{dt} = \lim_{\Delta t \to 0} \frac{\Delta P}{\Delta t}$. There's a trick here and we've done it before. Remember?

S: Nope.

P: When we have too many things changing at once, like x and y and z?

S: Nope, my mind is a complete blank.

P: Remember when we wanted to find the rate of change of f(x,y) in some direction and both x and y were changing? What did we do?

S: Aah, yes ... we changed them one-at-a-time. Go ahead, do it!

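We write the total change in P, namely $\Delta P = P(x+\Delta x, y+\Delta y, z+\Delta z) - P(x,y,z)$, as the sum of changes when each of x, y and z changes one-at-a-time, and we'll call the individual changes $\Delta P_x$, $\Delta P_y$ and $\Delta P_z$:

$\Delta P_x = P(x+\Delta x, y, z) - P(x,y,z)$ is the change in P when only x changes, and we move from (x,y,z) to $(x+\Delta x, y, z)$.

$\Delta P_y = P(x+\Delta x, y+\Delta y, z) - P(x+\Delta x, y, z)$ is the change when we move from $(x+\Delta x, y, z)$ to $(x+\Delta x, y+\Delta y, z)$.

$\Delta P_z = P(x+\Delta x, y+\Delta y, z+\Delta z) - P(x+\Delta x, y+\Delta y, z)$, when we move from $(x+\Delta x, y+\Delta y, z)$ to $(x+\Delta x, y+\Delta y, z+\Delta z)$.

Then $\Delta P = P(x+\Delta x, y+\Delta y, z+\Delta z) - P(x,y,z) = \Delta P_x + \Delta P_y + \Delta P_z$ hence $\frac{\Delta P}{\Delta t} = \frac{\Delta P_x}{\Delta t} + \frac{\Delta P_y}{\Delta t} + \frac{\Delta P_z}{\Delta t}$ and we need only compute each of these three limits, as $\Delta t \to 0$, which shouldn't be so bad because only one variable is changing!

We'll start with $\frac{\Delta P_x}{\Delta t}$.

Note that $\frac{\Delta P_x}{\Delta t} = \frac{P(x+\Delta x, y, z) - P(x,y,z)}{\Delta t} = \frac{P(x+\Delta x, y, z) - P(x,y,z)}{\Delta x}\,\frac{\Delta x}{\Delta t}$ where again we used the same trick as we used when considering directional derivatives, because now each of $\frac{P(x+\Delta x, y, z) - P(x,y,z)}{\Delta x}$ and $\frac{\Delta x}{\Delta t}$ has a recognizable limit.

We get $\lim_{\Delta t \to 0} \frac{\Delta P_x}{\Delta t} = \frac{\partial P}{\partial x}\frac{dx}{dt}$.

In a similar manner we can compute the limits of $\frac{\Delta P_y}{\Delta t}$ and $\frac{\Delta P_z}{\Delta t}$.

The sum of the three limits we'll call $\frac{dP}{dt}$ rather than $\frac{\partial P}{\partial t}$ because there is only a single independent variable which all others depend upon:

If P = f(x,y,z) and x = x(t), y = y(t) and z = z(t), then $\frac{dP}{dt} = \frac{\partial P}{\partial x}\frac{dx}{dt} + \frac{\partial P}{\partial y}\frac{dy}{dt} + \frac{\partial P}{\partial z}\frac{dz}{dt}$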

Although you may be tired of this dimensional stuff, I'll repeat it anyway:

If P is measured in _hectares_ and x, y, z in _kilograms_, _metres_ and _seconds_ and t in _years_, then $\frac{dP}{dt}$ is measured in _hectares/year_ as are each of the three terms. For example, $\frac{\partial P}{\partial y}\frac{dy}{dt}$ = (_hectares/metre_) (_metres/year_).

**Example:** The air pressure at a point (x,y,z) is given by $P(x,y,z) = K (3 - 2\sin x)(1 + \cos^2 y)\, e^{-z}$ _pascals_, where K is some constant and x, y and z are measured in _kilometres_. If the point moves along a curve described by $x = t^2$, y = _ln_ (1+t), z = t, where t is the time in _hours_, then compute $\frac{dP}{dt}$ at the time t = 2 hours.

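**Solution:** When t = 2, we are at the point x = 4, y = _ln_ 3 and z = 2 and moving so that $\frac{dx}{dt} = 2t = 4$, $\frac{dy}{dt} = \frac{1}{1+t} = \frac{1}{3} \approx .333$ and $\frac{dz}{dt} = 1$ _km/hour_.

Further, $\frac{\partial P}{\partial x} = K (-2\cos x)(1 + \cos^2 y)\, e^{-z} = K (-2\cos 4)(1 + \cos^2(\ln 3))\, e^{-2} \approx .2135\,K$ and

$\frac{\partial P}{\partial y} = K (3 - 2\sin x)(-2\cos y \sin y)\, e^{-z} = K (3 - 2\sin 4)(-2\cos(\ln 3)\sin(\ln 3))\, e^{-2} \approx -.4949\,K$ and

$\frac{\partial P}{\partial z} = -K (3 - 2\sin x)(1 + \cos^2 y)\, e^{-z} \approx -.7372\,K$.

Hence $\frac{dP}{dt} = \frac{\partial P}{\partial x}\frac{dx}{dt} + \frac{\partial P}{\partial y}\frac{dy}{dt} + \frac{\partial P}{\partial z}\frac{dz}{dt} \approx (.2135\,K)(4) + (-.4949\,K)(.333) + (-.7372\,K)(1) \approx -.048\,K$ _pa/hour_.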
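This kind of calculation is easy to get wrong by hand, so a numerical cross-check is worthwhile. The Python sketch below evaluates the Chain Rule sum at t = 2 and compares it with a direct central-difference estimate of dP/dt along the path. Two assumptions are made purely for illustration: K is set to 1 (so results are in units of K), and the step size h is an arbitrary small number. The velocity components are dx/dt = 2t, dy/dt = 1/(1+t) (which is 1/3 at t = 2) and dz/dt = 1.

```python
import math

K = 1.0  # hypothetical value of the constant K; results are then in units of K

def pressure(x, y, z):
    """P(x,y,z) = K (3 - 2 sin x)(1 + cos^2 y) exp(-z)."""
    return K * (3 - 2 * math.sin(x)) * (1 + math.cos(y) ** 2) * math.exp(-z)

def path(t):
    """Position at time t: x = t^2, y = ln(1+t), z = t."""
    return t * t, math.log(1 + t), t

def dP_dt_chain(t):
    """dP/dt from the Chain Rule: Px dx/dt + Py dy/dt + Pz dz/dt."""
    x, y, z = path(t)
    dx, dy, dz = 2 * t, 1 / (1 + t), 1.0
    Px = K * (-2 * math.cos(x)) * (1 + math.cos(y) ** 2) * math.exp(-z)
    Py = K * (3 - 2 * math.sin(x)) * (-2 * math.cos(y) * math.sin(y)) * math.exp(-z)
    Pz = -pressure(x, y, z)
    return Px * dx + Py * dy + Pz * dz

def dP_dt_numeric(t, h=1e-6):
    """Central-difference estimate of dP/dt along the path."""
    return (pressure(*path(t + h)) - pressure(*path(t - h))) / (2 * h)

print("Chain Rule :", dP_dt_chain(2.0))
print("numerical  :", dP_dt_numeric(2.0))
```

If the two printed values agree, the partial derivatives and the velocity components have been combined consistently.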

LECTURE 18

another CHAIN RULE

S: I have to tell you that the title scares me!

P: But suppose we have z = f(x,y) and each of x and y depends upon, say, two other variables u and v. Then we'd have z = f(x,y) and x = x(u,v), y = y(u,v) and when either u or v changes, so will x and y, hence so will z, so we might want to compute $\frac{\partial z}{\partial u}$ or maybe $\frac{\partial z}{\partial v}$.

S: When would that problem ever occur ... outside of this calculus course?

P: Well, maybe z = f(x,y) and we want to switch to polar coordinates r and $\theta$ so $x = r\cos\theta$ and $y = r\sin\theta$ and that means x and y depend upon two other variables so that z ...

S: Yeah, I get it ... but that still sounds like a "math" problem, not a "real" problem.

P: Okay, suppose H(x,y) is the hardness (measured in some convenient unit!) at a point (x,y) in a flat sheet of material. Suppose, too, that a laser beam is cutting the material and the x- and y-position of the laser beam is controlled by two gears which rotate, the angles of rotation being given by $\theta_1$ and $\theta_2$. Then we'd have H(x,y) with $x = x(\theta_1, \theta_2)$ and $y = y(\theta_1, \theta_2)$. Now we want to compute the rate at which the hardness changes when, say, $\theta_2$ changes. Got it?

S: Sounds like a pretty fishy "real" problem to me.

P: We're doing all this good stuff so if you ever run across a "real" problem you can at least say: "I remember doing problems like that." Of course they won't be exactly like the ones I invent, but is that really important? Techniques ... that's what we're learning. When you learn to play scales as a preamble to playing the piano, do you say: "Will these little songs ever occur outside of these piano lessons?"

S: Yes.

Consider the following problem:

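z = f(x,y) where x = x(u,v) and y = y(u,v) so that z is a function, indirectly, of both u and v. We want to compute $\frac{\partial z}{\partial v}$. That means we change only v (leaving u fixed) and deduce the change in z and take the limit and, to do this, we let $\Delta x$ and $\Delta y$ be the changes in x and y when v changes by $\Delta v$ and write (as we've done before!) the change in z as $\Delta z = \Delta z_x + \Delta z_y$ (these being the changes when x and y change one-at-a-time) so we get

$\frac{\Delta z}{\Delta v} = \frac{\Delta z_x}{\Delta v} + \frac{\Delta z_y}{\Delta v} = \frac{\Delta z_x}{\Delta x}\frac{\Delta x}{\Delta v} + \frac{\Delta z_y}{\Delta y}\frac{\Delta y}{\Delta v}$ and now we can take the limits of all four parts and recognize the various partial derivatives (being careful to use "∂" rather than "d" when we're differentiating a function of more than a single variable) and we get: $\frac{\partial z}{\partial v} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial v}$ and, in a similar manner, we can get: $\frac{\partial z}{\partial u} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial u}$.

If z = f(x,y) and x = x(u,v), y = y(u,v) then

$\frac{\partial z}{\partial u} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial u}$ and $\frac{\partial z}{\partial v} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial v}$

I think that's enough on Chain Rules. I think you can see how it goes. If K = f(x,y,z,u,v) and each of these five variables depends upon p, q, r and s, then, for example, $\frac{\partial K}{\partial p} = \frac{\partial K}{\partial x}\frac{\partial x}{\partial p} + \frac{\partial K}{\partial y}\frac{\partial y}{\partial p} + \frac{\partial K}{\partial z}\frac{\partial z}{\partial p} + \frac{\partial K}{\partial u}\frac{\partial u}{\partial p} + \frac{\partial K}{\partial v}\frac{\partial v}{\partial p}$ and so on and so on and so on and so on ...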

**Example:** Compute the indicated derivative(s):

(a) $\frac{\partial z}{\partial u}$ and $\frac{\partial z}{\partial v}$ if $z = x^2 y^3$ and x = v sin u, $y = u^2 e^v$

(b) if P = and V = (x2 \+ y2) sin πt

Solution:

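(a) $\frac{\partial z}{\partial u} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial u} = (2xy^3)(v\cos u) + (3x^2y^2)(2u\,e^v)$ and

$\frac{\partial z}{\partial v} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial v} = (2xy^3)(\sin u) + (3x^2y^2)(u^2 e^v)$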

(b) = =
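Part (a), with $z = x^2 y^3$, x = v sin u and $y = u^2 e^v$, can be checked numerically in the same spirit as before. The Python sketch below compares the Chain Rule formulas for ∂z/∂u and ∂z/∂v with central-difference estimates; the test point (u, v) = (0.8, 0.3), the step size h and the function names are arbitrary illustrative choices.

```python
import math

def x_of(u, v):
    return v * math.sin(u)

def y_of(u, v):
    return u * u * math.exp(v)

def z_of(u, v):
    """z = x^2 y^3, written directly as a function of u and v."""
    return x_of(u, v) ** 2 * y_of(u, v) ** 3

def chain_rule(u, v):
    """(dz/du, dz/dv) from dz/du = zx xu + zy yu and dz/dv = zx xv + zy yv."""
    x, y = x_of(u, v), y_of(u, v)
    zx, zy = 2 * x * y ** 3, 3 * x ** 2 * y ** 2
    dz_du = zx * (v * math.cos(u)) + zy * (2 * u * math.exp(v))
    dz_dv = zx * math.sin(u) + zy * (u * u * math.exp(v))
    return dz_du, dz_dv

def numeric(u, v, h=1e-6):
    """Central-difference estimates of (dz/du, dz/dv)."""
    dz_du = (z_of(u + h, v) - z_of(u - h, v)) / (2 * h)
    dz_dv = (z_of(u, v + h) - z_of(u, v - h)) / (2 * h)
    return dz_du, dz_dv

u0, v0 = 0.8, 0.3
print("Chain Rule:", chain_rule(u0, v0))
print("numerical :", numeric(u0, v0))
```

When u changes, v is held fixed (and vice versa), which is exactly what the two difference quotients in `numeric` do.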

In this last example, we write $\frac{dP}{dV}$ because P is a function of the single variable, V. There's no harm in writing $\frac{\partial P}{\partial V}$ of course. If someone asks "Why are you using the partial derivative?" you can always say "I'm keeping C constant." In fact, when dealing with a host of variables, it's not always clear from the notation just what is being held fixed. If P is pressure and V and T are volume and temperature, then it's often convenient to use $\left(\frac{\partial P}{\partial V}\right)_T$ to indicate a rate of change in pressure at constant temperature. You see this convention in thermodynamics which deals with gases ... among other things ... and every conceivable partial derivative $\left(\frac{\partial P}{\partial V}\right)_T$ or $\left(\frac{\partial V}{\partial T}\right)_P$ or $\left(\frac{\partial P}{\partial T}\right)_V$ seems to have its own name! Indeed, a book on thermodynamics is cover-to-cover partial derivatives.

Let's collect these Chain Rules:

a COLLECTION of CHAIN RULES

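If u = f(z) and z = z(x,y) then $\frac{\partial u}{\partial x} = \frac{du}{dz}\frac{\partial z}{\partial x}$ and $\frac{\partial u}{\partial y} = \frac{du}{dz}\frac{\partial z}{\partial y}$

If P = f(x,y,z) and x = x(t), y = y(t) and z = z(t), then $\frac{dP}{dt} = \frac{\partial P}{\partial x}\frac{dx}{dt} + \frac{\partial P}{\partial y}\frac{dy}{dt} + \frac{\partial P}{\partial z}\frac{dz}{dt}$

If z = f(x,y) and x = x(u,v), y = y(u,v) then $\frac{\partial z}{\partial u} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial u}$ and $\frac{\partial z}{\partial v} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial v}$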

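**P:** When you see $\frac{dP}{dt} = \frac{\partial P}{\partial x}\frac{dx}{dt} + \frac{\partial P}{\partial y}\frac{dy}{dt} + \frac{\partial P}{\partial z}\frac{dz}{dt}$ does it remind you of anything?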

**S:** Uh ... it's something like ... uh ... no, it doesn't.

**P:** It has the form AB + CD + EF, doesn't it?

**S:** Yes! The DOT product, right? In fact, AB + CD + EF = [A, C, E] • [B, D, F], the DOT product between these two vectors. Am I a genius or what?

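**P:** Very good. Now write $\frac{dP}{dt} = \frac{\partial P}{\partial x}\frac{dx}{dt} + \frac{\partial P}{\partial y}\frac{dy}{dt} + \frac{\partial P}{\partial z}\frac{dz}{dt}$ as a dot product.

**S:** Easy. It's $\left[\frac{\partial P}{\partial x}, \frac{\partial P}{\partial y}, \frac{\partial P}{\partial z}\right] \cdot \left[\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right]$.

**P:** These two vectors, $\left[\frac{\partial P}{\partial x}, \frac{\partial P}{\partial y}, \frac{\partial P}{\partial z}\right]$ and $\left[\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right]$, do they remind you of anything? Do you recognize them? Think.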

**S:** One's the gradient ... but in 3-D, and the other is ... is ...

**P:** The velocity of a moving point (x,y,z), assuming t is the time.

**S:** So I can write $\frac{dP}{dt}$ = _grad_ P • **V** where **V** is the velocity. But what does it mean?

**P:** It depends upon the problem. If the position of a plane is changing with time, say x = x(t), y = y(t), z = z(t), and P(x,y,z) is the air pressure at the position (x,y,z), then the plane will experience a changing pressure and it will change at the rate _grad_ P • **V** where _grad_ P is the local gradient (at the position of the plane) and **V** is the plane's velocity vector at the time t. Nice, eh? In fact ...

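**S:** In fact I can tell you how the plane should move so as to experience the greatest increase in pressure! We'd want **V** in the direction of _grad_ P. Not only that, but the length of _grad_ P would actually give this maximum rate of change of pressure. In fact I can tell you right now that the maximum value of $\frac{dP}{dt}$ is $\sqrt{\left(\frac{\partial P}{\partial x}\right)^2 + \left(\frac{\partial P}{\partial y}\right)^2 + \left(\frac{\partial P}{\partial z}\right)^2}$. What a genius!

**P:** Well ... not quite: $\frac{dP}{dt}$ is measured in _pascals/hour_ whereas _grad_ P is measured in _pascals/kilometre_ and so is $\sqrt{\left(\frac{\partial P}{\partial x}\right)^2 + \left(\frac{\partial P}{\partial y}\right)^2 + \left(\frac{\partial P}{\partial z}\right)^2}$ so you need to multiply the length of _grad_ P by something with the dimensions of ... what?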

**S:** That's confusing.

**P:** No it's not. To get _pascals/hour_ you multiply _pascals/kilometre_ by _kilometres/hour_. Where do you get this _kilometres/hour_?

**S:** Uh ... _kilometres/hour_ ... that's **V** , right?

**P:** Right. Notice that $\frac{dP}{dt}$ = _grad_ P • **V** = | _grad_ P | | **V** | cos θ where θ is the angle between these two vectors, and to get the maximum you fly your plane so θ = 0 and that gives a maximum ...

**S:** The maximum is | _grad_ P | | **V** | ... the length of _grad_ P times the length of **V**. Nice!
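To see the formula at work, here is a minimal numerical sketch. The pressure field and the flight path are made up purely for illustration (they are assumptions, not part of the discussion above). The sketch compares _grad_ P • **V** with a direct finite-difference estimate of the rate of change of pressure along the path, and also prints the upper bound | _grad_ P | | **V** |.

```python
import math

def pressure(x, y, z):
    # Hypothetical pressure field (pascals); x, y, z in kilometres.
    return 100.0 - 12.0 * z + 0.5 * x + 0.2 * x * y

def grad_pressure(x, y, z):
    # Partial derivatives of the field above, worked out by hand.
    return (0.5 + 0.2 * y, 0.2 * x, -12.0)

def position(t):
    # Hypothetical flight path; t in hours.
    return (300.0 * t, 100.0 * t * t, 1.0 + 0.5 * math.sin(t))

def velocity(t):
    # Derivative of position(t) with respect to t.
    return (300.0, 200.0 * t, 0.5 * math.cos(t))

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def length(a):
    return math.sqrt(dot(a, a))

def rate_by_chain_rule(t):
    # dP/dt = grad P . V, with grad P evaluated at the plane's position.
    return dot(grad_pressure(*position(t)), velocity(t))

def rate_by_difference(t, h=1e-6):
    # Central-difference estimate of dP/dt along the path.
    return (pressure(*position(t + h)) - pressure(*position(t - h))) / (2 * h)

t = 0.7
print(rate_by_chain_rule(t), rate_by_difference(t))
# Upper bound |grad P| |V|, reached only when V points along grad P.
print(length(grad_pressure(*position(t))) * length(velocity(t)))
```

The two rates should agree to several digits, and neither can exceed the bound on the last line.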

**P:** Now let me show you something interesting ...

Directional Derivatives, revisited

When we considered the rate of change of f(x,y) in some θ-direction (making an angle θ with the positive x-axis), we measured distance in this θ-direction with the variable "s" and obtained, eventually:

$\frac{\partial f}{\partial s} = \frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta$.

Now let's do the following. We let z = f(x,y) and let x and y move from (a,b) in the θ-direction along the line x = a + s cos θ, y = b + s sin θ (parametric equations for the line through (a,b) with direction θ). To compute the directional derivative we want $\frac{\partial z}{\partial s}$, but now we have a Chain Rule to cover this situation: z = f(x,y) and x = a + s cos θ, y = b + s sin θ gives $\frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{dx}{ds} + \frac{\partial z}{\partial y}\frac{dy}{ds} = \frac{\partial z}{\partial x}\cos\theta + \frac{\partial z}{\partial y}\sin\theta$, which is precisely the result we obtained before ... which gives us a warm feeling.

S: Hey! Why do you call it $\frac{\partial z}{\partial s}$?

P: That's what we've been calling it.

S: But you should write it as $\frac{dz}{ds}$, right? I mean, we're talking about a function of the single variable "s", right?

P: Uh ... you have a point there. Yes, I guess we should call it $\frac{dz}{ds}$. Hmmm. Why didn't I think of that? Well, I suppose it made sense when we started all this since we were buried in variables and ...

S: Excuses, excuses.

P: Okay, from now on I'll call it $\frac{dz}{ds}$. Happy? Now watch this and I'll show you something really wonderful:

Implicit Differentiation, revisited

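Way back when, we wanted to find $\frac{dy}{dx}$ given some relation f(x,y) = C which didn't define y explicitly as a function of x ... so we invented "implicit differentiation". Watch carefully as we go through the steps:

**Example:** Determine $\frac{dy}{dx}$ if $x^2 \sin y + e^x \ln(xy) = 1$.

**Solution:** Differentiating { $x^2 \sin y + e^x(\ln x + \ln y) = 1$ } with respect to x gives

$x^2 \cos y \,\frac{dy}{dx} + 2x \sin y + e^x \ln(xy) + e^x \frac{1}{x} + e^x \frac{1}{y}\frac{dy}{dx} = 0$.

Now we collect terms:

$\left\{x^2 \cos y + \frac{e^x}{y}\right\}\frac{dy}{dx} + \left\{2x \sin y + e^x \ln(xy) + \frac{e^x}{x}\right\} = 0$.

Now we solve for $\frac{dy}{dx} = -\dfrac{2x \sin y + e^x \ln(xy) + \frac{e^x}{x}}{x^2 \cos y + \frac{e^x}{y}}$.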

P: Do you recognize any of the terms, like maybe the numerator or denominator?

S: Wait ... uh, no.

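P: The numerator is just $\frac{\partial z}{\partial x}$ and the denominator is $\frac{\partial z}{\partial y}$.

S: Huh? Who's z?

P: Oh, sorry, $z = x^2 \sin y + e^x \ln(xy)$ and we're on some level curve z = 1, and if we want the slope of the tangent line to this curve we can find $\frac{dy}{dx}$ by implicit differentiation OR (and this is the nice part) we can just use: $\frac{dy}{dx} = -\dfrac{\partial z/\partial x}{\partial z/\partial y}$.

To determine $\frac{dy}{dx}$ implicitly from f(x,y) = C, use $\frac{dy}{dx} = -\dfrac{\partial f/\partial x}{\partial f/\partial y}$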

S: So ... one example proves it, right?

P: Watch this. We have z = f(x,y) = C and we're going to stay on this level curve, so if we change x then y must change in some particular way in order to keep f(x,y) equal to C so we don't move off the curve. In other words, y must follow the level curve, so y is a function of x and of course x is a function of x (what else?). Okay, we write this out like so:

z = f(x,y) and x = x, y = y(x). See? We've said that y is a function of x by writing y = y(x) and just by looking at these equations you can see that we've got a Chain Rule going here. We have the function f(x,y) where each of x and y is a function of x. Hence: $\frac{dz}{dx} = \frac{\partial z}{\partial x}\frac{dx}{dx} + \frac{\partial z}{\partial y}\frac{dy}{dx} = 0$ because z = C is constant along the level curve, so clearly $\frac{dz}{dx} = 0$. (If we don't have this we don't stay on the level curve, so when we get some $\frac{dy}{dx}$ it won't be the slope we're looking for!) Now we put $\frac{dx}{dx} = 1$ and solve for $\frac{dy}{dx} = -\dfrac{\partial z/\partial x}{\partial z/\partial y}$ and voila! A simple expression for implicit differentiation. Now go to all your friends and defy them to beat you in the game of Implicit Differentiation.
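One example doesn't prove it, but a numerical check is still reassuring. The sketch below takes the example function $f(x,y) = x^2 \sin y + e^x \ln(xy)$ and compares the slope $-\frac{\partial f/\partial x}{\partial f/\partial y}$ with a slope measured directly: it follows the level curve through a chosen point by solving f(x,y) = C for y with Newton's method. The point (1, 1) and the helper names are our own choices for illustration.

```python
import math

def f(x, y):
    return x**2 * math.sin(y) + math.exp(x) * math.log(x * y)

def f_x(x, y):
    # Partial derivative with respect to x (y held fixed).
    return 2 * x * math.sin(y) + math.exp(x) * math.log(x * y) + math.exp(x) / x

def f_y(x, y):
    # Partial derivative with respect to y (x held fixed).
    return x**2 * math.cos(y) + math.exp(x) / y

def slope_formula(x, y):
    # dy/dx = -(df/dx)/(df/dy) on the level curve through (x, y).
    return -f_x(x, y) / f_y(x, y)

def y_on_level_curve(x, c, y_guess):
    # Newton's method in y: solve f(x, y) = c for y, starting near y_guess.
    y = y_guess
    for _ in range(50):
        y -= (f(x, y) - c) / f_y(x, y)
    return y

def slope_numeric(a, b, h=1e-5):
    # Step left and right of x = a while staying on the curve f = f(a, b).
    c = f(a, b)
    y_right = y_on_level_curve(a + h, c, b)
    y_left = y_on_level_curve(a - h, c, b)
    return (y_right - y_left) / (2 * h)

print(slope_formula(1.0, 1.0), slope_numeric(1.0, 1.0))
```

Here the level curve is f(x,y) = f(1,1) rather than f(x,y) = 1, since (1,1) is simply a convenient point to stand on; the formula doesn't care which level curve we are on.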

the GRADIENT vector is normal to the level curve

We've been saying for some time now that:

(1) if we stand at the point P(a,b) on a level curve f(x,y) = C, namely the level curve f(x,y) = f(a,b), and

(2) we evaluate _grad_ f at P(a,b) on this level curve, namely $\left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]$, then

(3) this gradient vector is normal to the curve.

We're now in a position to do something more than just illustrate this with examples and pictures.

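On the level curve f(x,y) = C the slope of the tangent line at P(a,b) is $\frac{dy}{dx} = -\dfrac{\partial f/\partial x}{\partial f/\partial y}$ where each of $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ is to be evaluated at (a,b). On the other hand the gradient vector is $\left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]$ and has a direction characterized by the slope: $\dfrac{\partial f/\partial y}{\partial f/\partial x}$. Just stare at those two slopes. They're negative reciprocals! That means _grad_ f is perpendicular to the tangent line. In other words, _grad_ f is NORMAL to the level curve!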

There's another way to look at this. We can write ...

S: Another way? Why another way?

P: I really don't like equations like the above which are good in 2-dimensions but it's hard to see what the analogous equations would be in 3-dimensions, or 4 or 5.

S: Okay, forge ahead!

One awkward thing about the "tangent line": it isn't clear what we should do in 3-D since there are infinitely many tangent lines to a surface at a point P(a,b,c) on the surface so we won't be able to say "the gradient has a slope perpendicular to the slope of the tangent line". On the other hand there is only ONE normal direction to the surface (well ... two actually, pointing in opposite directions!). It would be nice if we could show that, for the _3-dimensional analogue_ , we'd still have _grad_ f pointing in a normal direction. Of course, for the _3-dimensional analogue_ we must consider "level curves" f(x,y,z) = C rather than f(x,y) = C and these aren't even curves but surfaces ... but that's only a change in wording, so we can call f(x,y,z) = C "level surfaces" if we wish.

Now, if we stand at a point P(a,b,c) and compute the vector _grad_ f = $\left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right]$, will it be perpendicular to the level surface which passes through P, namely f(x,y,z) = f(a,b,c)? The answer is YES! In fact, it will be perpendicular to every tangent line to the surface at the point P(a,b,c)! In fact, it will be perpendicular to ... to what?

S: Huh? I don't understand the question.

P: In order to be perpendicular to every possible tangent line the gradient vector must be perpendicular to what? Tell me what contains every single tangent line.

S: I still don't understand ...

P: The TANGENT PLANE at P(a,b,c)!!

LECTURE 19

the TANGENT PLANE

We must first investigate the nature of planes else how would we recognize one when we see its equation? To see what form the equation of a plane must have, we go over lines in 1-dimensional calculus ... carefully, trying to put things in a form which generalizes to 3-dimensions.

S: Why lines? Why not parabolas or circles or ...

P: Because the 3-dimensional analogue of a line is a plane! Pay attention. It'll be clear soon enough.

S: Come now. Lines are lines and the 3-D line is just as much a line as a 2-D line so why isn't the 3-D line just a line. I mean, why is the 3-D analogue a plane? It doesn't make sense. I mean ...

P: Pay attention!!

Recall that the equation of a line in 1-variable calculus has the form Ax + By = C. If the line is to pass through a given point P(x0, y0), then Ax0 + By0 = C so this gives the required value of C, and all lines through (x0, y0) have the form Ax + By = Ax0 + By0, or, to put it more elegantly:

A (x - x0) + B (y - y0) = 0.

We immediately (!) recognize this as having the form of a DOT product: the DOT product between the vector [A, B] and the vector [x - x0, y - y0].

Who are these vectors?

Since P(x0, y0) is a fixed point on our line and (x,y) is a variable point, then the vector [x - x0, y - y0] just points along the line, from P(x0, y0) to (x,y). It's a "tangent vector". In fact, if we take the ratio $\frac{y - y_0}{x - x_0}$ we get $-\frac{A}{B}$ which, of course, is the slope of the line (and the slope of the tangent vector!).

Now, how about [A, B]? Since the DOT product A (x - x0)+B ( y - y0) = 0, then [A,B] must be perpendicular to the tangent vector [x - x0, y - y0], hence [A, B] must be a NORMAL vector!

In fact, if we write the equation of a line in the form y = mx + k, then put it into the form Ax + By = C (i.e. m x - y = - k), the vector [A, B] is [m, -1] and this is a normal vector. In fact, the slope of this normal is

$\frac{-1}{m} = -\frac{1}{m}$ which is certainly the negative reciprocal of the slope of the line.

Also, the equation of the tangent line to a curve y = f(x) is y = f(a) + f'(a) (x - a) and, in our "standard" form Ax + By = C, this reads f'(a) x - y = constant so a NORMAL vector is [f'(a), -1] which, again, has slope $-\frac{1}{f'(a)}$, perpendicular to the line.

In general if we wish to construct a line we can begin with a point (x0, y0) and a NORMAL direction [A, B] and ask "Where are all the points (x,y) which make [x - x0, y - y0] perpendicular to [A, B]?" The answer? They must satisfy A (x - x0)+B ( y - y0) = 0, and that's the equation of a line.

Now on to planes in 3-D ... and we'll see why _planes_ are the natural 3-D analogue of the line:

We consider a point (x0, y0, z0) and a NORMAL direction, say [A, B, C] (which, of course, is now a vector in 3-D with 3 components). "Where are all the points (x,y,z) which make [x - x0, y - y0, z - z0] perpendicular to [A, B, C]?" The answer? They must satisfy A (x - x0)+B ( y - y0) + C (z - z0) = 0, and that's the equation of a plane.

The equation of a plane through (x0, y0, z0) with normal vector [A, B, C] is

A (x - x0)+B ( y - y0) + C (z - z0) = 0

Okay, now that we recognize the equation of a plane when we see it, let's find the TANGENT plane to some surface z = f(x,y) at the point P(a,b,c) where, of course, c = f(a,b) (else our point wouldn't lie on our surface!). First of all, it must have the form A (x - a) + B (y - b) + C (z - c) = 0. This represents every possible plane through (a,b,c).

Who are A, B and C?

S: The normal to the plane!

P: Yes, yes, of course, but if we have the surface z = f(x,y) and the point P(a,b,c) how do we calculate A, B and C?

How to find A, B and C so the plane through (a,b,c) is "tangent" to the surface z = f(x,y)?

One thing we've known for some time is the slope of two particular tangent lines to the surface z = f(x,y).

If we look at the intersection of the surface z = f(x,y) with the plane x = a (from a vantage point along the positive x-axis, looking back, so the z-axis goes up and the y-axis goes to the right) we see a curve z = f(a,y) and the tangent line to this curve must lie in our plane A (x - a) + B (y - b) + C (z - c) = 0. In fact, since we're in the plane x = a, every point on this tangent plane must satisfy B (y - b) + C (z - c) = 0 which is where the tangent plane intersects the plane x = a ... and this intersection is the tangent line to the curve z = f(a,y) at the point P. But we already know how to calculate the equation of the tangent line to z = f(a,y) when y = b. It's $z = f(a,b) + \frac{\partial f}{\partial y}(y - b)$, or to put it into a more elegant form, it's $\frac{\partial f}{\partial y}(y - b) + (-1)(z - c) = 0$ where we've put f(a,b) = c (and the partial derivative is evaluated at (a,b)). Comparing with B (y - b) + C (z - c) = 0 we see that B and C must be proportional to $\frac{\partial f}{\partial y}$ and (-1) respectively. Or, to put it differently, if we find $\frac{dz}{dy}$ from B (y - b) + C (z - c) = 0 by implicit differentiation ... giving $B + C\frac{dz}{dy} = 0$ ... it must be the same as $\frac{\partial f}{\partial y}$ and that means that $-\frac{B}{C} = \frac{\partial f}{\partial y}$.

S: Whoa! Proportional you say? Why aren't they equal? I mean, why isn't B = $\frac{\partial f}{\partial y}$ and C = -1?

P: I give you the equations 2 x + 4 y = 8 and x + 2 y = 4. They're the same line, right? (I'm talking about lines in the x-y plane.)

S: Yeah, they're the same line.

P: So are the coefficients exactly the same? Is 2 = 1 and 4 = 2 and 8 = 4? No. They're just proportional. After all, you can multiply the equation of a line by some constant k and get the same line and all the coefficients will get multiplied by k so the new coefficients won't be the same but they will be proportional with constant of proportionality equal to k.

S: Okay, okay, okay!

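Now, having obtained B = k $\frac{\partial f}{\partial y}$ and C = - k we go to our other tangent line ... the one which we get by intersecting z = f(x,y) with the plane y = b.

We repeat the procedure: the tangent plane A (x - a) + B (y - b) + C (z - c) = 0 intersects the plane y = b in the line A (x - a) + C (z - c) = 0 which must therefore be the tangent line to the curve z = f(x,b) at P, namely

$z = f(a,b) + \frac{\partial f}{\partial x}(x - a)$ or, more elegantly, $\frac{\partial f}{\partial x}(x - a) + (-1)(z - c) = 0$ where again we've put f(a,b) = c, and again we compare the two equations and see that A is proportional to $\frac{\partial f}{\partial x}$ and C proportional to -1. Or, to put it differently, if we find $\frac{dz}{dx}$ from A (x - a) + C (z - c) = 0 by implicit differentiation ... giving $A + C\frac{dz}{dx} = 0$ ... it must be the same as $\frac{\partial f}{\partial x}$ and that means that $-\frac{A}{C} = \frac{\partial f}{\partial x}$.

We collect all our results:

$-\frac{A}{C} = \frac{\partial f}{\partial x}$ and $-\frac{B}{C} = \frac{\partial f}{\partial y}$ and that means that $A = -C\frac{\partial f}{\partial x}$ and $B = -C\frac{\partial f}{\partial y}$ so the tangent plane A (x - a) + B (y - b) + C (z - c) = 0 becomes $-C\frac{\partial f}{\partial x}(x - a) - C\frac{\partial f}{\partial y}(y - b) + C(z - c) = 0$. Dividing through by -C gives the result.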

We then have:

The TANGENT PLANE to z = f(x,y) at the point P(a,b,c) is:

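$\frac{\partial f}{\partial x}(x - a) + \frac{\partial f}{\partial y}(y - b) - (z - c) = 0$

or $z = f(a,b) + \frac{\partial f}{\partial x}(x - a) + \frac{\partial f}{\partial y}(y - b)$, with both partial derivatives evaluated at (a,b).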
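A quick way to get a feeling for this formula is to build the tangent plane for a particular surface and check that it hugs the surface near the point of tangency. The surface below is a hypothetical one, picked only for illustration, and its partial derivatives were computed by hand.

```python
def f(x, y):
    # Hypothetical surface z = f(x, y), chosen only for illustration.
    return x**2 + 3 * x * y - y**3

def f_x(x, y):
    return 2 * x + 3 * y

def f_y(x, y):
    return 3 * x - 3 * y**2

def tangent_plane(a, b):
    # z = f(a,b) + f_x(a,b) (x - a) + f_y(a,b) (y - b)
    c = f(a, b)
    slope_x, slope_y = f_x(a, b), f_y(a, b)

    def plane(x, y):
        return c + slope_x * (x - a) + slope_y * (y - b)

    return plane

plane = tangent_plane(1.0, 2.0)
# Compare the plane with the surface at the point of tangency and close to it.
for x, y in [(1.0, 2.0), (1.01, 1.99), (1.1, 1.9)]:
    print(x, y, f(x, y), plane(x, y))
```

The plane and the surface agree exactly at (1, 2), and the gap between them grows only slowly as we move away, which is what "tangent" ought to mean.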

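See how the equation of the tangent _plane_ mimics the equation of the tangent _line_ to y = f(x), in 2-D?

$z = f(a,b) + \frac{\partial f}{\partial x}(x - a) + \frac{\partial f}{\partial y}(y - b)$ versus $y = f(a) + f'(a)(x - a)$

It seems clear that a "surface" u = f(x,y,z) in 4-dimensions has a tangent "plane" at x=a, y=b, z=c given by

$u = f(a,b,c) + \frac{\partial f}{\partial x}(x - a) + \frac{\partial f}{\partial y}(y - b) + \frac{\partial f}{\partial z}(z - c)$ and so on and so on and so on ...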

S: Question: there seems to be a lack of symmetry here. I mean, why does z get the (-1) and x and y get the partial derivatives. I mean ...

P: Very nice observation! In fact we've singled out z for special consideration simply by writing z = f(x,y). That destroys the symmetry right there! It's like finding the tangent line to y = f(x) and writing it as y = f(a) + f'(a) (x - a) or "more elegantly" (trying to regain the loss of symmetry): f'(a) (x - a) + (-1) (y - f(a)) = 0. See? y gets the (-1) and x gets the derivative. On the other hand, if you had started NOT with y = f(x) ... which singles out y for special consideration ... but collected the x's and y's together, writing f(x,y) = C, we'd get a more symmetrical equation for the tangent line. For example we start with $x^2 + y^2 = C$ and ask for the tangent line at some point on this circle and we don't "solve for y" but leave it in this symmetrical form. Do you know the equation of the tangent line at some point (a,b) on $x^2 + y^2 = C$?

S: Uh ... it shouldn't be that hard. Let's see ... I have the point (a,b) so I need the slope and I get that by implicit differentiation, but now I'll impress you by using $\frac{dy}{dx} = -\dfrac{\partial f/\partial x}{\partial f/\partial y}$ where $f = x^2 + y^2$ and I get $\frac{dy}{dx} = -\frac{2x}{2y} = -\frac{a}{b}$ at the point in question so the tangent line is $\frac{y - b}{x - a} = -\frac{a}{b}$ which is the same as $ax + by = a^2 + b^2$. Hey! That's pretty nice!

P: And, of course, $a^2 + b^2 = C$ because (a,b) is on the curve $x^2 + y^2 = C$. But check it out ... your equation, I mean.

S: Huh?

P: Does (a,b) lie on your line? Are the dimensions correct? At the point (a,0) do you get a vertical line? At (0,b) ...

S: Yeah, okay ... let's see .. if everything is in metres then "ax" and "by" and "a² + b²" are all in metres² so that checks ... and, uh ... everything else checks out too. So what's the equation of a tangent plane in a symmetrical form?

P: Figure it out yourself.

S: Okay, pay attention:

Rather than writing z = f(x,y) we write F(x,y,z) = C. That's a surface, but all variables are collected on the left of the equation and everything is nice and symmetrical. We're at a point P(a,b,c). We want the tangent plane at P. We have the point and we now need ... uh ... a direction, the NORMAL direction, the vector [A, B, C], then we plug into A (x - a) + B (y - b) + C (z - c) = 0. Okay, the normal to F(x,y,z) = C is ... is ... well, I'll generalize from the 2-D problem. For the level curve f(x,y) = C, the normal vector is $\left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]$ so for the 3-D level surface F(x,y,z) = C the normal will be $\left[\frac{\partial F}{\partial x}, \frac{\partial F}{\partial y}, \frac{\partial F}{\partial z}\right]$ so our tangent plane is $\frac{\partial F}{\partial x}(x - a) + \frac{\partial F}{\partial y}(y - b) + \frac{\partial F}{\partial z}(z - c) = 0$.

P: Excellent! Except that the partial derivatives, etc., must be evaluated at the point P(a,b,c). They're numbers, after all. The only variables in your equation should be the x, y and z in (x-a), (y-b) and (z-c). And notice what happens if you start with z = f(x,y) and put everything on the one side of the equation? You'd get F(x,y,z) = f(x,y) - z = 0 so the normal would be

$\left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, -1\right]$ and that's just so you can see where the (-1) comes from!

The tangent plane to the surface F(x,y,z) = C at the point P(a,b,c) is

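$\frac{\partial F}{\partial x}(x - a) + \frac{\partial F}{\partial y}(y - b) + \frac{\partial F}{\partial z}(z - c) = 0$, with the partial derivatives evaluated at P(a,b,c)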
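As a small illustration of the symmetrical form, take the sphere $x^2 + y^2 + z^2 = 14$ (our own choice of example) and the point P(1,2,3) on it. The gradient there is [2, 4, 6], so the tangent plane should be 2(x - 1) + 4(y - 2) + 6(z - 3) = 0, that is, x + 2y + 3z = 14. The sketch below builds the left side of the plane equation from the gradient and tests a few points.

```python
def F(x, y, z):
    # Level surface F(x, y, z) = 14 is a sphere; chosen only as an example.
    return x**2 + y**2 + z**2

def grad_F(x, y, z):
    return (2 * x, 2 * y, 2 * z)

def tangent_plane(a, b, c):
    # Returns the left side of  F_x (x - a) + F_y (y - b) + F_z (z - c) = 0,
    # with the partial derivatives evaluated at (a, b, c).
    n1, n2, n3 = grad_F(a, b, c)

    def plane(x, y, z):
        return n1 * (x - a) + n2 * (y - b) + n3 * (z - c)

    return plane

plane = tangent_plane(1.0, 2.0, 3.0)
print(F(1.0, 2.0, 3.0))        # the point lies on the sphere F = 14
print(plane(1.0, 2.0, 3.0))    # the point of tangency satisfies the plane equation
print(plane(3.0, 1.0, 3.0))    # another point with x + 2y + 3z = 14
print(plane(0.0, 0.0, 0.0))    # the centre of the sphere is not on the plane
```

Notice that no variable is singled out: x, y and z each get their own partial derivative, and nobody gets the (-1).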

S: I've noticed that you don't have too many pictures. What ever happened to your picture-worth-a-thousand-words?

P: And I've noticed you haven't asked if this will be on the final exam.

LECTURE 20

OPTIMIZATION

S: I kind of like that symmetry stuff, like ax + by = C is the tangent line to $x^2 + y^2 = C$ at P(a,b). What else can you do with symmetry?

P: Well ... let's see you solve this problem. You are to construct a rectangular wall of 100 metres² and around this wall you'll put a gold border. Gold is expensive, so you want the length of the border to be a minimum. What should be the dimensions of the wall?

S: Easy! I let the width and height of the wall be x and y so I want L = 2x + 2y to be a minimum; that's the border. There are two variables so I look for a relation between them, and it's xy = 100 because that's the required area. That means $y = \frac{100}{x}$ so the length of the border is $L = 2x + \frac{200}{x}$ and I minimize this function of a single variable, x. First I find $\frac{dL}{dx} = 2 - \frac{200}{x^2}$ and see that it's negative when x is small because $-\frac{200}{x^2}$ is large and negative. In fact, for small x, $L \approx \frac{200}{x}$ and that's a decreasing function. Then I see that $\frac{dL}{dx}$ is positive for x large because $\frac{200}{x^2}$ is smaller than 2. In fact, for large x, L ≈ 2x and that's an increasing function. That means the graph goes down, then up, so when $\frac{dL}{dx} = 2 - \frac{200}{x^2} = 0$ I have my minimum and that means that $x^2 = 100$ so x = 10. Since xy = 100, then y = 10 too. It's a square wall. The width is x = 10 and the height is y = 10.

P: Okay, now suppose you call the width y and the height x. That's the opposite to the last time. Do the problem again.
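The student's argument (the derivative of the border length is negative for small x and positive for large x) is exactly what a bisection search needs. Here is a minimal sketch that hunts for the zero of $\frac{dL}{dx} = 2 - \frac{200}{x^2}$; the bracket from x = 1 to x = 100 and the helper names are our own choices.

```python
def border_length(x):
    # L = 2x + 2y with y = 100/x (area fixed at 100 square metres).
    return 2 * x + 200 / x

def d_border_length(x):
    # dL/dx = 2 - 200/x^2
    return 2 - 200 / x**2

def bisect(g, lo, hi, steps=100):
    # Assumes g(lo) < 0 < g(hi); halves the bracket around the sign change.
    for _ in range(steps):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

width = bisect(d_border_length, 1.0, 100.0)
height = 100 / width
print(width, height, border_length(width))
```

The search lands on a square wall, as the symmetry argument says it must.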

S: You're trying to tell me something, right? Okay, I have L = 2x + 2y just like before and xy = 100 just like before and ... isn't everything going to be the same?

P: Precisely! You'll get y = 10 and x = 10 and how could you get anything else? The problem is perfectly symmetrical in x and y ... so the answer must be symmetrical too ... so x and y have to be the same. You could have predicted this without any calculus at all. In fact, if you lie on your side and look at the wall, the height looks like a width and the width looks like a height. In fact, if you were just talking about a rectangle, who's to say what's the height and what's the width? Now suppose you're building a closed box of side lengths x, y and z and suppose the volume had to be 100 metres³ and the amount of cardboard had to be minimized. Then you'd want the area of all six sides, namely A = 2xy + 2yz + 2zx, to be minimum and you'd want the volume to be xyz = 100. Now find the dimensions which minimize A.

S: They're all the same, right?

P: Sure! And I didn't even tell you which was the height or width or length. It makes no difference. These labels are ours. The math doesn't know height from width. If the math says "x = 5, y = 5 and z = 4" and you build the box then you have to decide what's the height and width and length. When you're finished I can turn the box on its side and now I've got a different height, width and length. Hence, unless there are several solutions to the problem, the answer must be symmetrical in x, y and z and that means a cubical box ... and that means "x = 5, y = 5 and z = 4" is wrong, you made a mistake ... and you don't need calculus to see that. Can you imagine solving this problem for a box manufacturer and you tell the boss "x = 5, y = 5 and z = 4" and he says you're all wet because it's symmetrical so x = y = z and he didn't even graduate from high school! See? Don't rely too heavily on the math: it isn't as smart as people.

S: Can I solve that problem? I mean the box problem. We've got too many variables, I mean ...

P: I'm glad you asked that question ...

Whereas we've reproduced many of the ideas of 1-variable calculus, generalizing to higher dimensions, one we haven't tackled is the optimization or max/min problem. We'll do the "box problem" to see what difficulties arise:

**Example:** A 100 _m_³ box is to be made with minimum cardboard. What are its dimensions?

**Solution:** If the side lengths are x, y and z then the amount of cardboard is A = 2xy + 2yz + 2zx, but these variables are related by xyz = 100. If we put $z = \frac{100}{xy}$ then we still have $A = 2xy + 2y\,\frac{100}{xy} + 2x\,\frac{100}{xy}$ which is a function of two variables. _Any_ (positive) values for x and y will give a box, and the volume will definitely be 100 _metres_³ if we choose $z = \frac{100}{xy}$, so x and y are definitely "independent" variables ... so what's the best choice for x and y ... so A is a minimum?

We stare at $A = 2xy + \frac{200}{x} + \frac{200}{y}$ and realize that it's a surface and what we want is the lowest point on this surface ... where we must ignore any parts of the surface where x < 0 or y < 0.

S: It's symmetrical in x and y so x = y!

P: Yes, but we want to generate a methodology for solving this type of problem and they won't all be symmetrical. In fact, if one side of our box was open, then that'd destroy the symmetry.

What characterizes the minimum point on this surface, $A = 2xy + \frac{200}{x} + \frac{200}{y}$? Perhaps we should first agree on what we _mean_ by a "minimum" and maybe that'll help. By a minimum we mean a point P(a,b) where, if x and/or y change, then A(x,y) gets larger ... or at least doesn't get smaller. That means the surface is flat at a minimum point. We're in a valley ... mountains all around ... every direction is up. How can we say that mathematically?

S: The tangent line is horizontal ... every tangent line is horizontal ... I mean the tangent PLANE is horizontal!

P: And how do we say that ... mathematically?

S: The normal points straight up!

P: And how do we say that ... mathematically?

S: Uh ... why not compute the normal vector and see where it points straight up ... at what (x,y) points?

P: Not the normal vector ... but a normal vector. Remember, there are an infinite number of them, although they all point in the same direction ... or the opposite direction. When we compute a normal vector using partial derivatives, we're just getting one normal. What we want is for it to point up ... or down ... it doesn't matter which. Wherever this happens we have a candidate for a maximum or a minimum.

For $A = 2xy + \frac{200}{x} + \frac{200}{y}$ a normal vector is $\left[\frac{\partial A}{\partial x}, \frac{\partial A}{\partial y}, -1\right] = \left[2y - \frac{200}{x^2},\ 2x - \frac{200}{y^2},\ -1\right]$ and it must have zero x- and y-components in order to point up or down. (In fact we have little choice; the z-component is -1 so this normal points down whether we like it or not!) Hence we want $2y - \frac{200}{x^2} = 0$ and $2x - \frac{200}{y^2} = 0$ and that gives us two equations in two unknowns to solve for x and y (and if there's more than one solution we're in trouble!).

We have $y = \frac{100}{x^2}$ from the first and $x = \frac{100}{y^2}$ from the second and substituting "y" into the second gives

$x = \frac{x^4}{100}$ or $x(x^3 - 100) = 0$ so either x = 0 or $x = 100^{1/3}$. Clearly x = 0 will not provide our minimum area (nor can it provide a volume of 100 _m_³!). The solution is $x = 100^{1/3}$. Since $y = \frac{100}{x^2}$, that gives $y = 100^{1/3}$ as well. Finally, since $z = \frac{100}{xy}$, that gives $z = y = 100^{1/3}$ as well. (Now we check to see that xyz = 100 ... which it is.)
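It costs little to "check it out" numerically. Taking the cardboard area as $A = 2xy + \frac{200}{x} + \frac{200}{y}$ (that is, with z = 100/(xy) already substituted), the sketch below confirms that both partial derivatives vanish at $x = y = 100^{1/3}$ and, as a crude sanity check only, that a handful of nearby boxes all need more cardboard. The function names are ours.

```python
def area(x, y):
    # Cardboard for a closed box of volume 100, with z = 100/(x*y) eliminated.
    return 2 * x * y + 200 / x + 200 / y

def area_x(x, y):
    # Partial derivative of the area with respect to x.
    return 2 * y - 200 / x**2

def area_y(x, y):
    # Partial derivative of the area with respect to y.
    return 2 * x - 200 / y**2

side = 100 ** (1 / 3)
print(side, side**3)                           # side length and the resulting volume
print(area_x(side, side), area_y(side, side))  # both should be essentially zero

# Crude check, not a proof: perturb x and y a little and compare areas.
nearby = [area(side + du, side + dv)
          for du in (-0.1, 0.0, 0.1)
          for dv in (-0.1, 0.0, 0.1)]
print(area(side, side) <= min(nearby))
```

A few sample points can't settle whether we really have a minimum, of course; that takes an argument about every direction at once.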

S: Why does the math give that x = 0 solution? I know that it doesn't make sense for our box because that'd make y = infinite. But the surface must look pretty weird, right?

P: Let's plot it. In fact, I won't plot $A = 2xy + \frac{200}{x} + \frac{200}{y}$, I'll plot $z = xy + \frac{1}{x} + \frac{1}{y}$. It'll have the same features. In fact, we'll first try to predict what it'll look like. When I intersect this surface by planes x = k, I get a curve $z = ky + \frac{1}{k} + \frac{1}{y}$ which, in the y-z plane, looks like the hyperbola $z = \frac{1}{y}$ when y is small, and like the line z = ky when y is large. As we move along the x-axis, slicing the surface by planes x = k with ever increasing k, the curve of intersection continues to look like $z = \frac{1}{y}$ when y is small, but the line z = ky has larger and larger slope. ("k" is the slope!) I predict it'll look like the graph below.

S: Okay, let's see it.

P: Note that the edges of the surface, in the diagram, are intersections with planes x = A or y = B (for certain constants A and B). The former is $z = Ay + \frac{1}{A} + \frac{1}{y}$ and the latter is $z = Bx + \frac{1}{x} + \frac{1}{B}$. Each has that characteristic flavour I mentioned ... a hyperbola when x or y is small and a straight line when they are large. Nice, eh?

S: So why does the math give x = 0 as a place where the normal is straight up? That was my original question, remember?

P: It doesn't. For $A = 2xy + \frac{200}{x} + \frac{200}{y}$ the normal vector is $\left[2y - \frac{200}{x^2},\ 2x - \frac{200}{y^2},\ -1\right]$ and when either x or y is small it has a very large and negative x- or y-component so it's almost parallel to the x-y plane. In other words, since the surface rises very steeply, the normal vector is almost horizontal.

S: But when you put these x- and y-components equal to zero ... so the normal would go straight up ... or straight down ... and you solved the equations, you got x = 0!

P: And y = ∞! That's NOT a solution!

S: But why did the math do that to us ... uh, to you?

P: Okay. We asked "At what points (x,y) are $2y - \frac{200}{x^2} = 0$ and $2x - \frac{200}{y^2} = 0$?" In trying to solve these two equations we got $x = \frac{x^4}{100}$ which seems to imply a solution x = 0 ... but that's NOT a solution to these two equations. I know, I know ... why did the math do it to us? If you want to see something even scarier, we could write these two equations in the form $x^2 y = 100$ and $x y^2 = 100$ so that means that $x^2 y = x y^2$ which can be written xy (x - y) = 0 so x = 0 OR y = 0 OR x = y. It's only the last that gives a solution to the original equations! You have to be careful when you manipulate equations. You could introduce "solutions" which don't satisfy the original equations. If we think that xy (x - y) = 0 has the same solutions as the pair of equations $2y - \frac{200}{x^2} = 0$, $2x - \frac{200}{y^2} = 0$ then we're mistaken. Of course some students would get more solutions, like x = 0, y = ∞ and they'd justify these solutions by substituting into $2y - \frac{200}{x^2} = 0$ and get $2(\infty) - \frac{200}{0} = \infty - \infty = 0$ so it satisfies that one, then they'd substitute into $2x - \frac{200}{y^2} = 0$ and get $2(0) - \frac{200}{\infty} = 0 - 0$ (because everybody knows that $\frac{1}{\infty} = 0$) and it satisfies that one too. That's why I like to "check it out" when I get a solution. It has to be reasonable. Maybe I can check the dimensions. Maybe I can estimate the answer. Maybe I can solve the problem in a different manner. Maybe ...
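The warning about manufactured "solutions" can be made concrete. The sketch below tests a few candidate points against the original pair $2y - \frac{200}{x^2} = 0$, $2x - \frac{200}{y^2} = 0$ and against the manipulated equation xy(x - y) = 0. The candidate points and the tolerance are our own choices; a division by zero is treated as "not a solution".

```python
def satisfies_original(x, y, tol=1e-9):
    # The original equations are not even defined when x = 0 or y = 0.
    try:
        return abs(2 * y - 200 / x**2) < tol and abs(2 * x - 200 / y**2) < tol
    except ZeroDivisionError:
        return False

def satisfies_manipulated(x, y, tol=1e-9):
    # x^2 y = x y^2 rewritten as x y (x - y) = 0.
    return abs(x * y * (x - y)) < tol

side = 100 ** (1 / 3)
candidates = [(0.0, 5.0), (3.0, 0.0), (2.0, 2.0), (side, side)]
for x, y in candidates:
    print((x, y), satisfies_manipulated(x, y), satisfies_original(x, y))
```

Every candidate passes the manipulated equation, but only the last one satisfies the equations we actually started with. Even x = y is not enough by itself: equating $x^2 y$ with $x y^2$ throws away the "= 100", so it has to be put back in.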

S: Okay ... gotcha.

P: Before we forget, let's indicate how we solved this optimization problem:

To maximize or minimize z = f(x,y), we looked for points where the NORMAL vectors are parallel to the z-axis (pointing straight up or down). Since the normal vector is $\left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, -1\right]$, we look at points (x,y) where the x- and y-components are zero (and that will give only a z-component and it will then point up or down). That is, we solve the two equations in two unknowns: $\frac{\partial f}{\partial x} = 0$, $\frac{\partial f}{\partial y} = 0$. We make a note of this:

To maximize or minimize f(x,y), look for points where $\frac{\partial f}{\partial x} = \frac{\partial f}{\partial y} = 0$

S: Just "look at points"? That's all?

P: If we solve these two equations and find a solution (x0, y0) we can't guarantee that we have a minimum or maximum.

S: But if the normal points up ...

P: Let me give you an example. Remember the hyperbolic paraboloid? Its equation is $z = f(x,y) = x^2 - y^2$. Now we solve $\frac{\partial f}{\partial x} = 2x = 0$ and $\frac{\partial f}{\partial y} = -2y = 0$ and we get the point (0,0). The normal does indeed point up ... actually it points down ... I have to remember that ... it's $\left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, -1\right]$ and the z-component is negative meaning it points down. Anyway, if you stand at the origin and move in the x-direction you go UP the surface but if you move in the y-direction you go DOWN the surface. It's flat at the origin but you don't have either a maximum or a minimum.

S: How about a picture?

P: Okay, but look at the origin. It's a so-called SADDLE POINT. Go in one direction and the surface goes down. Go in an orthogonal direction and it goes up. It's neither a maximum nor a minimum.

S: Is there some kind of test? A first derivative test or something?

P: We can, of course, look at the partial derivative with respect to x, that's $\frac{\partial f}{\partial x} = 2x$, and notice that it's negative for x < 0 and positive for x > 0 but that only tells us that the surface comes down then goes up as x increases. The other partial, $\frac{\partial f}{\partial y} = -2y$, is first positive (for y < 0) then negative (for y > 0) so the surface goes up then down, as y increases. That tells us that we have neither a maximum nor a minimum at the origin. We're lucky. Had they both given us curves which are concave up, and that would be the case if $\frac{\partial^2 f}{\partial x^2} > 0$ and $\frac{\partial^2 f}{\partial y^2} > 0$ at x = y = 0, then we still couldn't tell if we had a minimum. It only says that the surface curves up when you leave the origin in either the positive or negative x- or y-directions ... say east-west or north-south. But what about all the other directions?

S: Are there such surfaces?

P: Sure. Just imagine laying a bedsheet on the x-y plane and imagine the x axis going east-west and the y-axis going north-south. Then get four people standing somewhere along the positive and negative x- and y-axes: east and west and north and south of the origin. Now get them to lift the sheet. See? The surface you get will curve up when you leave the origin in either a north-south or east-west direction. Now let the entire x-y plane fall away (except the x- and y-axes of course, else we'll lose our four helpers). Now the sheet will still curve up in the direction of the axes, but will curve DOWN if you head, say, north-west or south-west and so on.

S: A picture is worth ...

S: So you're telling me that there isn't any test for a maximum or minimum. That I have to just look to see what happens to the surface when I go east-west or north-south. Is that it?

P: As a matter of fact there is a test ... a "second derivative test" ... which sometimes works, but I'd rather you didn't use it so I won't tell you what it is. What I want you to do is inspect the surface near the points where the partial derivatives are zero. See if it goes up in every direction (a minimum)... or down in every direction (a maximum) ... or up in some directions and down in others (a saddle point).

S: But how can I consider every direction, all at once?

P: You could start at (x0, y0), a place where $\frac{\partial f}{\partial x} = 0$ and $\frac{\partial f}{\partial y} = 0$, then you can move in some θ-direction using the parametric equations for a line in 2-D, namely x = x0 + s cos θ, y = y0 + s sin θ, then see how f(x,y) behaves as s increases or decreases. If, for every angle θ, the value of f(x,y) increases, you've got a minimum. See?

S: Sort of ... but isn't that complicated?

P: Sort of, but only because of the terms s cos θ and s sin θ. It's easier if you just put x = x0 + u, y = y0 + v and see if every "u" and "v" will make z increase. But you only have to consider very small values of "u" and "v" and that makes it easier because you can neglect u³ compared to u² and v³ compared to v² and so on. In other words, you can approximate the values of your function, when u and v are really small, to see if the values increase or decrease.

S: Sounds tough.

Example:

Examine $z = xy + \frac{1}{x} + \frac{1}{y}$ for a minimum, in x > 0, y > 0. (There is no maximum because z → ∞ as x → 0+.)

**Solution:** We set $\frac{\partial z}{\partial x} = y - \frac{1}{x^2} = 0$ and $\frac{\partial z}{\partial y} = x - \frac{1}{y^2} = 0$, solve and obtain the solution x = y = 1, in which case z = 3 is the expected minimum value of z. Now substitute x = 1 + u, y = 1 + v into z and get:

$z = (1+u)(1+v) + \frac{1}{1+u} + \frac{1}{1+v}$. Since we are only interested in the behaviour of z when u and v are very small (to see if z is always larger or smaller than 3), we can write $\frac{1}{1+u} = 1 - u + u^2 - u^3 + \dots$ and $\frac{1}{1+v} = 1 - v + v^2 - v^3 + \dots$ (obtained by dividing into 1, or by recognizing the sum of an infinite geometric series) and get:

$z = (1 + u + v + uv) + (1 - u + u^2 - \dots) + (1 - v + v^2 - \dots) = 3 + u^2 + uv + v^2 + \dots$

where the "..." indicates quantities which are very much smaller than u², uv and v² (like u³, for example). We conclude that, for u and v very small (meaning x and y are each near 1), z behaves like $3 + u^2 + uv + v^2$ so that it _does_ have the value z = 3 when u = 0 and v = 0 (meaning x = 1, y = 1). The question is: "Is $u^2 + uv + v^2$ always positive?" If so, it means that z > 3 for all points near u = 0, v = 0 (meaning x = 1, y = 1 gives a minimum). The answer to this question is "Yes". In fact, $u^2 + uv + v^2$ can be rewritten as $\left(u + \frac{v}{2}\right)^2 + \frac{3}{4}v^2$, where we've "completed the square". In this form we see $u^2 + uv + v^2$ as a sum of squares, so it is positive unless u = v = 0, so z does have values larger than 3 when u and v are small (meaning x and y are each near 1).

We conclude that x = y = 1 provides a minimum value for $z = xy + \frac{1}{x} + \frac{1}{y}$.
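The approximation in this example is easy to test numerically. With $z = xy + \frac{1}{x} + \frac{1}{y}$, the sketch below compares z(1+u, 1+v) - 3 with the quadratic $u^2 + uv + v^2$ for one small displacement, then walks around a small circle centred at (1,1) to see whether z exceeds 3 in every sampled direction. The radius and the number of sample directions are arbitrary choices of ours, and sampling is a check, not a proof.

```python
import math

def z(x, y):
    return x * y + 1 / x + 1 / y

def quadratic(u, v):
    # Dominant terms of z(1+u, 1+v) - 3 for small u and v.
    return u * u + u * v + v * v

def rises_everywhere(radius=0.01, samples=360):
    # Step a distance `radius` from (1, 1) in many directions theta.
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        u, v = radius * math.cos(theta), radius * math.sin(theta)
        if z(1 + u, 1 + v) <= 3:
            return False
    return True

u, v = 0.01, -0.02
print(z(1 + u, 1 + v) - 3, quadratic(u, v))
print(rises_everywhere())
```

The two printed numbers differ only by amounts of the size of u³ and v³, which is just what the "..." in the expansion promised.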

S: It's not always that easy, right? I mean, it's not always that easy to see how z behaves when u and v are small.

P: Let's do another example. But first, remember the scheme:

(1) Set $\frac{\partial f}{\partial x} = 0$, $\frac{\partial f}{\partial y} = 0$ and solve these two equations in two unknowns.

(2) Near each solution, (x0, y0), investigate the behaviour of f(x,y) by putting x = x0 + u, y = y0 + v and retaining only the dominant terms (ignoring terms such as u³ or u²v etc. which are small in comparison to u² or uv or v²).

(3) The expression for f will look like $f(x_0, y_0) + A u^2 + B uv + C v^2 + \dots$ where everything else has been ignored.

(4) Investigate $A u^2 + B uv + C v^2$, completing the squares or otherwise, to see if it's always positive for every u and v ... in which case you've got a minimum since then f(x,y) > f(x0, y0), or perhaps it's always negative (in which case you've got a maximum) or perhaps it's positive for some u,v and negative for others (so you have a SADDLE point).

Example:

Examine $f(x,y) = 5 - x^2 + 2x - y - \frac{y^4}{4}$ for extreme values (maximum or minimum).

Solution:

We set $\frac{\partial f}{\partial x} = -2x + 2 = 0$ and $\frac{\partial f}{\partial y} = -1 - y^3 = 0$ and get a single solution x = 1, y = -1 and we wish to know if

$f(1, -1) = \frac{27}{4}$ is a maximum or a minimum. Hence we put x = 1 + u, y = -1 + v (where u and v are the deviations of x and y from the values 1 and -1) and get $f(1+u, -1+v) = 5 - (1+u)^2 + 2(1+u) - (-1+v) - \frac{(-1+v)^4}{4}$. As before, we can simplify this considerably, for small values of u and v, by ignoring "small terms", so we expand and write:

$$f(1+u, -1+v) = \frac{27}{4} - u^2 - \frac{3}{2}v^2 + v^3 - \frac{1}{4}v^4$$

and now we ignore the $v^3$ and $v^4$ terms compared to the $v^2$ term (when u, v are small!) and get $f \approx \frac{27}{4} - u^2 - \frac{3}{2}v^2$ (when u, v are small!) so it's clear that f has values _smaller_ than $\frac{27}{4}$ for all small values of u and v, so $\frac{27}{4}$ is a maximum. In this case we didn't even have to "complete the squares" to see that $-u^2 - \frac{3}{2}v^2$ was always negative. If, however, a "uv" term appears, then "completing the squares" is almost essential.

S: Are you saying that A u2 \+ B uv + C v2 can never be negative, or maybe never positive?

P: Sure. Look at $u^2 + uv + v^2$, from an earlier example. It's always positive, regardless of what you plug in for u and v (except u = v = 0 of course, where it's zero). But $u^2 + 2uv - v^2$ is sometimes positive (try u = 2, v = 1, which gives 7) and sometimes negative (try u = 1, v = 3, which gives -2) and $-u^2 - uv - 2v^2$ is always negative. That means that when you put $x = x_0 + u$, $y = y_0 + v$ into f(x,y) and get something like $f(x,y) \approx f(x_0, y_0) + A u^2 + B uv + C v^2$ (after ignoring the smaller terms) then you can tell if $f(x,y) > f(x_0, y_0)$ or $f(x,y) < f(x_0, y_0)$ or neither of these just by investigating $A u^2 + B uv + C v^2$. Nice, eh?
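Completing the square in general gives $A u^2 + B uv + C v^2 = A\left(u + \frac{B}{2A}v\right)^2 + \frac{4AC - B^2}{4A}v^2$ (for A ≠ 0), so the sign of $4AC - B^2$ settles the question. The sketch below turns that into a small classifier; the function name `classify` and its return labels are made up for illustration and are not from the text.

```python
def classify(A, B, C):
    """Classify the quadratic form A*u^2 + B*u*v + C*v^2 for (u, v) != (0, 0).

    Based on completing the square:
        A*(u + B*v/(2A))^2 + ((4AC - B^2)/(4A)) * v^2
    """
    disc = 4 * A * C - B * B
    if disc > 0:
        # Both squares carry the sign of A, so the form never changes sign.
        return "minimum" if A > 0 else "maximum"
    if disc < 0:
        # The form takes both positive and negative values.
        return "saddle"
    # disc == 0: the quadratic terms alone cannot decide.
    return "inconclusive"


print(classify(1, 1, 1))     # u^2 + uv + v^2
print(classify(1, 2, -1))    # u^2 + 2uv - v^2
print(classify(-1, -1, -2))  # -u^2 - uv - 2v^2
```

The three calls correspond to the three forms mentioned in the dialogue: always positive, mixed sign, and always negative.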

S: Yeah, but what if you get terms with u and v, like maybe f(x,y) ≈ f(x0, y0) + A u + B v + ...?

P: You won't, trust me. In fact, if you remember the equation of the tangent plane to z = f(x,y) at $(x_0, y_0)$, it's

$$z = f(x_0, y_0) + \frac{\partial f}{\partial x}(x - x_0) + \frac{\partial f}{\partial y}(y - y_0)$$

and that's an approximation to the surface z = f(x,y) near $x = x_0$, $y = y_0$. However, the exact values of f(x,y) would really have to be written

$$z = f(x_0, y_0) + \frac{\partial f}{\partial x}(x - x_0) + \frac{\partial f}{\partial y}(y - y_0) + \dots$$

where the "..." contained everything you ignored and would have stuff involving $(x - x_0)^2$ and $(y - y_0)^2$ and $(x - x_0)(y - y_0)$ which, of course, is just $u^2$ and $v^2$ and $uv$ so we could write $z = f(x_0, y_0) + \frac{\partial f}{\partial x}u + \frac{\partial f}{\partial y}v + A u^2 + B uv + C v^2 + \dots$ where now the "..." includes stuff like $u^3$ and $u^2 v$ and even smaller terms, but if we now consider what happens when $(x_0, y_0)$ is a point where both partial derivatives are zero, then, near such a point we'd have:

$$z = f(x_0, y_0) + A u^2 + B uv + C v^2 + \dots$$

and you can see that there won't be any terms in "u" or "v". See? You won't believe this, but the above expression is really a Taylor series about $(x_0, y_0)$ and we're neglecting the higher terms and retaining only the quadratic terms, as an approximation, just to see whether f(x,y) is always larger or smaller than $f(x_0, y_0)$ near the point $(x_0, y_0)$. See? In fact, having said that, you might expect that the numbers A, B and C can be expressed in terms of the second derivatives of f(x,y) at the point $(x_0, y_0)$ and you could generate a "second derivative test" for a maximum or minimum by investigating the sign of $A u^2 + B uv + C v^2$. See?
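For reference, here is how the coefficients come out in the standard two-variable Taylor expansion. This is a sketch of the usual formula, not something derived in these notes, with all derivatives evaluated at $(x_0, y_0)$:

$$f(x_0 + u,\, y_0 + v) = f + \frac{\partial f}{\partial x}u + \frac{\partial f}{\partial y}v + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}u^2 + \frac{\partial^2 f}{\partial x\,\partial y}uv + \frac{1}{2}\frac{\partial^2 f}{\partial y^2}v^2 + \dots$$

$$\text{so}\qquad A = \frac{1}{2}\frac{\partial^2 f}{\partial x^2},\qquad B = \frac{\partial^2 f}{\partial x\,\partial y},\qquad C = \frac{1}{2}\frac{\partial^2 f}{\partial y^2}.$$

As a check against the first example, $z = xy + \frac{1}{x} + \frac{1}{y}$ has $\frac{\partial^2 z}{\partial x^2} = \frac{2}{x^3} = 2$, $\frac{\partial^2 z}{\partial x\,\partial y} = 1$ and $\frac{\partial^2 z}{\partial y^2} = \frac{2}{y^3} = 2$ at (1,1), giving A = B = C = 1, which agrees with $3 + u^2 + uv + v^2$.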

S: Hmmm.

P: Don't worry, I only say these things to indicate how everything hangs together so nicely, and in case you were wondering "what about Taylor series for functions of several variables?" because they're really useful in determining the local behaviour of functions and that's just what we need to investigate local maxima and minima. See?

S: Hmmm.

P: Well, it's time to move on ...

Least Squares Fit

It often happens that we make several measurements, plot the points on a graph and notice that they almost lie on a straight line. We might, for example, measure a person's height as a function of his/her age and we'd obtain pairs of numbers $(h_1, a_1)$, $(h_2, a_2)$, etc. where the height was $h_1$ when the age was $a_1$, and so on. Or perhaps we'd measure the temperature of some chemical solution as a function of time, obtaining a temperature $T_1$ at time $t_1$, etc. Or maybe the cost of production as a function of the number of items manufactured, getting $(C_1, N_1)$, $(C_2, N_2)$ etc.

In each case we'd have a number of points: (x1, y1), (x2, y2), etc., and we'd plot them and they might look like they should lie on some straight line. What line? Clearly we want to find the "best" line, giving the "best" fit to the data points ... which, of course, brings us to what we mean by "best".

A popular definition of "best line" is that line which "minimizes the squared error" ... whatever that means.

Suppose we have points $(x_1, y_1)$ etc. and we wish to find the "best" line: y = mx + k, meaning we must determine the "best" values for "m" and "k". At $x = x_1$, our data point has $y = y_1$ whereas the line has $y = m x_1 + k$ so the error is $(m x_1 + k - y_1)$. At the next point, $x = x_2$ and our data point has $y = y_2$ but the line (whatever line we invent!) has $y = m x_2 + k$ so the error here is $(m x_2 + k - y_2)$. If we add all the errors we'd get $(m x_1 + k - y_1) + (m x_2 + k - y_2) + \dots$ for as many data points as we have. Perhaps we can choose "m" and "k" so as to minimize this total error: it is, after all, a function of two variables (m,k) and we now know how to do that. Alas, this is a bad choice for our "best" line since some errors may be positive and some negative and they'd cancel even though the line was nowhere near the data points ... and we'd be very unhappy with a "minimum" error of $-\infty$ (and we couldn't do better than that minimum, could we?).

We need a better definition of "best" and it should be such that the error somehow gets larger as the line moves farther from the data points and we don't care whether a data point is below the line, giving a positive error (mx1+k - y1), or above the line where this error would be negative. So we take the squares of the errors and that's what we'll minimize.

This definition of "best" has several nice features:

(1) The error is never negative; its minimum value is zero which is exactly what we'd like for the minimum error (meaning every point lies precisely on the line ... if we manage to reduce the error to zero!).

(2) It makes no difference whether a point is 0.1 units below or above the line: it makes a _positive_ contribution to the error.

(3) Perhaps the most telling feature is that the math is easy! (We could also have chosen the sum of the _absolute values_ of each error, but the math would be messy.)

**Example:** Determine the line of "least squared error" to fit the points (.2,.6), (.6,.9), (.9,1.5) and (1.2,1.7).

Solution:

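If the line has equation y = mx + k, the four errors are:

m(.2) + k - .6, m(.6) + k - .9, m(.9) + k - 1.5 and m(1.2) + k - 1.7

and the sum of their squares is:

$$f(m,k) = (.2m + k - .6)^2 + (.6m + k - .9)^2 + (.9m + k - 1.5)^2 + (1.2m + k - 1.7)^2$$

which is a function of two variables m and k and we can set the two partial derivatives to zero. If we call the squared error f(m,k), then:

$$\frac{\partial f}{\partial m} = 2(.2m + k - .6)(.2) + 2(.6m + k - .9)(.6) + 2(.9m + k - 1.5)(.9) + 2(1.2m + k - 1.7)(1.2) = 0$$

and

$$\frac{\partial f}{\partial k} = 2(.2m + k - .6) + 2(.6m + k - .9) + 2(.9m + k - 1.5) + 2(1.2m + k - 1.7) = 0$$

and this gives two equations in two unknowns, m and k.

We divide each equation through by 2, then rewrite them as:

$$m\,(.2^2 + .6^2 + .9^2 + 1.2^2) + k\,(.2 + .6 + .9 + 1.2) = (.2)(.6) + (.6)(.9) + (.9)(1.5) + (1.2)(1.7)$$

and

$$m\,(.2 + .6 + .9 + 1.2) + 4k = .6 + .9 + 1.5 + 1.7.$$

We've left things as they occur, without multiplying, so we can see what to do when there are 44 or 144 data points, instead of just 4. In general, for "n" data points, we'd get:

$$m\,(x_1^2 + x_2^2 + \dots + x_n^2) + k\,(x_1 + x_2 + \dots + x_n) = x_1 y_1 + x_2 y_2 + \dots + x_n y_n$$

and

$$m\,(x_1 + x_2 + \dots + x_n) + n\,k = y_1 + y_2 + \dots + y_n.$$

Anyway, now that we see what the general equations are, let's proceed with our example.

We have:

$$2.65\,m + 2.9\,k = 4.05 \qquad (1)$$

and

$$2.9\,m + 4\,k = 4.7 \qquad (2)$$

so that, from (2), $k = \frac{4.7 - 2.9\,m}{4}$ which we substitute into (1) to get

$2.650\,m + 2.9\,\frac{4.7 - 2.9\,m}{4} = 4.050$ hence m = 1.174 and k = .324 and our "best" line is:

$$y = 1.174\,x + .324$$

The graph at the right shows how well the line fits the data.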

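Let's solve the "general" equations so we can use the results for future problems. To make things simpler, we'll use a short-hand SIGMA notation, writing

$$\sum x^2 = x_1^2 + x_2^2 + \dots + x_n^2, \qquad \sum x = x_1 + x_2 + \dots + x_n$$

and

$$\sum xy = x_1 y_1 + x_2 y_2 + \dots + x_n y_n, \qquad \sum y = y_1 + y_2 + \dots + y_n$$

so the general equations read:

$$m \sum x^2 + k \sum x = \sum xy \qquad \text{and} \qquad m \sum x + n\,k = \sum y.$$

We solve for m and k and get:

the "least squares" straight line fit is y = mx + k, where

$$m = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - \left(\sum x\right)^2} \qquad \text{and} \qquad k = \frac{\sum x^2 \sum y - \sum x \sum xy}{n \sum x^2 - \left(\sum x\right)^2}$$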

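For our example above, we might make up a table like so:

| $x^2$ | x | y | xy |
|---|---|---|---|
| .04 | .2 | .6 | .12 |
| .36 | .6 | .9 | .54 |
| .81 | .9 | 1.5 | 1.35 |
| 1.44 | 1.2 | 1.7 | 2.04 |
| $\sum x^2 = 2.65$ | $\sum x = 2.9$ | $\sum y = 4.7$ | $\sum xy = 4.05$ |

where the middle two columns are the given data points.

Then $m = \frac{4(4.05) - (2.9)(4.7)}{4(2.65) - (2.9)^2} = \frac{2.57}{2.19} = 1.174$ and

$k = \frac{(2.65)(4.7) - (2.9)(4.05)}{4(2.65) - (2.9)^2} = \frac{.71}{2.19} = .324$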
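The formulas for m and k translate directly into a few lines of code. This is a minimal sketch; the function name `least_squares_line` and the error raised for a zero denominator are choices made here for illustration, not part of the notes.

```python
def least_squares_line(points):
    """Return (m, k) for the least-squares line y = m*x + k through points."""
    n = len(points)
    sx = sum(x for x, _ in points)        # sum of x
    sy = sum(y for _, y in points)        # sum of y
    sxx = sum(x * x for x, _ in points)   # sum of x^2
    sxy = sum(x * y for x, y in points)   # sum of x*y
    denom = n * sxx - sx * sx
    if denom == 0:
        # Happens only when every x-value is the same (assumed error handling).
        raise ValueError("all x-values are equal; no unique line")
    m = (n * sxy - sx * sy) / denom
    k = (sxx * sy - sx * sxy) / denom
    return m, k


m, k = least_squares_line([(0.2, 0.6), (0.6, 0.9), (0.9, 1.5), (1.2, 1.7)])
print(round(m, 3), round(k, 3))  # 1.174 0.324
```

With 44 or 144 data points only the list passed in changes; the four sums are accumulated in exactly the same way.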

S: Whew! I hope you don't expect me to memorize all that!

P: No, but you should be able to generate those equations on your own. After all, it's just squaring each error, adding them all, then a couple of partial derivatives then solving just two equations in two unknowns. Besides, look how nice the result is. If the data points are measuring temperature at various times ... that is, $y_1$ and $y_2$ etc. are measured in degrees ... and $x_1$, $x_2$ etc. are in minutes so you're measuring temperatures every few minutes ... then each term in the equation y = mx + k must have the same dimensions, namely degrees (the same as y), so "k" is in degrees and so is "mx" and that means that "m" is measured in degrees/minute (it is, after all, $m = \frac{dy}{dx}$ = the rate of change of y with respect to x). Now look at each ∑ in the equations above:

$\sum x^2$ is in minutes², $\sum x$ is in minutes, $\sum y$ is in degrees and $\sum xy$ is in minute-degrees.

Then $k = \frac{\sum x^2 \sum y - \sum x \sum xy}{n \sum x^2 - (\sum x)^2}$ has the dimensions of $\frac{\text{minutes}^2 \cdot \text{degrees}}{\text{minutes}^2}$ = degrees, as it should, and

$m = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2}$ has the dimensions of $\frac{\text{minute-degrees}}{\text{minutes}^2} = \frac{\text{degrees}}{\text{minute}}$, as it should.

S: But what if you only have two points, like (x1, y1) and (x2, y2). Then there is a line which goes right through those points. Will the "least squares " line do that ... I mean go through those two points?

P: Yes. You'll get $m = \frac{y_2 - y_1}{x_2 - x_1}$ which is the appropriate slope, and you also get the appropriate k. In fact, the line you get can be written $\frac{y - y_1}{x - x_1} = \frac{y_2 - y_1}{x_2 - x_1}$ as you'd expect.

S: But you'd be in big trouble if the denominator is zero, right? I mean, suppose you had points which made $n \sum x^2 - \left(\sum x\right)^2 = 0$. Then m has a zero denominator and so does k. Right?

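P: It won't happen, as long as your x-values aren't all the same. In fact the denominator is then always greater than zero. In fact, if you look at the equations for "m" and "k" and divide numerator and denominator by n² you see things which look like averages. For example, you'll see

$\frac{1}{n}\sum x = \frac{x_1 + x_2 + \dots + x_n}{n}$ which is the average of the x-values, and $\frac{1}{n}\sum x^2$ which is the average value of $x^2$ and

$\frac{1}{n}\sum xy$, the average value of xy and so on. If we give them names, like $\frac{1}{n}\sum x = \langle x \rangle$ and $\frac{1}{n}\sum x^2 = \langle x^2 \rangle$ and $\frac{1}{n}\sum xy = \langle xy \rangle$ and so on, then after we divide numerator and denominator by n² we can write:

$$m = \frac{\langle xy \rangle - \langle x \rangle \langle y \rangle}{\langle x^2 \rangle - \langle x \rangle^2} \qquad \text{and} \qquad k = \frac{\langle x^2 \rangle \langle y \rangle - \langle x \rangle \langle xy \rangle}{\langle x^2 \rangle - \langle x \rangle^2}$$

Now you won't believe this but $\langle x^2 \rangle - \langle x \rangle^2 = \frac{1}{n}\sum x^2 - \left(\frac{1}{n}\sum x\right)^2$ is nobody else but $\frac{1}{n}\sum \left(x - \langle x \rangle\right)^2$.

That is, $\frac{1}{n}\sum x^2 - \left(\frac{1}{n}\sum x\right)^2 = \frac{1}{n}\sum \left(x - \langle x \rangle\right)^2$ so it's a sum of squares so it's always positive ... and never zero (unless every x-value equals the average, meaning all the x-values are the same, and then no sensible line y = mx + k exists anyway). I leave that as an exercise for the student ... and guess who's the student?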
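Before proving the identity, it can be reassuring to test it on numbers. The sketch below uses the x-values from the worked example; the variable names are chosen here for illustration.

```python
# Check numerically that <x^2> - <x>^2 equals the average of (x - <x>)^2.
xs = [0.2, 0.6, 0.9, 1.2]   # x-values from the worked example
n = len(xs)

mean_x = sum(xs) / n                      # <x>
mean_x2 = sum(x * x for x in xs) / n      # <x^2>

lhs = mean_x2 - mean_x ** 2
rhs = sum((x - mean_x) ** 2 for x in xs) / n

print(lhs, rhs)
```

Both sides come out equal (up to rounding) and positive, because the x-values are not all the same.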

S: But what if the data points don't look like they lie on a line?

P: Then assume some other function instead of y = mx + k. For example, you might expect populations to grow exponentially, so you'd assume $y = K e^{bx}$ where K and b are constants. Maybe they look like they lie on a parabola, so you'd assume $y = a + b x^2$ where "a" and "b" are constants. In each case you'd find the sum of the squares of the errors and it would be a function of the constants K and b, or a and b, or whatever. Then you'd set the partial derivatives to zero and solve for these constants. See? There will be as many equations as there are unknown constants. Sometimes they'd be easy to solve, as was the case for the least squares line ... sometimes they'd be difficult.
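The parabola $y = a + b x^2$ is one of the easy cases: setting the two partial derivatives of $\sum (a + b x^2 - y)^2$ to zero gives $n\,a + b \sum x^2 = \sum y$ and $a \sum x^2 + b \sum x^4 = \sum x^2 y$, the same pair of equations as for the line with $x^2$ playing the role of x. A minimal sketch follows; the function name `fit_a_plus_bx2` and the test data are invented for illustration.

```python
def fit_a_plus_bx2(points):
    """Least-squares fit of y = a + b*x^2; returns (a, b)."""
    n = len(points)
    s2 = sum(x ** 2 for x, _ in points)         # sum of x^2
    s4 = sum(x ** 4 for x, _ in points)         # sum of x^4
    sy = sum(y for _, y in points)              # sum of y
    s2y = sum(x ** 2 * y for x, y in points)    # sum of x^2 * y
    denom = n * s4 - s2 * s2
    b = (n * s2y - s2 * sy) / denom
    a = (s4 * sy - s2 * s2y) / denom
    return a, b


# Made-up data lying exactly on y = 1 + 2x^2, so the fit should recover a = 1, b = 2.
a, b = fit_a_plus_bx2([(0, 1), (1, 3), (2, 9), (3, 19)])
print(a, b)
```

The exponential $y = K e^{bx}$ is harder because b sits inside the exponent, so the two equations are no longer linear in the unknowns.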

a difficult Example:

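Data points (.1, 10), (.2, 16), (.3, 25) are given. Find equations for the least squares exponential fit: $y = K e^{bx}$.

Solution:

The sum of the squared errors is $E(K,b) = (K e^{.1b} - 10)^2 + (K e^{.2b} - 16)^2 + (K e^{.3b} - 25)^2$, a function of the two variables K and b. We set

$$\frac{\partial E}{\partial K} = 2(K e^{.1b} - 10)\,e^{.1b} + 2(K e^{.2b} - 16)\,e^{.2b} + 2(K e^{.3b} - 25)\,e^{.3b} = 0$$

and

$$\frac{\partial E}{\partial b} = 2(K e^{.1b} - 10)(.1 K e^{.1b}) + 2(K e^{.2b} - 16)(.2 K e^{.2b}) + 2(K e^{.3b} - 25)(.3 K e^{.3b}) = 0.$$

These terrible equations can be simplified somewhat by letting $e^{.1b} = x$ so $e^{.2b} = x^2$ and $e^{.3b} = x^3$ (so we're lucky the x-values were equally spaced!). Then we still get two terrible equations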

and .

These look like $K\,p_5 = p_2$ (1) and $K\,p_3 = p_0$ (2) where $p_5$ is a polynomial of degree 5, $p_2$ is a poly of degree 2, and so on. To eliminate K, multiply (1) by $p_3$ and (2) by $p_5$ and subtract, giving $p_3\,p_2 - p_5\,p_0 = 0$, a polynomial of degree 5 in the variable $x = e^{.1b}$. To solve this we could use Newton's Method.
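Here is one way the computation might be carried out. It is a sketch worked out independently from the conditions $\frac{\partial E}{\partial K} = 0$ and $\frac{\partial E}{\partial b} = 0$, and its algebra may be arranged differently from the polynomials named above: with $x = e^{.1b}$ the two conditions reduce to $K(x + x^3 + x^5) = 10 + 16x + 25x^2$ and $K(x + 2x^3 + 3x^5) = 10 + 32x + 75x^2$, and eliminating K (then cancelling a factor of x) leaves $16x^4 - 5x^3 - 40x - 16 = 0$. The starting guess 1.6 is an assumption, suggested by the data ratios 16/10 and 25/16.

```python
import math

def g(x):
    # Polynomial left after eliminating K from the two least-squares conditions.
    return 16 * x**4 - 5 * x**3 - 40 * x - 16

def g_prime(x):
    return 64 * x**3 - 15 * x**2 - 40

def newton(f, fprime, x, steps=50, tol=1e-14):
    """Newton's Method: repeat x <- x - f(x)/f'(x) until the step is tiny."""
    for _ in range(steps):
        dx = f(x) / fprime(x)
        x -= dx
        if abs(dx) < tol:
            break
    return x

x = newton(g, g_prime, 1.6)          # x = e^{0.1 b}
b = 10 * math.log(x)                 # recover b from x
K = (10 + 16 * x + 25 * x**2) / (x + x**3 + x**5)

print(x, b, K)
# Fitted values at the three data points, to compare with 10, 16 and 25.
print([K * math.exp(b * t) for t in (0.1, 0.2, 0.3)])
```

The fitted values land close to the data, which is a useful sanity check on both the algebra and the root that Newton's Method found.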

S: That's awful!

P: Why? Because it's a lot of work or because you don't understand what's happening?

S: Because it's a lot of work.

P: I'm glad you said that. But remember, if you want to apply mathematics to real-world problems you'll often need some functions which describe, say, the profit as a function of number of items manufactured and time of year --- and so on. Where do you think these functions come from? Nobody will walk up to you and say: "Please maximize my profit: $P(x,y) = 5 - x^2 + 2x - y - \frac{y^4}{4}$." You only find these ridiculous problems in text books.

S: Like this one?

P: Well ... yes. But in real life, you'll need to derive some functions, often from observations --- the data points we've considered above --- and that often means finding the "best fit" --- and that may take time but you can have some faith in the function because it's the best fit to actual observations.

.......

Anyway, you may be interested to know that this course is over.

S: Hmmm ... too bad.

SOLUTIONS TO "ASSORTED PROBLEMS"

1. At the point of inflection, t = 2 and F(2) = K/2, hence K = 2 F(2) = 2 (10,000) = 20,000 fish. To simplify, let y = so the growth is described by: y(t) = where y0 = . We have the following :

When t = 2 then y(2) = = = so . When t = 3 then

y(3) = = = so . We now have two equations in two unknowns.

Solve for er = 3, hence r = _ln_ 3, and y0 = , hence F(0) = K y0 = 20,000(1/10) = 2,000 fish (at t=0).

2. (a) The DE is LINEAR, so write it in "standard form": $\frac{dy}{dx} + \frac{3}{x}\,y = -2$. Now evaluate $\int \frac{3}{x}\,dx = 3 \ln x$ and multiply the DE by the "integrating factor" $e^{3 \ln x} = e^{\ln x^3} = x^3$ and get $x^3 \frac{dy}{dx} + 3x^2 y = -2x^3$ and the left-side is exactly $\frac{d}{dx}\left(x^3 y\right)$ ... (meaning that we've calculated the integrating factor correctly!). The DE now reads: $\frac{d}{dx}\left(x^3 y\right) = -2x^3$ so $x^3 y = -\frac{1}{2}x^4 + C$ ... integrating both sides. The solutions are: $y = -\frac{x}{2} + \frac{C}{x^3}$.
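A solution family like this is easy to sanity-check by plugging it back into the DE. The sketch below assumes the standard form is $\frac{dy}{dx} + \frac{3}{x}y = -2$ (which is what the integrating-factor step $x^3 y' + 3x^2 y = -2x^3$ implies) and estimates the derivative with a central difference; the function names and sample values of C are illustrative only.

```python
def y(x, C):
    # Candidate solution family y = -x/2 + C/x^3.
    return -x / 2 + C / x**3

def residual(x, C, h=1e-6):
    """Left side minus right side of y' + (3/x) y = -2, with y' estimated numerically."""
    dydx = (y(x + h, C) - y(x - h, C)) / (2 * h)
    return dydx + 3 * y(x, C) / x + 2

# Try a few arbitrary constants and points; each residual should be close to zero.
for C in (-1.0, 0.0, 2.5):
    for x in (0.5, 1.0, 2.0):
        print(C, x, residual(x, C))
```

Residuals near zero for every C confirm that the constant of integration really is free.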

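(b) The DE is LINEAR so we multiply by $\exp\left(\int 2x\,dx\right) = e^{x^2}$ and get:

$e^{x^2} \frac{dy}{dx} + 2x\,e^{x^2} y = x\,e^{x^2}$ or $\frac{d}{dx}\left(e^{x^2} y\right) = x\,e^{x^2}$, hence $e^{x^2} y = \frac{1}{2} e^{x^2} + C$ so the solutions are: $y = \frac{1}{2} + C e^{-x^2}$.

**Note:** The DE is also "separable": write $\frac{dy}{dx} + 2xy = x$ as $\frac{dy}{1 - 2y} = x\,dx$ and integrate each side, giving: $-\frac{1}{2} \ln|1 - 2y| = \frac{x^2}{2} + C$ which we could (if we really wanted to) solve for $y = \frac{1}{2} + K e^{-x^2}$ (as before).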

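3. Differentiate the DE to get $10x + 4y + 4x\frac{dy}{dx} + 4y\frac{dy}{dx} = 0$ or $\frac{dy}{dx} = -\frac{5x + 2y}{2x + 2y}$. The Orthogonal Trajectories will satisfy $\frac{dy}{dx} = \frac{2x + 2y}{5x + 2y}$.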

4. f(x,y) = constant gives $1 - x^2 + y^2$ = constant which is the same as: $y^2 - x^2$ = constant, a family of hyperbolas (diagram at right).

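5. If $z = x^2 + xy$ then, when x = 1, y = 2, we have z = 3 and $\frac{\partial z}{\partial x} = 2x + y = 4$ and $\frac{\partial z}{\partial y} = x = 1$. The tangent plane is then (4)(x-1) + (1)(y-2) + (-1)(z-3) = 0, or 4x + y - z = 3 ... where we've used the standard equation

$\frac{\partial f}{\partial x}(x - a) + \frac{\partial f}{\partial y}(y - b) + (-1)(z - f(a,b)) = 0$ for the tangent plane to z = f(x,y) at (a,b).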
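A quick numerical comparison shows the plane hugging the surface near the point of tangency. This is just an illustrative sketch; the function names and the nearby test point are chosen here.

```python
def surface(x, y):
    return x * x + x * y      # z = x^2 + xy

def plane(x, y):
    return 4 * x + y - 3      # tangent plane 4x + y - z = 3, solved for z

# At the point of tangency the two agree exactly.
print(surface(1, 2), plane(1, 2))

# A small step away (u = 0.01, v = 0.02) the gap is only of second order: u^2 + u*v.
gap = surface(1.01, 2.02) - plane(1.01, 2.02)
print(gap)
```

The gap is of the size $u^2 + uv$, the same kind of quadratic leftover used earlier to classify maxima and minima.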

6.

(a) . Discussion: the exponential function πn is much larger than nπ, when n is large (... think of 3x versus x3) so the series looks like and this converges ... even if we sum the absolute values (... think of ∑ which behaves like ∑ e-n where the terms will certainly go to zero quickly enough for convergence) ... so we omit the alternating series test and consider, directly, whether the original series converges absolutely ... so we test , comparing it to . In order effect this comparison we need the ratio of terms to approach "1", so consider = 1 - . To have a limit of "1" we need = 0. This has the form so we can use l'Hopital's rule: = . We continue applying l'Hopital's rule until we get = and now we have the form so the limit is "0" ... and we conclude that will converge (or diverge) depending upon whether converges (or diverges), but this is a geometric series with common ratio which is less than 1 so it converges, so converges, so converges absolutely.

(b) $\sum e^n$. Discussion: we write this as $e + e^2 + e^3 + \dots$ and recognize a geometric series with common ratio "e" which is larger than "1", so our series diverges. (Write out a few terms ... it's often useful.)

(c) . Discussion: for large "n", the terms look like = hence they don't have a limit of zero so the series diverges. To get "full marks", write = = ≠ 0 hence series diverges.

(d) . Discussion: the terms of this series are even smaller than those of ∑ which converges (it's a geometric series with common ratio < 1), hence the partial sums are increasing and bounded by hence the partial sums converge ... and that's the definition of convergence for the original series (and since every term is positive, it clearly converges absolutely!).

(e) . Discussion: for large "i" the terms look like = so they don't have a limit of zero. Consider, then, = which has the form , so the limit is ∞, so the series diverges.

(f) . The terms are + + ... = 0 + 0 + ... so the series converges to 0. (It pays to write out a few terms if you don't have a feel for what the terms look like!)

(g) . The terms are - + - + + - ... so it's the alternating harmonic series ... and since the terms _decrease to zero_ , the series converges ... but not absolutely because + + + + ... is the divergent harmonic series.

7. (a) = + C gives: - e-y = + C or y = - _ln_

(b) = = gives _ln_ || = 2x+C or y =

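(c) $\int \frac{dy}{y - 1} = \int x\,dx$ gives $\ln|y - 1| = \frac{x^2}{2} + C$ or $y = 1 + K e^{x^2/2}$ (where $K = \pm e^C$)

(d) $\int \frac{dy}{y} = 2 \int \frac{dx}{x}$ gives $\ln|y| = 2 \ln|x| + C = \ln x^2 + C$ or $|y| = e^C x^2$ or $y = K x^2$ (where $K = \pm e^C$)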

8. (a) = = = - cot t = - 1 when t = π/4 and

L = = 2 = 4π (the circumference of a circle of radius 2)

(b) = =- tan t = -1 at t = π/4

and L = 4

= 12a = 12a

= 12a dt = 12 a [ ]= 6

(c) = = - cot t = 0 at t = π/2

and L =


9.

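(a) f(a) = sin a = 0; f'(a) = cos a = 1; f''(a) = -sin a = 0; f'''(a) = -cos a = -1; $f^{(4)}(a)$ = sin a = 0; $f^{(5)}(a)$ = cos a = 1; $f^{(6)}(a)$ = -sin a = 0

Then $P_6(x) = f(a) + f'(a)(x-a) + f''(a)\frac{(x-a)^2}{2!} + f'''(a)\frac{(x-a)^3}{3!} + f^{(4)}(a)\frac{(x-a)^4}{4!} + f^{(5)}(a)\frac{(x-a)^5}{5!} + f^{(6)}(a)\frac{(x-a)^6}{6!}$

gives $\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} + \dots$

(b) Use $f^{(n)}(a)$ (from part (a)), evaluate at a = π/2, and get: 1, 0, -1, 0, 1, 0, -1

Then $\sin x = 1 - \frac{(x - \pi/2)^2}{2!} + \frac{(x - \pi/2)^4}{4!} - \frac{(x - \pi/2)^6}{6!} + \dots$

Note: put $x = t + \frac{\pi}{2}$ and get $\sin\left(t + \frac{\pi}{2}\right) = \cos t = 1 - \frac{t^2}{2!} + \frac{t^4}{4!} - \frac{t^6}{6!} + \dots$

(c) $f^{(n)}(a) = e^a = e$ (for a = 1) so we get:

$e^x = e + e\,(x-1) + e\,\frac{(x-1)^2}{2!} + e\,\frac{(x-1)^3}{3!} + e\,\frac{(x-1)^4}{4!} + \dots$

Note: this gives $e^{x-1} = 1 + (x-1) + \frac{(x-1)^2}{2!} + \frac{(x-1)^3}{3!} + $ etc. (i.e. $e^t = 1 + t + \frac{t^2}{2!} + \frac{t^3}{3!} + \dots$ with t = x-1)

(d) f(a) = ln(1+a) = 0; f'(a) = 1/(1+a) = 1; f''(a) = $-1/(1+a)^2$ = -1; f'''(a) = $2/(1+a)^3$ = 2!; $f^{(4)}(a) = -3!/(1+a)^4$ = -3!;

Then $f(x) = f(a) + f'(a)(x-a) + f''(a)\frac{(x-a)^2}{2!} + f'''(a)\frac{(x-a)^3}{3!} + f^{(4)}(a)\frac{(x-a)^4}{4!} + \dots$

gives $\ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots$

Note: $\frac{1}{1+t} = 1 - t + t^2 - t^3 + \dots$ so

$\ln(1+x) = \int_0^x \frac{dt}{1+t} = \int_0^x \left(1 - t + t^2 - t^3 + \dots\right) dt = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots$ (very nice!)

(e) f(a) = tan a = 0; f'(a) = $\sec^2 a$ = 1; f''(a) = $2\sec^2 a \tan a$ = 0; f'''(a) = $2\sec^2 a\,(2\tan^2 a + \sec^2 a)$ = 2;

$f^{(4)}(a) = 8\sec^2 a \tan a\,(\tan^2 a + 2\sec^2 a)$ = 0; $f^{(5)}(a) = 8\sec^2 a\,(2\tan^4 a + 11\sec^2 a \tan^2 a + 2\sec^4 a)$ = 16;

which gives $\tan x = x + 2\,\frac{x^3}{3!} + 16\,\frac{x^5}{5!} + \dots$

Note: dividing $\sin x \approx x - \frac{x^3}{3!} + \frac{x^5}{5!}$ by $\cos x \approx 1 - \frac{x^2}{2!} + \frac{x^4}{4!}$ gives $\tan x \approx x + \frac{x^3}{3} + \frac{2}{15}x^5$,

the same polynomial approximation as we obtained above.

(f) f(a) = arcsin a = 0; f'(a) = $1/\sqrt{1 - a^2}$ = 1; f''(a) = $a/(1 - a^2)^{3/2}$ = 0; f'''(a) = $(1 + 2a^2)/(1 - a^2)^{5/2}$ = 1;

giving $\arcsin x = x + \frac{x^3}{3!} + \dots$

(g) f(a) = $\sqrt{1 + a}$ = 1; f'(a) = $1/(2\sqrt{1 + a})$ = 1/2; f''(a) = $-1/(4(1+a)^{3/2})$ = -1/4; f'''(a) = $3/(8(1+a)^{5/2})$ = 3/8;

giving $\sqrt{1 + x} = 1 + \frac{1}{2}x - \frac{1}{4}\,\frac{x^2}{2!} + \frac{3}{8}\,\frac{x^3}{3!} + \dots$

Note: the terms in the polynomial approximation give the so-called "Binomial Expansion" of $(1+x)^{1/2}$.

(h) f(a) = sinh a = 0; f'(a) = cosh a = 1; f''(a) = sinh a = 0; f'''(a) = cosh a = 1; $f^{(4)}(a)$ = sinh a = 0; $f^{(5)}(a)$ = cosh a = 1;

so we get $\sinh x = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \dots$

Note: $\sinh x = \frac{e^x - e^{-x}}{2}$ so the approximation can also be obtained from the Taylor polys for $e^x$ and $e^{-x}$.

(i) f(a) = cosh a = 1; f'(a) = sinh a = 0; f''(a) = cosh a = 1; f'''(a) = sinh a = 0; $f^{(4)}(a)$ = cosh a = 1; $f^{(5)}(a)$ = sinh a = 0

so we get $\cosh x = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \dots$

Note: Since $\cosh x = \frac{e^x + e^{-x}}{2}$ the approximation can also be obtained from the Taylor polys for $e^x$ and $e^{-x}$.

(j) f(a) = (1+a)/(1-a) = 1; f'(a) = $2/(1-a)^2$ = 2; f''(a) = $4/(1-a)^3$ = 4; f'''(a) = $12/(1-a)^4$ = 12; so:

$\frac{1+x}{1-x} = 1 + 2x + 4\,\frac{x^2}{2!} + 12\,\frac{x^3}{3!} + \dots = 1 + 2x + 2x^2 + 2x^3 + \dots$

Note that $\frac{1+x}{1-x} = \frac{2}{1-x} - 1 = 2(1 + x + x^2 + x^3 + \dots) - 1$ where $\frac{1}{1-x}$ is the sum of the geometric series $1 + x + x^2 + x^3 + \dots$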
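It is worth seeing how good these polynomial approximations are for a small x. The sketch below compares several of them with Python's `math` functions at x = 0.1; the choice of x and the dictionary layout are just for illustration.

```python
import math

x = 0.1  # a small value, where the polynomials about 0 should be accurate

# Each entry pairs the library value with the Taylor polynomial about x = 0.
approximations = {
    "sin":       (math.sin(x),      x - x**3 / 6 + x**5 / 120),
    "tan":       (math.tan(x),      x + x**3 / 3 + 2 * x**5 / 15),
    "arcsin":    (math.asin(x),     x + x**3 / 6),
    "sqrt(1+x)": (math.sqrt(1 + x), 1 + x / 2 - x**2 / 8 + x**3 / 16),
    "sinh":      (math.sinh(x),     x + x**3 / 6 + x**5 / 120),
    "cosh":      (math.cosh(x),     1 + x**2 / 2 + x**4 / 24),
    "ln(1+x)":   (math.log(1 + x),  x - x**2 / 2 + x**3 / 3 - x**4 / 4),
}

for name, (exact, poly) in approximations.items():
    print(f"{name:10s} exact={exact:.8f} poly={poly:.8f} error={abs(exact - poly):.2e}")
```

In each case the error is about the size of the first neglected term, which is the pattern exploited when estimating sums of series.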

10.

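(a) The Taylor series about x = 0 is: $x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \frac{x^5}{5} - \dots$. The Ratio Test gives $\lim_{n \to \infty} \frac{n}{n+1}\,|x| = |x|$

hence the series converges for -1 < x < 1 and diverges for |x| > 1. At x = 1 get $1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \dots$

which converges (the alternating series test: the terms _decrease to zero_ ). At x = -1 get $-1 - \frac{1}{2} - \frac{1}{3} - \dots$

the (-ve) harmonic series ... hence diverges. Hence, interval of convergence is -1 < x ≤ 1.

(b) Put x = t-1 in series for part (a) and get $\ln t = (t-1) - \frac{(t-1)^2}{2} + \frac{(t-1)^3}{3} - \frac{(t-1)^4}{4} + \dots$ hence the Taylor series for ln x,

about x = 1, is $(x-1) - \frac{(x-1)^2}{2} + \frac{(x-1)^3}{3} - \frac{(x-1)^4}{4} + \dots$ The Ratio Test gives $\lim_{n \to \infty} \frac{n}{n+1}\,|x-1| = |x-1|$ hence series converges for |x-1| < 1, or -1 < x-1 < 1, or 0 < x < 2 (and diverges for |x-1| > 1, that is, for x > 2 and x < 0).

For x = 0 get $-1 - \frac{1}{2} - \frac{1}{3} - \dots$ which diverges. For x = 2 get $1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \dots$ which converges.

Interval of convergence is 0 < x ≤ 2.

(c) The Taylor series for $e^x$ is $1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots$ Replace x with 2x and get the Taylor series for $e^{2x}$, namely

$1 + 2x + \frac{(2x)^2}{2!} + \frac{(2x)^3}{3!} + \dots$ The Ratio Test gives $\lim_{n \to \infty} \frac{|2x|}{n+1} = 0$ for all x, hence interval of convergence is -∞ < x < ∞. (i.e. series converges for all x.)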

(d) Taylor series for sin x about x = is : sin + cos- sin2/2! - cos3/3! + ...

The Ratio Test gives  = 0 for all x hence the interval of convergence is -∞ < x < ∞.

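(e) The Taylor series for $\frac{1}{1-x}$ is $1 + x + x^2 + x^3 + \dots$ (the sum of the geometric series is $\frac{1}{1-x}$, or you may use Taylor's formula). The Ratio Test gives $\lim_{n \to \infty} \left|\frac{x^{n+1}}{x^n}\right| = |x|$ hence the series converges for -1 < x < 1 which is the interval of convergence (clearly the series diverges for x = ± 1).

(f) The Taylor series for $\sinh x = \frac{e^x - e^{-x}}{2}$ is

$\frac{1}{2}\left[\left(1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots\right) - \left(1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \dots\right)\right]$

or $x + \frac{x^3}{3!} + \frac{x^5}{5!} + \dots$ and the Ratio Test gives $\lim_{n \to \infty} \frac{x^2}{(2n+2)(2n+3)} = 0$ for all x hence interval is -∞ < x < ∞.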

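11. Substituting $y = a_0 + a_1 x + a_2 x^2 + \dots$ into $\frac{dy}{dx} = y$ gives

$a_1 + 2a_2 x + 3a_3 x^2 + 4a_4 x^3 + \dots = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \dots$

hence $a_1 = a_0$ and $2a_2 = a_1$ so $a_2 = \frac{a_0}{2}$ and $3a_3 = a_2$ so $a_3 = \frac{a_2}{3} = \frac{a_0}{3!}$ and $4a_4 = a_3$ so $a_4 = \frac{a_0}{4!}$ etc. etc.

The power series $a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \dots$ then becomes: $a_0 + a_0 x + a_0 \frac{x^2}{2!} + a_0 \frac{x^3}{3!} + a_0 \frac{x^4}{4!} + \dots$

(namely $a_0 e^x$ ... which should be no surprise!)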

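12. Put x = 1 into $e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots$ and get $e = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \frac{1}{5!} + \frac{1}{6!} + \dots$ so error in using just five terms, namely $1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!}$, is $\frac{1}{5!} + \frac{1}{6!} + \frac{1}{7!} + \dots = \frac{1}{5!}\left(1 + \frac{1}{6} + \frac{1}{6 \cdot 7} + \dots\right)$

which is less than $\frac{1}{5!}\left(1 + \frac{1}{6} + \frac{1}{6^2} + \dots\right) = \frac{1}{120} \cdot \frac{1}{1 - \frac{1}{6}} = .01$ (where we recognize the geometric series!).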
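The bound is easy to confirm by direct computation. A small sketch, with variable names chosen here for illustration:

```python
import math

# Five-term approximation 1 + 1 + 1/2! + 1/3! + 1/4!
partial = sum(1 / math.factorial(n) for n in range(5))

error = math.e - partial
# Geometric-series bound: (1/5!) * 1/(1 - 1/6)
bound = (1 / math.factorial(5)) * (1 / (1 - 1 / 6))

print(partial, error, bound)
```

The actual error is just under the bound of .01, so the geometric-series estimate is tight here.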

13. $1 + x + x^2 + x^3 + \dots = \frac{1}{1-x}$ and differentiating each side gives

$1 + 2x + 3x^2 + 4x^3 + \dots = \frac{1}{(1-x)^2}$ and multiplying each side by x gives

$x + 2x^2 + 3x^3 + 4x^4 + \dots = \frac{x}{(1-x)^2}$ and we have summed this series!

Hence <E> = = = .

14. (a) Revolving the parabola $z = 1 - x^2$ about the z-axis generates the paraboloid $z = 1 - x^2 - y^2$ (replacing $x^2$ by $x^2 + y^2$).

(b) The cone $z = \sqrt{x^2 + y^2}$ is obtained by revolving $z = \sqrt{x^2} = |x|$ about the z-axis (replacing $x^2$ by $x^2 + y^2$).

(c) z = x is a line in the x-z plane and, moved parallel to the y-axis, becomes the plane z = x. (It can also be called a _cylinder_ ?!).

(d) The parabola $y = x^2$, in the x-y plane, when moved parallel to the z-axis, generates the (parabolic) cylinder $y = x^2$.

(e) The hyperbola $z^2 = 1 + x^2$ (or $z^2 - x^2 = 1$), when revolved about the z-axis, yields the hyperboloid (of 2 sheets) described by $z^2 = 1 + x^2 + y^2$ (replacing $x^2$ by $x^2 + y^2$).

(f) The hyperbola $z^2 = 1 + x^2$ (or $z^2 - x^2 = 1$), when revolved about the x-axis, yields the hyperboloid (of 1 sheet) described by $z^2 = 1 + x^2 - y^2$ (replacing $z^2$ by $z^2 + y^2$).

15. (a)   (b)

(c)   (d)

16. (a) = = = - at (-1,1) and = = =- at (-1,1)

hence equation of the tangent plane is: z = - - -

(b) = - -3/2 (2x) = at (-3,4) and = - -3/2 (2y) = - at (-3,4)

hence equation of the tangent plane is: z = + -

(c) = = 1 at (0,2) and = = 0 at (0,2)

hence equation of the tangent plane is: z = 0 +1(x - 0) +0(y - 2) or z = x

(d) = exy y = at (0,1) and = exy x = 0 at (0,1)

hence equation of the tangent plane is: z = _ln_ 2 + +0(y - 1) or z = _ln_ 2 + x

17. (a) = -2 e-2t sin x and = - e-2t sin x hence = k with k = 2

(b) To simplify the computation of and we use logarithmic differentiation:

Write _ln_ u = - - _ln_ t so = - so = u =

Also = so = -u x. Then = -= -= , so k = 1.

18. (a) + = - sin x sin 2y cos 3t - 4 sin x sin 2y cos 3t = - 5 sin x sin 2y cos 3t

and = - 9 sin x sin 2y cos 3t hence + = hence = so c = .

(b) + = 9 e3x + 4y + 5t \+ 16 e3x + 4y + 5t = 25 e3x + 4y + 5t

and = 25 e3x + 4y + 5t hence + = and c = 1.

19. (a) = + = 2x et \+ 2y cos t = 2(1)(1) \+ 2(0)(1) = 2 when t = 0

(b) = + = 2x (-sin t) + 2y (cos t) = 2(-1)(0) + 2(0)(-1) = 0 when t = π

(c) = + = (2u+v) + u (2z) = (2.0+1)(1) + (0)(2.1) = 1 when z = 1

(d) = + = (Arctan v)(cos x) + sec2x = ( ) () + 2 = +

(e) = + + = 2x cos t + 2y (-sin t) + 2z (1) = 2(1)(0) + 2(0)(-1) + 2 = π

20. = + = (-2x)(2) + (-4y)(2t) = (-2(7))(2) + (-4(4))(2(2)) = - 28 - 64 = -92 when t = 2

21. (a) gives 2x + 2y= 0 so (solving) = - and - = - = - as well.

(b) gives 2xy + x2\+ cos(xy) (x + y) - 1 = 0 so = - = -

(c) gives - 1 - = 0 so (solving for ) :

= - = - as well.

(d) gives ey \+ x ey \+ = 0 so = \- = - too.

22. (a) arctan = = = - and arctan = = = ... at (1,1)

so (i)  f = [- , ] at (1,1)

and (ii) the tangent plane to z = arctan is then z = Arctan + + = - .

(b) _ln_ (x2+y2) = 2x = 0 and _ln_ (x2+y2) = 2y = 2 ... at (0,1)

so (i)  f = [0,2] at (0,1)

and (ii) the tangent plane is z = _ln_ 2 + (0)(x-0) + 2(y-1) = _ln_ 2 +2y - 2.

(c) f(x,y) = sec x tan x = 0 and f(x,y) = sec y tan y = 0 ... at (0,0)

so (i)  f = [0,0] at (0,0)

and the tangent plane is z = sec 0 tan 0 + 0(x-0) + 0(y-0) = 0 (i.e. the tangent plane is the x-y plane).

(d) f(x,y) = 2 and f(x,y) = 3 ... at (1,1)

so (i)  f = [2,3] at (1,1)

and (ii) the tangent plane is z = 2(1)+3(1) + 2(x-1) \+ 3(y-1) = 2x + 3y

(i.e the tangent plane at any point on the plane z = 2x + 3y is this plane itself.)

23. $\nabla f = \left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right] = [\,4x^3 y^5,\; 5x^4 y^4\,] = [4, 5]$ at the point (1,1). The directional derivative (or rate of change) in the direction θ is then $4\cos\theta + 5\sin\theta$.

(a) for  = 0, get 4 (b) for π, get - 4 (c) for  = , get (d) for  = , get

24.  f = [ f1(a,b), f2(a,b) ] = [1,] so the directional derivative in the -direction is

f1(a,b) cos  \+ f1(a,b) sin  = cos  \+ sin  = [1,] • [cos ,sin ] ... the DOT product between

the vector  f = [1,] and the unit vector [cos ,sin ]. (Note: we use the notation f1 to indicate , etc.)

(a) This DOT product is 0 if [cos ,sin ] is perpendicular to  f = [1,].

[1,] is in the direction , so  must be in a direction + = π (+ any integer multiple of π).

(b) The DOT product is as small as possible if [cos θ, sin θ] is opposite to the direction of ∇f.

Then  must be in a direction + π = π (Note: integer multiples of 2π may be added to this angle.)

(c) The DOT product is as large as possible if [cos θ, sin θ] is in the direction of ∇f.

Then  must be in a direction

* An American scientist, W.F. Libby, won the 1960 Nobel prize for his discovery of carbon dating. Cosmic radiation converts nitrogen in the atmosphere to carbon 14 , plant and animal tissue absorb this radioactive carbon (maintaining a fairly constant ratio of carbon 14 to normal carbon 12), then the tissue dies and the carbon 14 begins its radioactive decay.
* This sequence was first studied by the Italian mathematician, Leonardo of Pisa (also known as Fibonacci ... one of the most brilliant pre-Renaissance mathematicians) around 1200 A.D.
