This site is devoted to mathematics and its applications. Created and run by Peter Saveliev.

# Several variables

### From Mathematics Is A Science


## 1 A ball is thrown...

Let's review what we learned in Chapter 7 about motion of a ball (or a cannonball).

When a ball is thrown in the air at an angle, it moves in both the vertical and the horizontal directions, simultaneously and independently. The dynamics of the two is very different. In the *horizontal* direction, as there is no force changing the velocity, the latter remains constant:

Meanwhile, the *vertical* velocity is constantly changed by the gravity. The dependence of the height on the time is quadratic:

We have *three* variables:

- $t$ is time;
- $x$ is the horizontal dimension, the depth, that depends on time;
- $y$ is the vertical dimension, the height, that also depends on time.

The *path* of the ball will appear to an observer -- from the right angle -- as a curve. It is placed in the $xy$-plane positioned vertically:

First, the *numerical approach*. The solution uses neither calculus (limits) nor algebra (solving equations).

We used these difference quotient formulas to find the velocity and then the acceleration from the location: $$\begin{array}{l|l|l} &\text{ horizontal }&\text{ vertical }\\ \hline \text{ position }&x_n &y_n\\ \text{ velocity }&v_n=\frac{x_{n+1}-x_n}{h} &u_n=\frac{y_{n+1}-y_n}{h}\\ \text{ acceleration }&a_n=\frac{v_{n+1}-v_n}{h} &b_n=\frac{u_{n+1}-u_n}{h}\\ \end{array}$$ where $h$ is the increment of time.

These formulas can now be solved in order to be able to model location as a function of time via these *recursive* formulas:
$$\begin{array}{l|l|l}
&\text{ horizontal }&\text{ vertical }\\
\hline
\text{ acceleration }&a_n &b_n\\
\text{ velocity }&v_{n+1}=v_n+ha_n&u_{n+1}=u_n+hb_n\\
\text{ position }&x_{n+1}=x_n+hv_n&y_{n+1}=y_n+hu_n\\
\end{array}$$
These steps amount to *discrete integration*.

*Problem:* From a $200$ feet elevation, a cannon is fired horizontally at $200$ feet per second. How far will the cannonball go?

The physics is as follows:

- horizontal: there is no force, hence $x''=0$;
- vertical: the force is constant and $y''=-g$,

where $g$ is the gravitational constant: $$g=32 \text{ ft/sec}^2.$$

Next, we acquire the initial conditions:

- the initial location is: $x_0=0$ and $y_0=200$;
- the initial velocity is: $v_0=200$ and $u_0=0$.

We use the formulas to evaluate the location every $.1$ second. This is what the path looks like:

**Example (how far).** To find when and where the ball hits the ground, we scroll down to find the row with $y$ close to $0$.

It happens sometime between $t=3.5$ and $t=3.6$ seconds, say $t_1=3.55$ seconds. Second, the value of $x$ at that time is between $x=700$ and $x=720$ feet, say, $x_1=710$ feet. We also plot the graphs of $x$ and $y$ as functions of $t$ on the right.
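The recursive formulas can also be traced in a few lines of code. The following is a minimal Python sketch, not the spreadsheet used in the text; the function name `simulate` and its parameters are hypothetical, while the data and the time step $h=.1$ are those of the problem above.

```python
def simulate(x0, y0, v0, u0, a, b, h, steps):
    """Discrete integration with the recursive formulas:
    x_{n+1} = x_n + h*v_n and v_{n+1} = v_n + h*a (same for y, u, b)."""
    x, y, v, u = x0, y0, v0, u0
    rows = [(0.0, x, y)]
    for n in range(1, steps + 1):
        # The positions are updated with the current velocities...
        x, y = x + h * v, y + h * u
        # ...and only then the velocities with the (constant) accelerations.
        v, u = v + h * a, u + h * b
        rows.append((n * h, x, y))
    return rows

# Cannon problem: start at (0, 200), velocity (200, 0), g = 32 ft/sec^2.
rows = simulate(0.0, 200.0, 200.0, 0.0, 0.0, -32.0, 0.1, 40)

# Index of the first row below the ground; the landing is bracketed
# by this row and the previous one.
k = next(i for i, (t, x, y) in enumerate(rows) if y < 0)
before, after = rows[k - 1], rows[k]
print(before, after)
```

With these inputs, the sign change of $y$ occurs between $t=3.5$ and $t=3.6$, in agreement with the estimate above.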

The spreadsheet is constructed for $x$ and $y$ separately, as follows. The time is in the first column progressing from $0$ every $.05$. The second derivative is in the next, $0$ and $-32$ respectively. In the next column, the initial velocity is entered in the top cell, $200$ and $0$ respectively. Below, the velocity is computed as a Riemann sum function of the previous column, with the same formula: $$\texttt{=R[-1]C+(RC[-2]-R[-1]C[-2])*R[-1]C[-1]}$$ In the next column, the initial location is entered in the top cell, $0$ and $200$ respectively. Below, the location is computed as a Riemann sum function of the previous column, with the same formula: $$\texttt{=R[-1]C+(RC[-3]-R[-1]C[-3])*RC[-1]}$$

The results are shown below:

To find the solution to the problem from this data, we find the interval during which the cannonball hit the ground, i.e., $y=0$. We go down the $y$ column until we find the value closest to $0$; it is $y=1.2$. We then find the corresponding value of $x$; it is $x=700$. $\square$
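The two spreadsheet formulas can be mimicked outside of a spreadsheet. Below is a hedged Python sketch under the assumption that the four columns are time, acceleration, velocity, and location, in this order; the helper `spreadsheet_column` is a hypothetical name. Note that, as in the formulas above, the location is computed from the velocity in the *same* row.

```python
def spreadsheet_column(times, accel, v0, p0):
    """Mimic the two spreadsheet formulas for one coordinate.
    Velocity: previous velocity + dt * previous acceleration.
    Location: previous location + dt * current velocity."""
    vel, loc = [v0], [p0]
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        vel.append(vel[i - 1] + dt * accel[i - 1])
        loc.append(loc[i - 1] + dt * vel[i])
    return vel, loc

h = 0.05
times = [n * h for n in range(81)]
_, xs = spreadsheet_column(times, [0.0] * len(times), 200.0, 0.0)
_, ys = spreadsheet_column(times, [-32.0] * len(times), 0.0, 200.0)

# The row whose height is the closest to 0.
i = min(range(len(times)), key=lambda n: abs(ys[n]))
print(round(times[i], 2), round(xs[i], 1), round(ys[i], 1))
```

Under these assumptions, the selected row is the one with $t=3.5$, $x=700$, and $y=1.2$, the same as in the text.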

**Exercise.** Under the same conditions, solve numerically the problem of hitting a target $500$ feet away.

We start with the *continuous case* now:

- horizontal: $x''=0$;
- vertical: $y''=-g$.

We start at the same place as above: $$\begin{cases} x''&=0,&x'(0)=200, &x(0)=0;\\ y''&=-g,&y'(0)=0, &y(0)=200. \end{cases}$$

Since the velocity is an antiderivative of the acceleration, we integrate these. Then for horizontal, we have: $$x'=\int 0\, dt=C_x,$$ where $C_x$ is any constant. Next, for the vertical, $$y'=\int -g\, dt=-gt+C_y,$$ where $C_y$ is any constant.

Since the location is an antiderivative of the velocity, we integrate these. Then for horizontal, we have: $$x=\int x'\, dt=\int C_x\, dt=C_xt+K_x,$$ where $K_x$ is any constant. Next, for the vertical, $$y=\int y'\, dt=\int (-gt+C_y)\, dt=-\tfrac{1}{2}gt^2+C_yt+K_y,$$ where $K_y$ is any constant.

Thus, the *general solution* of this system of differential equations is:
$$\begin{cases}
x&=&&&C_xt&+&K_x,\\
y&=&-\tfrac{1}{2}gt^2&+&C_yt&+&K_y.
\end{cases}$$
Any possible dynamics is found by specifying the values of the four constants:
$$C_x,\ C_y,\ K_x,\ K_y.$$

The physics of the situation allows us to assign meanings to these four constants. First, $$\begin{cases} x'&=&&&C_x& \Longrightarrow & x'(0)=C_x,\\ y'&=&-gt&+&C_y& \Longrightarrow & y'(0)=C_y. \end{cases}$$ Therefore,

- $C_x$ is the (constant) horizontal component of velocity;
- $C_y$ is the initial vertical component of velocity.

Next, $$\begin{cases} x&=&&&C_xt&+&K_x& \Longrightarrow & x(0)=K_x,\\ y&=&-\tfrac{1}{2}gt^2&+&C_yt&+&K_y& \Longrightarrow & y(0)=K_y. \end{cases}$$ Therefore,

- $K_x$ is the initial horizontal location (depth);
- $K_y$ is the initial vertical location (height).

Thus, we have: $$\begin{cases} \text{depth }&=&\text{ initial depth }&+&\text{ initial horizontal velocity }&\cdot\text{ time },\\ \text{height}&=&\text{ initial height }&+&\text{ initial vertical velocity }&\cdot\text{ time }&-\tfrac{1}{2}g\cdot\text{ time }^2. \end{cases}$$

We used these two equations to solve a variety of problems about motion.

**Example (how far).** From a $200$ feet elevation, a cannon is fired horizontally at $200$ feet per second. How far will the cannonball go?

The initial conditions:

- the initial location is: $0$ and $200$;
- the initial velocity is: $200$ and $0$.

Then our equations become: $$\begin{cases} x&=&200t,\\ y&=200&&-16t^2. \end{cases}$$

Previously we solved the problem *algebraically* as follows. The height at the end of the flight is $y_1=0$, so to find the time, we set $y=200-16t^2=0$ and solve for $t$:
$$t_1=\sqrt{\frac{200}{16}}\approx 3.54.$$
We substitute this value of $t$ into $x$ to find the corresponding depth:
$$x_1=200t_1=200\frac{5\sqrt{2}}{2}\approx 707.$$
$\square$
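The algebraic solution is easy to double-check with a few lines of code. This is only a sketch in Python; the variable names are chosen here for illustration.

```python
import math

g = 32.0          # gravitational constant, ft/sec^2
height = 200.0    # initial height, ft
speed = 200.0     # initial horizontal velocity, ft/sec

# Solve height - (g/2) t^2 = 0 for the positive root,
# then substitute it into x = speed * t.
t1 = math.sqrt(2 * height / g)
x1 = speed * t1
print(round(t1, 2), round(x1, 1))
```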

What about the *velocity* as a function of time? Writing $v_x=C_x$ and $v_y=C_y$ for the initial horizontal and vertical velocities, we have:
$$\begin{cases}
\frac{dx}{dt}&=v_x,\\
\frac{dy}{dt}&=v_y&-gt.
\end{cases}$$
Adding these two equations to the former two allows us to solve more profound problems.

**Example (impact).** In the setting of the last example, how *hard* does the ball hit the ground?

First, we examine the spreadsheet. Instead of the formulas, we compute the average velocities (i.e., the difference quotients) to approximate the velocities. The formula for $x'$ is: $$\texttt{=(RC[-2]-R[-1]C[-2])/(RC[-3]-R[-1]C[-3]),}$$ and the formula for $y'$ is: $$\texttt{=(RC[-2]-R[-1]C[-2])/(RC[-4]-R[-1]C[-4]).}$$ The denominators refer to the column that contains the time and the numerators refer to the columns that contain $x$ and $y$ respectively.

Looking at the same row as before, we see that the vertical velocity at the moment of impact is between $-110.4$ and $-113.6$ feet per second.

Now, the algebra. The formulas for the velocities take this form: $$\begin{cases} \frac{dx}{dt}&=200,\\ \frac{dy}{dt}&=&-32t. \end{cases}$$ Let's find the velocity at the time of contact. We substitute the time we've found, $$t_1=\frac{5\sqrt{2}}{2},$$ into the formulas for velocity: $$\begin{cases} \frac{dx}{dt}\Big|_{t=t_1}&=200,\\ \frac{dy}{dt}\Big|_{t=t_1}&=&-32t_1=-32\frac{5\sqrt{2}}{2}\approx -113. \end{cases}$$ The answer matches our estimate.

But which one of the two numbers represents how *fast* the ball hits the ground? It is the latter if the ball hits the (horizontal) surface and it is the former if this is a wall. Then, the general answer should be a combination of the two. This is how they should be combined via the *Pythagorean Theorem*:

Then, the impact is determined by this number: $$\sqrt{200^2+(-113)^2}\approx 230.$$ $\square$
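The same computation can be sketched in Python. The variable names below are chosen for illustration only; the sketch evaluates the two velocity formulas at the exact time of impact and combines them via the Pythagorean Theorem.

```python
import math

g = 32.0                          # gravitational constant, ft/sec^2
t1 = math.sqrt(2 * 200.0 / g)     # time of impact, from 200 - 16 t^2 = 0

vx = 200.0                        # the horizontal velocity stays constant
vy = -g * t1                      # the vertical velocity at the time of impact
speed = math.hypot(vx, vy)        # the Pythagorean Theorem
print(round(vx, 1), round(vy, 1), round(speed, 1))
```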

## 2 Introduction to parametric curves

**Example (ball).** This is how we understand the trajectory of a thrown ball as a *parametric curve*. There are two observers:

- one is behind the throw and can see only the rise and fall of the ball and
- the other is on the ground under the path and can only see the forward progress of the ball.

If the two make records of where the ball was at what time, they can combine that information to plot points on the $xy$-plane. These points will form the ball's trajectory. $\square$

**Definition.** A *parametric curve* is a combination of two functions of the same variable:
$$\begin{cases}
x=f(t),\\
y=g(t).
\end{cases}$$

**Example (plotter).** A curve may be plotted on a piece of paper by a computer. A pen is attached to a runner on a vertical bar, while that bar slides along a horizontal rail at the bottom edge of the paper:

The computer commands the next location of both: for each moment of time $t$, the horizontal location of the bar (and the pen) is given by $x=f(t)$ and the vertical location of the pen is given by $y=g(t)$. $\square$

However, this view of parametric curves is most useful within the framework of *multidimensional spaces and vectors*. This theory is developed starting in this chapter. Of course, the motion metaphor -- $x$ and $y$ are coordinates in the space -- will be superseded. In contrast to this approach, we look at the two quantities and two functions that might have *nothing* to do with each other (except for $t$ of course). As a result, we don't have to worry about the coordinate plane having the same units for the axes in order to keep the curve proportional. Initially, there is no curve!

**Example (prices).** Suppose

- $t$ is time,
- $x$ is the price of wheat (say, in dollars per bushel), and
- $y$ is the price of sugar (say, in dollars per ton).

We simply have two functions we -- initially -- look at separately.

First, let's imagine that the price of wheat is decreasing: $$x\ \searrow\ .$$ To make this specific, we can choose: $$x=f(t)=\frac{1}{t+1}.$$ We then can study this function within the confines of calculus of single variable. To see some actual data, we evaluate $x$ for several values of $t$: $$\begin{array}{l|ll} t&x\\ \hline 0&1.00\\ 1&.50\\ 2&.33 \end{array}$$ With more points acquired in a spreadsheet we can plot the graph on the $tx$-plane:

At this point, we could, as we have in the past, proceed to use calculus to study the derivatives, the slopes, the extreme points, etc. of this function...

Second, suppose that the price of sugar is increasing and then decreasing: $$y\ \nearrow\ \searrow\ .$$ To make this specific, we can choose (an upside-down parabola): $$y=g(t)=-(t-1)^2+2.$$ We then again evaluate $y$ for several values of $t$: $$\begin{array}{l|ll} t&y\\ \hline 0&1.00\\ 1&2.00\\ 2&1.00 \end{array}$$ With more points acquired in a spreadsheet we plot the graph on the $ty$-plane:

We are interested in finding hidden relations between these two commodities... Before we develop calculus of parametric curves -- to study the slopes of the curves, the tangents, the turning points, etc. -- we would like to simply *visualize* them. How do we combine the two plots?

As the two plots are made of (initially) disconnected points -- $(t,x)$ and $(t,y)$ -- so is the new plot. What are those points? The points are $(x,y)$s with $x$ and $y$ appropriately paired up. A value of $x$ is paired up with a value of $y$ when they appear along the same $t$ in both plots: $$\begin{array}{l|l|l} t&x&y\\ \hline 0&1.00&1.00\\ 1&.50&2.00\\ 2&.33&1.00 \end{array}$$ This is what happens to each pair:

As a result, since the independent variable is the same for both functions, only the dependent variables appear. Instead of plotting all points $(t,x,y)$, which belong to the $3$-dimensional space, we just plot $(x,y)$ on the $xy$-plane -- for each $t$.

The direction matters! Since $t$ is missing, we have to make sure we know in which direction we are moving and indicate that with an arrow. Ideally, we also label the points in order to indicate not only “where” but also “when”.

Thus, this is motion, just as before, but through what space? An abstract *space of prices* that we've made up. The space is comprised of all possible combinations of prices, i.e., a point $(x,y)$ stands for a combination of two prices: $x$ for wheat and $y$ for sugar.

How much information about the dynamics of the two prices contained in the original functions can we recover from the new graph? A lot. We can shrink the graph vertically to de-emphasize the change of $y$ and to reveal the *qualitative* behavior of $x$, and vice versa:

We see the decrease of $x$ and then the increase followed by the decrease of $y$. In addition, the density of the points indicates the speed of the motion. $\square$
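The pairing of the two prices can be sketched in Python as follows. The two functions are the ones chosen in the example; the list name `path` is an assumption made for this illustration.

```python
def f(t):
    return 1 / (t + 1)          # the price of wheat

def g(t):
    return -(t - 1) ** 2 + 2    # the price of sugar

ts = [0, 1, 2]

# For each t, the values of x and y are paired up into a point (x, y);
# t itself is not recorded in the point.
path = [(f(t), g(t)) for t in ts]

for t, (x, y) in zip(ts, path):
    print(t, round(x, 2), round(y, 2))
```

Keeping `ts` next to `path` is what allows us to label the points with “when” in addition to “where”.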

Thus the monotonicity of $x$ and the monotonicity of $y$ determine the direction of the parametric curve. We summarize this observation below: $$\begin{array}{l|lll} y\backslash x & \nearrow & \searrow\\ \hline \nearrow & \nearrow & \nwarrow\\ \searrow & \searrow & \swarrow\\ \end{array}$$

**Example (abstract).** We can do this in a fully abstract setting. When two functions, $f,g$, are represented by their respective lists of values (instead of formulas), they are easily combined into a parametric curve, $F$. We just need to eliminate the repeated column of inputs. Suppose we need to combine these two functions:
$$\begin{array}{c|cc}
t&x=f(t)\\
\hline
0&1\\
1&2\\
2&3\\
3&0\\
4&1
\end{array}\quad \& \quad
\begin{array}{c|cc}
t&y=g(t)\\
\hline
0&5\\
1&-1\\
2&2\\
3&3\\
4&0
\end{array}\quad=\quad?$$
We repeat the inputs column -- only once -- and then repeat the outputs of either function. First row:
$$f:0\mapsto 1\quad \&\quad g:0\mapsto 5\quad \Longrightarrow\ F:0\mapsto (1,5).$$
Second row:
$$f:1\mapsto 2\quad \&\quad g:1\mapsto -1\quad \Longrightarrow\ F:1\mapsto (2,-1).$$
And so on. This is the whole solution:
$$
\begin{array}{c|cc}
t&x=f(t)\\
\hline
0&1\\
1&2\\
2&3\\
3&0\\
4&1
\end{array}\quad \& \quad
\begin{array}{c|cc}
t&y=g(t)\\
\hline
0&5\\
1&-1\\
2&2\\
3&3\\
4&0
\end{array}\quad \Longrightarrow\quad
\begin{array}{c|rlcr}
t&P=&(f(t)&,&g(t))\\
\hline
0&&(1&,&5)\\
1&&(2&,&-1)\\
2&&(3&,&2)\\
3&&(0&,&3)\\
4&&(1&,&0)
\end{array}$$
As you can see, there are no algebraic operations carried out and there is no new data, just the old data arranged in a new way. However, it is becoming clear that the list is also a function of some kind... $\square$
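When the functions are given by lists of values, the construction amounts to a lookup. Below is a minimal Python sketch with the data of this example; representing the two functions as dictionaries is a choice made here for illustration.

```python
# The two functions as tables of values: input t -> output.
f = {0: 1, 1: 2, 2: 3, 3: 0, 4: 1}
g = {0: 5, 1: -1, 2: 2, 3: 3, 4: 0}

# The inputs are listed once; the two outputs are paired up.
F = {t: (f[t], g[t]) for t in f if t in g}

for t, P in F.items():
    print(t, P)
```

No arithmetic is carried out here; the old data is just arranged in a new way.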

**Example (spreadsheet).** This is a summary of how the parametric curve is formed from two functions provided with a spreadsheet. The three columns -- $t$, $x$, and $y$ -- are copied and then the last two are used to create a chart:

This chart is the *path* -- not the graph -- of the parametric curve. Note also that the curve isn't the graph of any function of one variable as the *Vertical Line Test* is violated. $\square$

**Example (pattern).** Plotting a parametric curve may reveal a relation between two quantities:

$\square$

*Parametric curves are functions*!

This idea comes with certain obligations (Chapter 2). First, we have to *name* it, say $F$. Second, as we combine two functions, we use the following **notation** for this operation:
$$F=(f,g):\ \begin{cases}
x=f(t),\\
y=g(t).
\end{cases}$$

Next, what is the *independent variable*? It is $t$. After all, this is the input of both of the functions involved. What is the *dependent variable*? It is the “combination” of the outputs of the two functions, i.e., $x$ and $y$. We know how to combine these; we form a pair, $P=(x,y)$. This $P$ is a point on the $xy$-plane!

To summarize, we do what we have done many times before (addition, multiplication, etc.) -- we create a new function from two old functions. We represent a function $f$ diagrammatically as a *black box* that processes the input and produces the output:
$$\newcommand{\ra}[1]{\!\!\!\!\!\xrightarrow{\quad#1\quad}\!\!\!\!\!}
\newcommand{\da}[1]{\left\downarrow{\scriptstyle#1}\vphantom{\displaystyle\int_0^1}\right.}
%
\begin{array}{ccc}
\text{input} & & \text{function} & & \text{output} \\
t & \mapsto & \begin{array}{|c|}\hline\quad f \quad \\ \hline\end{array} & \mapsto & x
\end{array}$$
Now, what if we have another function $g$:
$$\newcommand{\ra}[1]{\!\!\!\!\!\xrightarrow{\quad#1\quad}\!\!\!\!\!}
\newcommand{\da}[1]{\left\downarrow{\scriptstyle#1}\vphantom{\displaystyle\int_0^1}\right.}
%
\begin{array}{ccc}
\text{input} & & \text{function} & & \text{output} \\
t & \mapsto & \begin{array}{|c|}\hline\quad g \quad \\ \hline\end{array} & \mapsto & y
\end{array}$$
How do we represent $F=(f,g)$? To represent it as a single function, we need to “wire” their diagrams together side by side:
$$\newcommand{\ra}[1]{\!\!\!\!\!\xrightarrow{\quad#1\quad}\!\!\!\!\!}
\newcommand{\da}[1]{\left\downarrow{\scriptstyle#1}\vphantom{\displaystyle\int_0^1}\right.}
%
\begin{array}{ccc}
t & \mapsto & \begin{array}{|c|}\hline\quad f \quad \\ \hline\end{array} & \mapsto & x& \\
||&&&&\updownarrow\\
t & \mapsto & \begin{array}{|c|}\hline\quad g \quad \\ \hline\end{array}
& \mapsto & y
\end{array}$$
It is possible because the input of $f$ is the same as the input of $g$. For the outputs, we can combine them even when they are of different nature. Then we have a diagram of a new function:
$$\begin{array}{ccc}
(f,g):& t & \mapsto & \begin{array}{|c|}\hline
&t&
\begin{array}{lllll}
\nearrow &t &\mapsto &\begin{array}{|c|}\hline\quad f \quad \\ \hline\end{array} & \mapsto & x & \searrow\\
\\
\searrow &t &\mapsto &\begin{array}{|c|}\hline\quad g \quad \\ \hline\end{array} &
\mapsto & y & \nearrow\\
\end{array}& (x,y)\\ \hline\end{array}
& \mapsto & P
\end{array}$$
We see how the input variable $t$ is copied into the two functions, processed by them *in parallel*, and finally the two outputs are combined together to produce a single output. The result can again be seen as a black box:
$$\begin{array}{ccc}
& t & \mapsto & \begin{array}{|c|}\hline \quad F \quad \\ \hline\end{array}
& \mapsto & P
\end{array}$$
The difference from all the functions we have seen until now is the nature of the output.

Next, what is the *domain* of $F=(f,g)$? It is supposed to be a recording of all possible inputs, i.e., all the $t$'s for which the output $P=(f(t),g(t))$ of the function makes sense. For this point to make sense, both of its coordinates have to make sense. Then, we can choose the domain of $F$ to be the intersection of the domains of $f$ and $g$.

**Example (domain).** What is the domain of this parametric curve below?
$$F=(f,g):\ \begin{cases}
x=\sqrt{t},\\
y=\frac{1}{t-1}.
\end{cases}$$
The domain of $f$ is $t\ge 0$ and the domain of $g$ is $t\ne 1$. Therefore, the domain of $F$ is
$$[0,+\infty) \cap \bigg( (-\infty,1)\cup (1,+\infty)\bigg) =[0,1)\cup (1,+\infty).$$
$\square$
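The idea that the point makes sense only when both coordinates do can be sketched in Python. Returning `None` for the inputs outside the domain is a convention assumed for this illustration.

```python
import math

def F(t):
    """Return the point (sqrt(t), 1/(t-1)), or None when t is outside the domain."""
    # Both coordinates have to make sense: t >= 0 for f and t != 1 for g.
    if t < 0 or t == 1:
        return None
    return (math.sqrt(t), 1 / (t - 1))

for t in [-1, 0, 1, 4]:
    print(t, F(t))
```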

What about the *range* of $F=(f,g)$? It is supposed to be a recording of all possible outputs of $F$. The terminology used is different though.

**Definition.** The *path* of a parametric curve $x=f(t),\ y=g(t)$ is the set of all such points $P=(f(t),g(t))$ on the $xy$-plane. It is also called the *image* of $F$.

The path is typically a curve. We plot several of them below.

**Example (path).** In general, the two processes, $x=x(t)$ and $y=y(t)$, are independent. When we combine them to see the path of the object by plotting $(x,y)$ for each $t$, the result may be unpredictable:

$\square$

What about the *graph* of $F=(f,g)$? It is supposed to be a recording of all possible combinations of inputs and outputs of $F$.

**Definition.** The *graph* of a parametric curve $x=f(t),\ y=g(t)$ is the set of all such points $(t,x,y)=(t,f(t),g(t))$ in the $txy$-space.

The graph is built from:

- the graph of $x=f(t)$ on the $tx$-plane (the floor), and
- the graph of $y=g(t)$ on the $ty$-plane (the wall facing us).

It is a curve in space, akin to a piece of wire:

Then the shadow of this wire on the floor is the graph $x=f(t)$ (light from above). If the light is behind us, the shadow on the wall in front is the graph $y=g(t)$. In addition, pointing a flashlight from right to left will produce the path of the parametric curve on the $xy$-plane.

For parametric curves, we *plot paths instead of graphs*. The simple reason is that we can't plot in 3D by hand. The drawback is a loss of information: the plot of the path tells us *where* but not *when* (without the labeling).

So, we moved from a pair of quantities represented by functions to form a parametric curve and then to its path as a way to visualize the relation between two variables. In reverse, a curve -- such as a road -- that needs to be studied will benefit from being represented as the path of a parametric curve.

**Definition.** We say that we *parametrize a curve* $C$ if we find a parametric curve $x=f(t),\ y=g(t)$ the path of which is this curve $C$.

The idea is to think of $C$ as if it is a road being driven over by someone who is recording the coordinates of his locations over time. Thus, we get $x$ and $y$ from our GPS and $t$ from our clock. Of course, this can be done in a number of ways: same road, different drivers (speed, direction, etc.). Therefore, there may be different parametric curves with the same path...

**Example (circle).** Let's parametrize the circle! All we need to know comes from *trigonometry*: the $x$-coordinate of the point on the unit circle at angle $t$ is $\cos t$ and its $y$-coordinate is $\sin t$.

So, the parametric curve of the unit circle centered at the origin is given by: $$x=\cos t,\ y=\sin t.$$ No surprise; after all, this was the definition of $\sin$ and $\cos$ (Chapter 4)!

This is the circle of radius $R$:
$$x=R\cos t,\ y=R\sin t.$$
Of course, we recognize the *trig substitution* we utilized in Chapter 12 to compute the area and the length of the circle; this wasn't a coincidence!

We can also move in the backward direction: $$x=\cos (-t),\ y=\sin (-t),$$ or go twice as fast: $$x=\cos 2t,\ y=\sin 2t,$$ and so on. $\square$
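That these different parametric curves share the same path can be checked numerically. The following Python sketch uses hypothetical function names (`circle`, `backward`, `double_speed`) and tests the equation $x^2+y^2=1$ at a few sample values of $t$.

```python
import math

def circle(t, R=1.0):
    """The circle of radius R traversed counterclockwise."""
    return (R * math.cos(t), R * math.sin(t))

def backward(t):
    return circle(-t)       # the same circle, the opposite direction

def double_speed(t):
    return circle(2 * t)    # the same circle, twice as fast

ts = [k * math.pi / 6 for k in range(12)]

# Every sampled point of every parametrization lies on the unit circle.
on_circle = all(
    abs(x * x + y * y - 1) < 1e-12
    for param in (circle, backward, double_speed)
    for x, y in map(param, ts)
)
print(on_circle)
```

The check only confirms that the sampled points lie on the circle; the direction and the speed are not visible in the path.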

This is the summary of the terminology: $$\begin{array}{l|ll} \text{types of functions:}&\text{general functions}&\text{numerical functions}&\text{parametric curves}&\text{motion}\\ \hline \text{the set of all outputs:}&\text{image}&\text{range}&\text{path}&\text{trajectory}\\ \end{array}$$

## 3 Introduction to functions of several variables

We saw some functions of several variables in Chapters 10 and 14. Let's review the main ideas.

**Example.** We will take the example from the last section and modify it. We will remove time $t$ from consideration, retain the two variables representing these two *commodities*:

- $x$ is the price of wheat,
- $y$ is the price of sugar, and

add a *product*:

- $z$ is the price of a loaf of bread.

What is the relation between these three? As those two are the two major ingredients in bread, we will assume that

- $z$ depends on $x$ and $y$.

One can imagine a baker who every morning, upon receiving the updated prices of wheat and sugar, uses a *table* that he made up in advance to decide on the price of his bread for the rest of the day. Let's see how such a table might come about.

What kind of dependencies are these? Increasing prices of the ingredients increases the cost and ultimately the price of the product: $$x\nearrow\ \Longrightarrow\ z\nearrow;\quad y\nearrow\ \Longrightarrow\ z\nearrow.$$ At its simplest, such an increase is linear. In addition to some fixed costs,

- each increase of $x$ leads to a proportional increase of $z$, and
- each increase of $y$ leads to a proportional increase of $z$,

independently! A simple formula that captures this dependence may be this: $$z=p(x,y)=2x+y+1.$$ In order to visualize this function, we compute a few of its values:

- $p(0,0)=1$,
- $p(0,1)=2$,
- $p(0,2)=3$,
- $p(1,0)=3$,
- $p(1,1)=4$,
- etc.

Even though this is a list, we realize that the input variables don't fit into a list comfortably... they form a *table*!
$$\begin{array}{cccc}
(0,0)&(0,1)&(0,2)&...\\
(1,0)&(1,1)&(1,2)&...\\
(2,0)&(2,1)&(2,2)&...\\
...&...&...&...
\end{array}$$
In fact, we can align these pairs with $x$ in each column and $y$ in each row:
$$\begin{array}{l|cccc}
y\backslash x&0&1&2&...\\
\hline
0&(0,0)&(1,0)&(2,0)&...\\
1&(0,1)&(1,1)&(2,1)&...\\
2&(0,2)&(1,2)&(2,2)&...\\
...&...&...&...&...
\end{array}$$
Now, the values, $z=p(x,y)$:
$$\begin{array}{l|cccc}
y\backslash x&0&1&2&...\\
\hline
0&1&3&5&...\\
1&2&4&6&...\\
2&3&5&7&...\\
...&...&...&...&...
\end{array}$$
That's what the baker's table might look like...

We visualize the values by building columns with appropriate heights:

Notice that by fixing one of the variables -- $x=0,1,2$ or $y=0,1,2$ -- we create a function of *one* variable with respect to the other variable. We fix $x$ below and extract the columns from the table:
$$\begin{array}{l|cccc}
y\ (x=0)&z&&\\
\hline
0&1\\
1&2\\
2&3\\
\end{array}\quad
\begin{array}{l|cccc}
y\ (x=1)&z&&\\
\hline
0&3\\
1&4\\
2&5\\
\end{array}\quad
\begin{array}{l|cccc}
y\ (x=2)&z&&\\
\hline
0&5\\
1&6\\
2&7\\
\end{array}$$
A *pattern* is clear: growth by $1$. We now fix $y$ below and extract the rows from the table:
$$\begin{array}{r|cccc}
(y=0)\ x&0&1&2\\
\hline
z&1&3&5
\end{array}\quad
\begin{array}{r|cccc}
(y=1)\ x&0&1&2\\
\hline
z&2&4&6
\end{array}\quad
\begin{array}{r|cccc}
(y=2)\ x&0&1&2\\
\hline
z&3&5&7
\end{array}\quad
$$
A *pattern* is clear: growth by $2$. We have a total of six (linear) functions!

Let's do the same with a spreadsheet. This is the data:

The value in each cell is computed from the corresponding value of $x$ (all the way up) and from the corresponding value of $y$ (all the way left). This is the formula: $$\texttt{=2*R3C+RC2+1}.$$ The simplest way to visualize is by coloring the cell depending on the values (common in cartography: elevation, temperature, humidity, precipitation, population density, etc.):

The growth is visible: it grows the most in some diagonal direction but it's not $45$ degrees...

We can also visualize with a bar chart, just as above:

The most common way, however, to visualize a function of two variables in mathematics is with its *graph*, a surface:

In this particular case, this is a *plane*. The second graph is the same surface but displayed as a wire-frame (or even a wire-fence). The wires are the graphs of those linear functions of one variable created from our function when we fix one variable at a time. Each of these wires comes from choosing either:

- the row of $x$s (top) and one other row in the table, or
- the column of $y$s (leftmost) and one other column in the table.

$\square$
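The baker's table and its slices can be reproduced with a short Python sketch. The names `table`, `fixed_x`, and `fixed_y` are chosen here for illustration; the function is the one from the example.

```python
def p(x, y):
    return 2 * x + y + 1    # the price of bread from the example

xs = [0, 1, 2]
ys = [0, 1, 2]

# The table of values: one row for each y, one column for each x.
table = [[p(x, y) for x in xs] for y in ys]

# Fixing one variable at a time produces functions of one variable.
fixed_x = {x: [p(x, y) for y in ys] for x in xs}   # the columns of the table
fixed_y = {y: [p(x, y) for x in xs] for y in ys}   # the rows of the table

for row in table:
    print(row)
```

Each list in `fixed_x` grows by $1$ and each list in `fixed_y` grows by $2$, matching the coefficients of $y$ and $x$ in the formula.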

**Exercise.** Provide a similar analysis for (a) the wind-chill and (b) the heat index.

**Example.** Below we plot $p(x,y)=\sin(xy)$.

$\square$

The functions of one variable created from our function $z=p(x,y)$ when we fix one variable at a time are: $$y=b\ \longrightarrow f_b(x)=p(x,b);\quad x=a\ \longrightarrow g_a(y)=p(a,y).$$ There are infinitely many of them. Their graphs are the slices -- along the axes -- of the surface that is the graph of $p$.

Therefore, the monotonicity of these functions tells us about the monotonicity of $p$ -- in the directions of the axes!

*Functions of two variables are functions...*

This idea comes with certain questions to be answered. What is the *independent variable*? Taking a clue from our analysis of parametric curves, we answer: it is the “combination” of the two inputs of the function, i.e., $x$ and $y$ that form a pair, $X=(x,y)$, which is a point on the $xy$-plane. What is the *dependent variable*? It is $z$.

We represent a function $p$ diagrammatically as a *black box* that processes the input and produces the output:
$$
\newcommand{\ra}[1]{\!\!\!\!\!\xrightarrow{\quad#1\quad}\!\!\!\!\!}
\newcommand{\da}[1]{\left\downarrow{\scriptstyle#1}\vphantom{\displaystyle\int_0^1}\right.}
%
\begin{array}{ccccccccccccccc}
\text{inputs} & & \text{function} & & \text{output} \\
x\\
&\searrow\\
& & \begin{array}{|c|}\hline\quad p \quad \\ \hline\end{array} & \mapsto & z\\
&\nearrow\\
y
\end{array}$$
Instead, we would like to see a single input variable, $(x,y)$, decomposed into two, $x$ and $y$, to be processed by the function *at the same time*:
$$\begin{array}{ccccccccccccccc}
& (x,y) & \mapsto & \begin{array}{|c|}\hline \quad p \quad \\ \hline\end{array}
& \mapsto & z
\end{array}$$
The difference from all the functions we have seen until now is the nature of the input.

Next, what is the *domain* of $p$? It is supposed to be a recording of all possible inputs, i.e., all pairs $(x,y)$ for which the output $z=p(x,y)$ of the function makes sense. This requirement creates a subset of the $xy$-plane and, therefore, a relation between $x$ and $y$.

**Example.** What is the domain of this function:
$$p=\sqrt{x+y}?$$
Only $(x,y)$s are allowed that satisfy $x+y\ge 0$. Therefore, the domain of $p$ is a half-plane. $\square$

What about the *range* of $p$? It is supposed to be a recording of all possible outputs of $p$. The terminology will be different though.

**Definition.** The *image* of a function of two variables $z=p(x,y)$ is the set of all such values $z$ on the $z$-axis.

What about the *graph* of $p$? It is supposed to be a recording of all possible combinations of inputs and outputs of $p$.

**Definition.** The *graph* of a function of two variables $z=p(x,y)$ is the set of all such points $(x,y,p(x,y))$ in the $xyz$-space.

## 4 Introduction to calculus of several variables

**Example.** We will take the two examples from the last two sections and ask, what price of bread have daily visitors to the bakery shop seen over time? These are the variables:

- $t$ is time,

two variables representing these two *commodities*:

- $x$ is the price of wheat,
- $y$ is the price of sugar,

and a *product*:

- $z$ is the price of a loaf of bread.

The visitors see how $z$ depends on $t$, via some function of a *single variable*:
$$z=h(t).$$
What is it?

This is the summary:
$$\begin{array}{|ccccc|}
\hline
&\text{Example 1} & & \bigg|\\
\hline
t&\longrightarrow & (x,y) &\longrightarrow & z\\
\hline
&\bigg| & &\text{Example 2}\\
\hline
\end{array}$$
We realize that the problem is about *compositions*!

Recall that the price of wheat and the price of sugar are represented by a parametric curve: $$x=f(t)=\frac{1}{t+1},\ y=g(t)=-(t-1)^2+2.$$ Furthermore, the price of bread is computed from the other two prices by the function of two variables: $$z=F(x,y)=2x+y+1.$$ The two functions are visualized as follows:

Then, of course, $h$ is the composition of these two: $$t\mapsto (x,y)\mapsto z,$$ computed via the following substitution: $$h(t)=F(f(t),g(t)).$$ To visualize what happens, imagine the parametric curve -- on the $xy$-plane -- being “lifted” to the graph of $F$:

The elevation is then the value of $h$. The end result is below:

$\square$
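The substitution is straightforward to carry out in code. Below is a minimal Python sketch with the functions of the example; the sample values of $t$ are chosen here for illustration.

```python
def f(t):
    return 1 / (t + 1)          # the price of wheat

def g(t):
    return -(t - 1) ** 2 + 2    # the price of sugar

def F(x, y):
    return 2 * x + y + 1        # the price of bread

def h(t):
    # The composition: t -> (x, y) -> z.
    return F(f(t), g(t))

for t in [0, 1, 2]:
    print(t, round(h(t), 2))
```

Written out, the substitution gives $h(t)=\frac{2}{t+1}-(t-1)^2+3$.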

In the past, we have found the derivative of the composition of two functions by the *Chain Rule*: we expressed it in terms of the derivatives of the two functions involved (their product). We then conjecture that in order to find the derivative of the composition above we need to understand the meaning of the following:

- the derivative of a parametric curve, and
- the derivative of a function of two variables.

When a parametric curve is formed from two functions of one variable: $$x=f(t),\ y=g(t),$$ are the derivatives of $f$ and $g$ visible in the shape (and the slope) of the path? Vice versa, can the derivatives be deduced from the slopes of the points of the path?

The slopes of the graphs of $f$ and $g$ produce the slope of the parametric curve according to a simple rule which is easy to discover from the case when both functions are linear:

In other words, if $m$ and $n$ are the slopes of $f$ and $g$ respectively, then the slope of $(f,g)$ is $\frac{n}{m}$. Indeed, $$\frac{\text{change of }y}{\text{change of }x}=\frac{\text{change of }y / \text{change of }t}{\text{change of }x / \text{change of }t}.$$

When the functions are non-linear, the rule is the same but it is applied one point at a time: $$\text{slope at } \big(f(t),g(t)\big)\ =\ \frac{g'(t)}{f'(t)}.$$ To see why, it suffices to zoom in on a point of the parametric curve as well as the corresponding points of the graphs of the two functions:

In other words, we have this:
$$\frac{dy}{dx}= \left.\frac{dy}{dt} \middle/ \frac{dx}{dt}\right. .$$
The formula resembles the *Chain Rule*, not by coincidence.

**Exercise.** Prove the formula for the case when $f$ is one-to-one.

The two special cases are the following:

- $f'(t)=0\ \Longrightarrow\ $ the slope is vertical (provided $g'(t)\ne 0$), and
- $g'(t)=0\ \Longrightarrow\ $ the slope is horizontal (provided $f'(t)\ne 0$).

The former case was seen as “extreme” in calculus of one variable. It's not extreme in calculus of parametric curves!
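The rule $\frac{dy}{dx}=\frac{dy}{dt}\big/\frac{dx}{dt}$ can be checked numerically. The sketch below compares the slope of a short segment of the path with the ratio of the two derivatives. The curve is the wheat-sugar curve of this section (with $g$ taken as a function of $t$); the step `dt` and the sample time are arbitrary assumptions.

```python
def f(t):
    return 1 / (t + 1)

def g(t):
    return -(t - 1) ** 2 + 2

def df(t):
    return -1 / (t + 1) ** 2      # f'(t)

def dg(t):
    return -2 * (t - 1)           # g'(t)

def secant_slope(t, dt=1e-6):
    """Slope of the segment of the path between times t and t + dt."""
    return (g(t + dt) - g(t)) / (f(t + dt) - f(t))

t = 0.5                           # arbitrary sample time
print(secant_slope(t), dg(t) / df(t))
```

The two printed numbers should nearly agree. At $t=1$ we have $g'(1)=0$, which is the horizontal special case.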

**Example.** Recall that the price of wheat and the price of sugar are represented by two functions and a parametric curve. With these functions sampled, we compute their rates of change:
$$\begin{array}{c|c|c}
t&x&x'\\
\hline
0&1.00\\
\downarrow&\downarrow\\
1&.50&\frac{.5-1}{1-0}=-.5\\
\downarrow&\downarrow\\
2&.33&\frac{.33-.5}{2-1}=-.17
\end{array}\quad\begin{array}{c|c|c}
t&y&y'\\
\hline
0&1.00\\
\downarrow&\downarrow\\
1&2.00&\frac{2.00-1.00}{1-0}=1\\
\downarrow&\downarrow\\
2&1.00&\frac{1.00-2.00}{2-1}=-1
\end{array}$$
Note that there are fewer numbers than in the original because there are fewer segments than points.

Let's now confirm this result via actual differentiation of our functions: $$\begin{cases} x&=f(t)&=\frac{1}{t+1},\\ y&=g(t)&=-(t-1)^2+2. \end{cases}\ \Longrightarrow\ \begin{cases} x'&=f'(t)&=-\frac{1}{(t+1)^2},\\ y'&=g'(t)&=-2(t-1). \end{cases}$$ We have computed the derivatives of the two component functions. Combined they also form a parametric curve!

The signs of the two new functions tell us the increasing/decreasing behavior of the two original functions and, therefore, the direction of the parametric curve. For example, $x'<0$ so that the curve moves to the left and... moves up initially because $y'>0$ and... moves down later because $y'<0$. Let's visualize and confirm these results with a spreadsheet (without using the computed derivatives above):

In order to approximate either of the two derivatives, we use the slope of the segment between two adjacent points. It is the *average rate of change* (also known as the difference quotient, the slope of the secant line, etc.):
$$\frac{\text{change of }x}{\text{change of }t} \text{ and } \frac{\text{change of }y}{\text{change of }t}.$$
The change of $t$ is fixed as $h=\Delta t$. The rate of change of either $x$ or $y$ is computed via the same spreadsheet formula as before:
$$\texttt{=(RC[-1]-R[-1]C[-1])/R2C1}.$$
Note that there is one fewer cell in this column because there is one fewer segment than points.

In addition to $x'<0\ \Longrightarrow\ x\searrow$, we also get $x'\nearrow\ \Longrightarrow\ x\smile$ (concave up). Similarly, $y'>0\ \Longrightarrow\ y\nearrow$ initially and then the opposite, while $y'\searrow\ \Longrightarrow\ y\frown$ (concave down). Also, the apparent linearity of $y'$ indicates that $y$ might be quadratic... From the monotonicity of the component functions we conclude that initially the parametric curve goes $\nwarrow$ and then $\swarrow$, which confirms the picture. $\square$
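The spreadsheet computation of the average rates of change can also be sketched in Python. The helper name `difference_quotients` is ours, and the data is the sampled wheat and sugar prices at $t=0,1,2$ with $h=1$.

```python
def difference_quotients(values, h):
    """Average rates of change over consecutive segments of length h."""
    return [(values[k + 1] - values[k]) / h for k in range(len(values) - 1)]

h = 1                                      # increment of time, as in the tables
times = [0, 1, 2]
xs = [1 / (t + 1) for t in times]          # price of wheat
ys = [-(t - 1) ** 2 + 2 for t in times]    # price of sugar

print(difference_quotients(xs, h))         # one fewer entry than xs
print(difference_quotients(ys, h))
```

The signs of the outputs match the directions read off the tables: $x$ decreases on both segments, while $y$ first increases and then decreases.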

**Exercise.** What conclusions about the shape of the parametric curve can you draw from the concavity of its component functions?

Thus, we can say about a parametric curve that *its derivative is made up of the derivatives of the functions involved*. There are only two for a parametric curve... and infinitely many for a function of two variables!

**Example.** The dependence of the price of bread on the prices of wheat and sugar is represented by the function $z=F(x,y)$ below. As it is sampled, we can compute the rates of change of this function in the horizontal and vertical directions:
$$\text{over }x:\ \begin{array}{l|cccc}
y\backslash x&0&\to&1&\to&2\\
\hline
0&1&\to&3&\to&5\\
\\
1&2&\to&4&\to&6\\
\\
2&3&\to&5&\to&7\\
\end{array}\quad\text{over }y:\ \begin{array}{l|cccc}
y\backslash x&0&\quad&1&\quad&2\\
\hline
0&1&\quad&3&\quad&5\\
\downarrow&\downarrow&\quad&\downarrow&\quad&\downarrow&\\
1&2&\quad&4&\quad&6\\
\downarrow&\downarrow&\quad&\downarrow&\quad&\downarrow&\\
2&3&\quad&5&\quad&7\\
\end{array}$$
Recall that by fixing one of the variables we create a function of one variable with respect to the other variable. Now we approximate the derivatives of these functions just as before, via the *average rate of change*:
$$\frac{\text{change of }z}{\text{change of }x} \text{ and } \frac{\text{change of }z}{\text{change of }y}.$$
First, we approximate the derivative in the direction of $y$:
$$\begin{array}{l|c|ccc}
y\ (x=0)&z&z'&\\
\hline
0\ \downarrow &1\ \downarrow &\\
1\ \downarrow &2\ \downarrow &\frac{2-1}{1-0}=1\\
2\ \downarrow &3\ \downarrow &\frac{3-2}{2-1}=1\\
\end{array}\quad
\begin{array}{l|c|ccc}
y\ (x=1)&z&z'&\\
\hline
0\ \downarrow &3\ \downarrow \\
1\ \downarrow &4\ \downarrow &\frac{4-3}{1-0}=1\\
2\ \downarrow &5\ \downarrow &\frac{5-4}{2-1}=1\\
\end{array}\quad
\begin{array}{l|c|ccc}
y\ (x=2)&z&z'&\\
\hline
0\ \downarrow &5\ \downarrow \\
1\ \downarrow &6\ \downarrow &\frac{6-5}{1-0}=1\\
2\ \downarrow &7\ \downarrow &\frac{7-6}{2-1}=1\\
\end{array}$$
All $1$s. Note that there are fewer numbers than in the original because there are fewer segments than points. Similarly, we approximate the derivative in the direction of $x$:
$$\begin{array}{r|lll}
(y=0)\ x&0&\to\ 1&\to\ 2\\
\hline
z&1&\to\ 3&\to\ 5\\
\hline
z'&&\frac{3-1}{1-0}=2&\frac{5-3}{2-1}=2
\end{array}\quad
\begin{array}{r|cccc}
(y=1)\ x&0&1&2\\
\hline
z&2&4&6\\
\hline
z'&&2&2
\end{array}\quad
\begin{array}{r|cccc}
(y=2)\ x&0&1&2\\
\hline
z&3&5&7\\
\hline
z'&&2&2
\end{array}$$
All $2$s. We put these one-variable functions together; then the rates of change of $F$ with respect to $x$ and $y$ are these new functions of two variables respectively:
$$\leadsto\quad\begin{array}{l|cccc}
y\backslash x&0&1&2\\
\hline
0&&2&2\\
1&&2&2\\
2&&2&2\\
\end{array}\quad\&\quad
\begin{array}{l|cccc}
y\backslash x&0&1&2\\
\hline
0&\\
1&1&1&1\\
2&1&1&1\\
\end{array}\quad\leadsto\quad
\begin{array}{l|cccc}
y\backslash x&0&1&2\\
\hline
0&\\
1&&(2,1)&(2,1)\\
2&&(2,1)&(2,1)\\
\end{array}$$
The two functions are further combined on the right. As we shall see later, going $2$ horizontally and $1$ vertically is the direction of the fastest growth of the function $F$.

Let's now confirm this result via actual differentiation of our function:
$$F(x,y)=2x+y+1.$$
Just as before, we fix one of the variables and differentiate with respect to the other. We call these two functions *the partial derivatives of $F$ with respect to $x$ and $y$* respectively.
For $x$, we declare $y$ fixed and differentiate over $x$ using the following two types of **notation** (following Leibniz and Lagrange as before):
$$\frac{\partial F}{\partial x}=F'_x=\frac{\partial}{\partial x}\left(2x+y+1\right)=\frac{\partial}{\partial x}(2x)+0+0=2.$$
For $y$, we declare $x$ fixed and differentiate over $y$:
$$\frac{\partial F}{\partial y}=F'_y=\frac{\partial}{\partial y}\left(2x+y+1\right)=0+\frac{\partial}{\partial y}(y)+0=1.$$
The conclusion might sound familiar: the derivative of a linear function is constant! The combination of the two partial derivatives will be seen as the derivative of $F$ called the *gradient* of $F$:
$$\operatorname{grad} F=(F'_x,F'_y).$$
This is a new kind of function to be discussed later...

We can confirm these results by examining the spreadsheet. Each line (wire) below on the right is the graph of a function of one variable:

And each has its own derivative! As we move horizontally, the values of $x$ grow by $.1$ while the values of $z$ grow by $.2$. Therefore, $F'_x=2$. Similarly, as we move vertically, the values of $y$ grow by $.1$ and so do the values of $z$. Therefore, $F'_y=1$. $\square$
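The two tables of rates of change can be reproduced with a short Python sketch. The grid is the $3\times 3$ sample used above; the variable names `table`, `over_x`, and `over_y` are ours.

```python
def F(x, y):
    return 2 * x + y + 1

xs = [0, 1, 2]
ys = [0, 1, 2]
table = [[F(x, y) for x in xs] for y in ys]   # rows indexed by y, columns by x

# Rate of change over x: difference quotients along each row (y fixed).
over_x = [[(row[j + 1] - row[j]) / (xs[j + 1] - xs[j])
           for j in range(len(xs) - 1)]
          for row in table]

# Rate of change over y: difference quotients down each column (x fixed).
over_y = [[(table[i + 1][j] - table[i][j]) / (ys[i + 1] - ys[i])
           for j in range(len(xs))]
          for i in range(len(ys) - 1)]

print(over_x)   # every entry is 2
print(over_y)   # every entry is 1
```

As in the tables, `over_x` has one fewer column and `over_y` has one fewer row than the original data.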

Warning: Do not confuse partial differentiation with implicit differentiation that comes under *related rates*:
$$\frac{\partial}{\partial x}\left(xy\right)=y\quad \text{vs. }\quad \frac{d}{dx}\left(xy\right)=y+x\frac{dy}{dx}.$$
The extra term on the right comes from the fact that the two variables are *related*! In the former case, they aren't -- as two *independent* variables -- and, therefore, $\frac{dy}{dx}=0$. In either case, the differentiation is on the $xy$-plane going in *some* direction: in the former case it can be only vertical or horizontal and in the latter case the direction is determined by the relation between the variables (for example, if $x-y=1$, the direction is $45$ degrees).

**Example.** Let's consider this function again:
$$F(x,y)=\sin(xy).$$
Compute the partial derivatives by the *Chain Rule*. First we declare $y$ an unknown and unspecified but fixed *parameter* and carry out differentiation with respect to $x$:
$$\frac{\partial F}{\partial x}=\frac{\partial}{\partial x}\big(\sin(xy)\big)=\cos(xy)\cdot \frac{\partial}{\partial x}(xy)=\cos(xy)y.$$
This time $x$ is the parameter:
$$\frac{\partial F}{\partial y}=\frac{\partial}{\partial y}\big(\sin(xy)\big)=\cos(xy)\cdot \frac{\partial}{\partial y}(xy)=\cos(xy)x.$$
Let's confirm these results by examining the graph of $F$ plotted with a spreadsheet:

Note that the edge of the surface is a curve and it is the graph of the function given in the very last row of the table. We also notice that:

- the surface is flat along the $x$-axis, because $\frac{\partial F}{\partial x}=0$ for $y=0$, and
- the surface is flat along the $y$-axis, because $\frac{\partial F}{\partial y}=0$ for $x=0$.

We approximate these derivatives just as before, by the average rate of change, with the change of $x$ and $y$ fixed as $h=\Delta x=\Delta y$. In a spreadsheet, the rate of change of $z$ in the direction of $x$ and $y$ is computed via the same formula but applied horizontally and vertically respectively: $$\texttt{=(R[-23]C-R[-23]C[-1])/R1C1 } ; \quad \texttt{=(RC[-23]-R[-1]C[-23])/R1C1 }.$$ The formulas produce the two partial derivatives:

At the point $(.1,.1)$, the values of the two partial derivatives are equal, which is why the direction of the fastest growth of $F$ is at $45$ degrees. Note also that the highest locations form a ridge; it is where both partial derivatives are equal to $0$. To find the location of the ridge we solve the equation $$\cos(xy)=0\ \Longrightarrow\ xy=\pi/2.$$ It's a hyperbola. $\square$
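The formulas for the two partial derivatives of $\sin(xy)$ can be compared with difference quotients in Python. This is only a sketch: it uses *central* difference quotients rather than the one-sided ones of the spreadsheet, and the sample point and the step `h` are arbitrary assumptions.

```python
import math

def F(x, y):
    return math.sin(x * y)

def partial_x(x, y, h=1e-6):
    """Central difference quotient in the x-direction, with y fixed."""
    return (F(x + h, y) - F(x - h, y)) / (2 * h)

def partial_y(x, y, h=1e-6):
    """Central difference quotient in the y-direction, with x fixed."""
    return (F(x, y + h) - F(x, y - h)) / (2 * h)

x, y = 0.7, 1.3                               # arbitrary sample point
print(partial_x(x, y), math.cos(x * y) * y)   # numerical vs. formula
print(partial_y(x, y), math.cos(x * y) * x)
```

At any point with $x=y$, such as $(.1,.1)$, the two quotients coincide, in agreement with the $45$-degree direction noted above.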

The idea that the values of the partial derivatives indicate the direction of the fastest growth of the function can be illustrated approximately as follows: $$\begin{array}{l|ccc} y\quad\quad\backslash x & F'_x>0 & F'_x<0\\ \hline F'_y>0 & \nearrow & \nwarrow\\ F'_y<0 & \searrow & \swarrow\\ \end{array}$$

## 5 Differential equations

A differential equation is an equation that relates values of the derivative of the function to the function's values. To solve the equation is to find all possible functions that satisfy it.

**Example.** The simplest example of a differential equation is:
$$f'(x)=0 \text{ for all } x.$$
We already know the solution:

- constant functions are solutions, and, conversely,
- only constant functions are solutions.

$\square$

**Example.** We can replace $0$ with any function:
$$f'(x)=g(x) \text{ for all } x.$$
We already know the solution:

- a solution $f$ has to be an antiderivative of $g$, and, conversely,
- only antiderivatives of $g$ are solutions.

$\square$

However, let's take a look at the problem with a fresh eye.

For the first kind of differential equation, we will be looking for functions $y=y(x)$ that satisfy the equation,
$$y'(x)=g(x) \text{ for all } x.$$
What we know from the equation is the value of the derivative $y'(x)$ of $y$ at every point $x$ but we don't know the value $y(x)$ of the function itself. Then, for every $x$, we know the slope of the tangent line at $(x,y(x))$. As $y(x)$ is unknown, in order to visualize the data, we plot the same slope for every point $(x,y)$ on a given *vertical* line:

Thus, for each $x=c$, we indicate the angle $\alpha$, with $g(c)=\tan \alpha$, of the intersection of the graph of the unknown function $y=y(x)$ and the vertical line $x=c$.

Metaphorically, we are to create a fabric from two types of threads. The *vertical* ones have already been placed and the way the threads of the second type are to be woven in has also been set.

The challenge is to devise a function that would *cross these lines at these exact angles*.

Is it always possible to have these functions? Yes, at least when $g$ is continuous. How do we find them? Anti-differentiation.

**Example.** A familiar interpretation is that of an object the velocity $v(x)=y'(x)$ of which is known at any moment of *time* $x$ and its location $y=y(x)$ is to be found.

These solutions *fill* the plane without intersections. They are just the vertically shifted versions of one of them. $\square$
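A numerical sketch of this situation uses the recursive formula $y_{n+1}=y_n+h\,g(x_n)$ from the beginning of the chapter. The velocity $g(x)=2x$, the step size, and the two initial locations are hypothetical choices made only for illustration; the exact solutions in this case are $y=x^2+C$.

```python
def solve(g, x0, y0, h, steps):
    """Discrete integration: y_{n+1} = y_n + h * g(x_n)."""
    xs, ys = [x0], [y0]
    for _ in range(steps):
        ys.append(ys[-1] + h * g(xs[-1]))
        xs.append(xs[-1] + h)
    return xs, ys

def g(x):
    return 2 * x          # hypothetical velocity; exact solutions are x**2 + C

xs, low = solve(g, 0.0, 0.0, 0.001, 1000)    # starts at y = 0
_, high = solve(g, 0.0, 1.0, 0.001, 1000)    # starts at y = 1

print(low[-1])                                # close to 1**2 = 1
print(max(abs((b - a) - 1.0) for a, b in zip(low, high)))   # the shift stays about 1
```

The second number illustrates that the two approximate solutions differ by a constant vertical shift.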

For another kind of a differential equation, what if the derivative depends on the values of the function only (and not the values of the variable)?

We are looking for functions $y=y(x)$ that satisfy the equation,
$$y'(x)=h(y(x)) \text{ for all } x.$$
What we know from the equation is the value of the derivative $y'(x)$ of $y$ at every point $(x,y)$ even though we don't know the value $y(x)$ of the function itself. Then, for every $y$, we know the slope of the tangent line at $(x,y)$. As $y(x)$ is unknown, in order to visualize the data, we plot the same slope for every point $(x,y)$ on a given *horizontal* line:

Thus, for each $y=d$, we indicate the angle $\alpha$, with $h(d)=\tan \alpha$, of the intersection of the graph of the unknown function $y=y(x)$ and the vertical line $x=c$ with $y(c)=d$.

Metaphorically, we are to create a fabric from two types of threads. The *horizontal* ones have already been placed and the way the threads of the second type are to be woven in has also been set.

Is it always possible to have these functions? Yes, at least when $h$ is differentiable. The challenge is to devise a function that would *cross these lines at these exact angles*. How this can be done numerically is discussed later.

**Example.** An interpretation is a stream of liquid with its velocity known at every *location* $y$ and we need to trace the path of a particle initially located at a specific place $y_0$.

$\square$

**Example.** This is what happens when the velocity is equal to the location:
$$y'=y.$$
To solve, what is the function equal to its derivative? It's the exponential function $y=e^x$, of course. However, all of its multiples $y=Ce^x$ are also solutions:

These multiples of the exponential function are shown on the left, not to be confused with the exponential functions of various bases (middle). Just as with the differential equations of the first kind, these solutions *fill* the plane. $\square$
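The solutions of $y'=y$ can be traced numerically with the same kind of recursive formula, $y_{n+1}=y_n+\text{step}\cdot y_n$. The sketch below is an assumption-laden illustration: the step size and the initial values $C$ are arbitrary, and the helper name `trace` is ours.

```python
import math

def trace(h_func, y0, step, steps):
    """Follow y' = h_func(y) from y(0) = y0 via y_{n+1} = y_n + step * h_func(y_n)."""
    ys = [y0]
    for _ in range(steps):
        ys.append(ys[-1] + step * h_func(ys[-1]))
    return ys

def velocity(y):
    return y               # the equation y' = y

step, steps = 0.001, 1000  # x runs from 0 to 1
for C in [0.5, 1.0, 2.0]:
    approx = trace(velocity, C, step, steps)[-1]
    print(C, approx, C * math.e)   # approximation vs. the value of C*e^x at x = 1
```

The approximations scale with $C$, just as the exact solutions $y=Ce^x$ do.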

**Exercise.** What is the transformation of the plane that creates all these curves from one?

**Exercise.** Can the velocity be really “equal” to the location?

**Exercise.** What happens when the velocity is proportional to the location; i.e.,
$$y'=ky?$$

Compare and contrast: $$\begin{array}{lll} \text{DE: }&y'(x)=g(x)&y'(x)=h(y(x))\\ \hline \text{the slopes are the same along any }&\text{ vertical line }&\text{ horizontal line }\\ \text{the velocity is known at any }&\text{ time }&\text{ location }\\ \end{array}$$

Of course, differential equations that follow neither of these patterns are also common:

Even though we can't solve those equations yet, once we have a family of curves, it is easy to find a differential equation it came from.

**Example.** What differential equation does the family of all concentric circles around $0$ satisfy?

This family is given by: $$x^2+y^2=r^2,\ \text{ real } r.$$ We simply differentiate the equation implicitly: $$x^2+y^2=r^2\ \Longrightarrow\ 2x+2yy'=0.$$ That's the equation. $\square$

**Example.** What about this family of hyperbolas?

This family is given by: $$xy=C,\ \text{ real } C.$$ Again, we differentiate the equation implicitly. We have: $$y+xy'=0.$$ $\square$
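Both equations can be tested numerically on members of the corresponding families. In the sketch below, the derivative $y'$ is approximated by a central difference quotient; the radii, the constants $C$, and the sample point $x=0.5$ are arbitrary assumptions, and only the upper halves of the circles are used.

```python
import math

def derivative(func, x, h=1e-6):
    """Central difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

def circle_residual(r, x):
    """Left-hand side of 2x + 2yy' = 0 on the upper half of x^2 + y^2 = r^2."""
    y = lambda s: math.sqrt(r * r - s * s)
    return 2 * x + 2 * y(x) * derivative(y, x)

def hyperbola_residual(C, x):
    """Left-hand side of y + xy' = 0 on the curve xy = C."""
    y = lambda s: C / s
    return y(x) + x * derivative(y, x)

for r in [1.0, 2.0, 3.0]:
    print(r, circle_residual(r, 0.5))      # close to 0
for C in [-1.0, 1.0, 4.0]:
    print(C, hyperbola_residual(C, 0.5))   # close to 0
```

The residuals are near zero regardless of $r$ or $C$: a single differential equation describes the whole family.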

## 6 The centroid of a flat object

Suppose we have a plate with uniform density and identical thickness (it is known as a “lamina”). How can we balance it on a single support called the *centroid*?

There are a few heuristics that help. If the object has a “center”, such as a circle or a square, this is it.

Also, any axis of symmetry will have to contain the centroid.

The idea of centroid is related to the concept of the *center of mass* which is the center of rotation of the object when subjected to a force. We studied this concept previously but with the weight distributed within a straight segment, such as a seesaw:

We found that if one person is heavier than the other, the lighter person should sit farther from the center in order to balance the beam. In fact, if the former is twice as heavy, the distance should be twice as long!

Consider a variation of the *seesaw*. It is made of two beams nailed together to form a cross with a single point of support in the middle:

It appears that four persons of equal weight will be in balance when located at equal distance from the point of support. But there is more: they will be balanced as long as either *pair* of persons facing each other is in balance! We can then use what we have learned from the $1$-dimensional case.

We explore this idea by replacing this construction with a square. Then the seesaws that we considered previously can be interpreted as this square *balanced on a bar* that goes all the way across:

We can spread the weight along the line parallel to the bar because only the distance to this bar matters for the leverage of each weight. Once we add the $x$- and $y$-axes to the picture, this distance is simply the $x$-coordinate:

Now, our problem is that of balancing the region below the graph of a function.

Let's review how we do this. Suppose we have a non-negative function $y=f(x)$ integrable on segment $[a,b]$. For a given point $c$ the integral
$$\int_a^b f(x)(x-c)\, dx$$
is called the *total moment of the region with respect to* $c$. The *center of mass of the region* is such a point $c$ that the total moment with respect to $c$ is zero:
$$c=\frac{\int_a^b f(x)x\, dx}{\int_a^b f(x)\, dx}.$$

**Example.** Let's review how we can balance a triangle on its horizontal edge. Suppose it is the region under the graph $y=f(x)=x$ from $0$ to $1$:

We compute the total moment of the object: $$\begin{array}{ll} \int_0^1 f(x)x\, dx&=\int_0^1 x\cdot x\, dx\\ &=\int_0^1 x^2\, dx\\ &=x^3/3\Bigg|_0^1\\ &=1/3. \end{array}$$ Meanwhile, the mass is simply $1/2$. Therefore, the center of mass -- on the $x$-axis -- is $$c_1=\frac{1}{3}\div \frac{1}{2}=\frac{2}{3}.$$

What if we want to balance the triangle on its other edge? We place the $x$-axis along that edge, then the slanted edge is given by $y=g(x)=1-x$. We compute the total moment of the object: $$\begin{array}{ll} \int_0^1 g(x)x\, dx&=\int_0^1 (1-x)x\, dx\\ &=\int_0^1 (x-x^2)\, dx\\ &=x^2/2-x^3/3\Bigg|_0^1\\ &=1/2-1/3\\ &=1/6. \end{array}$$ Meanwhile, the mass is still $1/2$. Therefore, the center of mass -- along this edge -- is $$c_2=\frac{1}{6}\div \frac{1}{2}=\frac{1}{3}.$$

We can balance the triangle on either of two bars. Now we remove the bars and replace them with a single support placed at their intersection. $\square$
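The two computations for the triangle can be confirmed with Riemann sums. The following Python sketch uses midpoint sums; the helper name `center_of_mass` and the number of intervals are our assumptions.

```python
def center_of_mass(f, a, b, n=100000):
    """Midpoint Riemann sums for (integral of x*f) / (integral of f) over [a, b]."""
    dx = (b - a) / n
    mass = moment = 0.0
    for k in range(n):
        x = a + (k + 0.5) * dx        # midpoint of the k-th interval
        mass += f(x) * dx
        moment += x * f(x) * dx
    return moment / mass

c1 = center_of_mass(lambda x: x, 0, 1)       # triangle under y = x
c2 = center_of_mass(lambda x: 1 - x, 0, 1)   # triangle under y = 1 - x
print(c1, c2)                                # close to 2/3 and 1/3
```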

**Definition.** Suppose a function $y=f(x)$ is integrable on $[a,b]$. Then the *total moment* of the region under the graph of $f$ with respect to the line $x=c$ is defined to be:
$$\int_a^b (x-c)f(x)\, dx;$$
then such a line is an *axis* of the region if the total moment is zero.

**Example.** Let's try the half-circle. One of the axes will go through the center perpendicular to the diameter. We just need to find the other. That's why we place the quarter of the disk adjacent to the origin:

Then the total moment is: $$\int_0^1 (x-c)\sqrt{1-x^2}\, dx=0.$$ Therefore, $$\int_0^1 x\sqrt{1-x^2}\, dx=c\int_0^1 \sqrt{1-x^2}\, dx.$$ The integral in the right-hand side is simply the area of the quarter circle and the one in the left-hand side is easily evaluated by substitution ($u=1-x^2$): $$\begin{array}{lll} \int_0^1 x\sqrt{1-x^2}\, dx&=\int_1^0 -\frac{1}{2}\sqrt{u}\, du \\ &= -\frac{1}{2}\frac{2}{3}u^{3/2}\Bigg|_1^0 \\ &=\frac{1}{3}. \end{array}$$ Therefore, we have: $$\frac{1}{3}=c\pi /4\ \Longrightarrow\ c=\frac{4}{3\pi}\approx .42.$$ $\square$

**Exercise.** Find the axes of the region bounded by the graph $y=\frac{1}{x+.5}-.5$ and the axes.

We have been able to use only this definition to find the axes of regions with symmetries. Now the general case. We will have to use the $x$- and $y$-axes available to us.

**Definition.** Suppose a function $y=f(x)$ is decreasing on $[0,A]$ and $f(0)=B>0$. Then the *centroid* of the region bounded by the graph of $f$, the $x$-axis, and the $y$-axis is a point $(c_x,c_y)$ such that the total moments of the region with respect to the lines $x=c_x$ and $y=c_y$ are zero; i.e.,
$$\int_0^A (x-c_x)f(x)\, dx=0,$$
and
$$\int_0^B (y-c_y)f^{-1}(y)\, dy=0.$$

Then, the coordinates of the centroid are: $$c_x=\frac{1}{S}\int_0^A xf(x)\, dx,$$ and $$c_y=\frac{1}{S}\int_0^B yf^{-1}(y)\, dy,$$ where $S=\int_0^A f(x)\, dx$ is the area of the region.

**Example.** Let's find the centroid of the region bounded by the graph of $y=1-x^2$. $\square$
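A numerical sketch of this example follows, under the assumption that the region meant is the one in the first quadrant bounded by the graph and the two coordinate axes, so that $0\le x\le 1$, $0\le y\le 1$, and $f^{-1}(y)=\sqrt{1-y}$. The integrals are approximated by midpoint Riemann sums; the helper name `integral` is ours.

```python
import math

def integral(func, a, b, n=100000):
    """Midpoint Riemann sum of func over [a, b]."""
    dx = (b - a) / n
    return sum(func(a + (k + 0.5) * dx) for k in range(n)) * dx

f = lambda x: 1 - x * x               # the graph, for 0 <= x <= 1
f_inv = lambda y: math.sqrt(1 - y)    # its inverse, for 0 <= y <= 1

area = integral(f, 0, 1)
c_x = integral(lambda x: x * f(x), 0, 1) / area
c_y = integral(lambda y: y * f_inv(y), 0, 1) / area
print(area, c_x, c_y)                 # close to 2/3, 3/8, 2/5
```

The exact values follow from $\int_0^1 x(1-x^2)\, dx=\frac{1}{4}$ and $\int_0^1 y\sqrt{1-y}\, dy=\frac{4}{15}$, each divided by the area $\frac{2}{3}$.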

The general case of a plate of an arbitrary shape will be addressed in Chapter 20.

## 7 Alternative coordinate systems

*Polar coordinates*:

- $x = r \cos \theta$,
- $y = r \sin \theta$,

where

- $0 \le \theta \le 2\pi$,
- $r \ge 0$.

**Example (angular velocity).**

*Spherical coordinates*:

- $x = r \cos \theta \sin \phi$,
- $y = r \sin \theta \sin \phi$,
- $z = r \cos \phi$,

where

- $0 \le \theta \le 2\pi$,
- $0 \le \phi \le \pi$,
- $r \ge 0$.
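The two conversions can be written as short Python functions. This is a sketch with $r\ge 0$ and $z=r\cos\phi$; the function names and the sample values are our assumptions.

```python
import math

def polar_to_cartesian(r, theta):
    """Polar (r, theta) to Cartesian (x, y)."""
    return r * math.cos(theta), r * math.sin(theta)

def spherical_to_cartesian(r, theta, phi):
    """theta is the angle in the xy-plane, phi is measured from the z-axis."""
    x = r * math.cos(theta) * math.sin(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(phi)
    return x, y, z

x, y = polar_to_cartesian(2, math.pi / 2)
print(x, y)                                    # approximately (0, 2)

x, y, z = spherical_to_cartesian(3, 0.4, 1.1)  # arbitrary sample angles
print(math.sqrt(x * x + y * y + z * z))        # the distance to the origin, about 3
```

In both systems, $r$ is recovered as the distance from the point to the origin.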

## 8 Discrete forms

Some numerical simulations of motion that we have seen can be computed with tools similar to but simpler than functions defined at the nodes or the secondary nodes of a partition.

**Example.** Recall what we started with in the very beginning. Suppose the speedometer is broken and in order to estimate how fast we are driving, we look at the odometer every hour:

That's a discrete $0$-form. To find the displacement for every hour we just look at the differences:

That's a discrete $1$-form. Alternatively, the odometer is broken and we look at the speedometer to sample the velocity and then, via the Riemann sums, find the displacement. $\square$

Let's start over.

Suppose we have a partition of some interval $[a,b]$ in the $x$-axis.

There are two types of pieces:

- the *nodes*, or $0$-cells, $x=x_k,\ k=0,1,...,n$; and
- the *edges*, or $1$-cells, $c_k=[x_{k-1},x_{k}],\ k=1,...,n$.

The increments of $x$ are $\Delta x_k=x_k-x_{k-1}$.

There are no secondary nodes this time!

**Example.** Specific representations can also be provided with a spreadsheet choosing, for example, $\Delta x=1$:

You can see how every other cell is square and every other is stretched horizontally to emphasize the different nature of these cells: nodes vs. edges. $\square$

In the motion interpretation, there is a number -- the location -- associated with each node (just as before) and a number -- the displacement -- associated with each edge.

**Definition.** For a given partition (of an interval or the whole real line), a *discrete form* of

- degree $0$ is a real-valued function with nodes as inputs; and
- degree $1$ is a real-valued function with edges as inputs.

We use arrows to picture these functions as correspondences:

Here we have two:

- a discrete $0$-form $f: 0\mapsto 2,\ 1\mapsto 4,\ 2\mapsto 3,\ ...$; and
- a discrete $1$-form $s: [0,1]\mapsto 3,\ [1,2]\mapsto .5,\ [2,3]\mapsto 1,\ ...$.

A more compact way to visualize is this:

We can also *list the values* of the two functions:

- a discrete $0$-form $f$ with $f(0)=2,\ f(1)=4,\ f(2)=3,\ ...$; and
- a discrete $1$-form $s$ with $s\Big([0,1] \Big)=3,\ s\Big([1,2] \Big)=.5,\ s\Big([2,3] \Big)=1,\ ...$.

Discrete functions can be represented by tables (spreadsheets):

The most common way to visualize a function is with its *graph*, which consists of points on the $xy$-plane with $y=f(x)$:

For a discrete $0$-form, $x$ is a node, a number, and $y=f(x)$ is also a number. Together, they produce $(x,y)$, a point on the $xy$-plane (with the $x$-axis split into cells as shown above). For a discrete $1$-form, $[A,B]$ is an interval in the $x$-axis, and $y=g([A,B])$ is a number. Together, they produce a collection of points on the $xy$-plane such as $(x,y)$ for every $x$ in $[A,B]$. The result is a horizontal segment.

To underscore the difference between the two, the graph of a discrete $0$-form is shown with dots and that of a discrete $1$-form with vertical bars:

Even though these functions may consist of unrelated pieces, it is possible that we can see a *continuous curve* if we zoom out:

**Example.** Let's consider an example of motion. Suppose a $0$-form $p$ gives the position of a person and suppose

- at time $n$ hours we are at the $5$ mile mark: $p(n)=5$, and then
- at time $n+1$ hours we are at the $7$ mile mark: $p(n+1)=7$.

We don't know what exactly has happened during this hour but the simplest assumption would be that we have been walking at a constant speed of $2$ miles per hour.

Now, instead of our velocity function $v$ assigning this value to each instant of time during this period, it is assigned to the *whole* interval:
$$v\Bigg|_{[n,n+1]}=2\ \text{, or better }\ v\Big( [n,n+1] \Big)=2$$
This way, the elements of the domain of the velocity function are the edges and the resulting function is a discrete $1$-form! $\square$

The functions, when defined on the nodes, change abruptly and, consequently, the change over every interval $[A,B]$ is simply the *difference of values* at the nodes, the value at the right end minus the value at the left end:
$$f(B)-f(A).$$
The output of this simple computation is then assigned to the interval $[A,B]$:
$$[A,B] \mapsto f(B)-f(A).$$

Just as before, the difference stands for the change of the function.

**Definition.** The *difference of a discrete $0$-form* $f$ is a discrete $1$-form given by its values at each edge:
$$\Delta f \, (c_k)=f(x_{k})-f(x_{k-1}).$$

The relation between a $0$-form and its difference is illustrated below:

This is how a spreadsheet computes the difference of a function given by the data in the first column:

**Example.** When the discrete $0$-forms are represented by formulas, the computations are straightforward ($h=1$) with a chance of simplification:
$$\begin{array}{llllll}
(1)&f(n)=3n^2+1\ &\Longrightarrow\ &\Delta f\, (c_n)=(3n^2+1)-(3(n-1)^2+1)=6n-3;\\
(2)&g(n)=\frac{1}{n}\ &\Longrightarrow\ &\Delta g\, (c_n)= \frac{1}{n}-\frac{1}{n-1}=-\frac{1}{n(n-1)} \text{ for } n\ne 0,1;\\
(3)&p(n)=2^n\ &\Longrightarrow\ &\Delta p\, (c_n)= 2^{n}-2^{n-1}=2^{n-1}.
\end{array}$$
$\square$
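The three simplifications can be checked with exact arithmetic in Python. In this sketch, the edge $c_n$ is $[n-1,n]$ (since $h=1$), the helper name `difference` is ours, and the range of $n$ is an arbitrary choice that avoids $n=0,1$ for $g$.

```python
from fractions import Fraction

def difference(f, n):
    """Value of the difference of the 0-form f on the edge c_n = [n-1, n]."""
    return f(n) - f(n - 1)

f = lambda n: 3 * n ** 2 + 1
g = lambda n: Fraction(1, n)
p = lambda n: 2 ** n

for n in range(2, 7):
    assert difference(f, n) == 6 * n - 3
    assert difference(g, n) == Fraction(-1, n * (n - 1))
    assert difference(p, n) == 2 ** (n - 1)
print("all three formulas agree for n = 2, ..., 6")
```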

**Definition.** The *sum of a discrete $1$-form* $g$ is a discrete $0$-form given by its value at each node $x_k,\ 1\le k\le n,$ of a partition of $[a,b]$ by:
$$\sum_{[a,x_k]} g=g(c_1)+g(c_2)+...+g(c_k),$$
where $c_1,c_2,...,c_n$ are the edges of the partition.

The fundamental relation is between the differences and sums.

First, we have a $0$-form and a $1$-form:

- if $f$ is defined at the nodes $x_k,\ k=0,1,2,...,n$, of the partition, then
- the difference $g$ of $f$ is defined at the edges of the partition by:

$$g(c_{k})=f(x_{k})-f(x_{k-1}).$$

**Theorem (Fundamental Theorem of Discrete Calculus I).** Suppose $f$ is a discrete $0$-form. Then, for each node $x$ of the partition, we have:
$$\sum_{[a,x]} (\Delta f) =f(x)-f(a).$$

Second, we have a $1$-form and a $0$-form:

- if $g$ is defined at the edges $c_k,\ k=1,2,...,n$, of the partition, then
- the sum $f$ of $g$ is defined recursively at the nodes of the partition by:

$$f(x_{k})=f(x_{k-1})+g(c_k).$$

**Theorem (Fundamental Theorem of Discrete Calculus II).** Suppose $g$ is a discrete $1$-form. Then, we have:
$$\Delta\left( \sum_{[a,x]} g \right)=g.$$

So, the two operations cancel each other in either order:

As we see now, the Fundamental Theorem doesn't need *quotients*.
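Both parts of the theorem can be illustrated with lists in Python. In this sketch, a $0$-form is the list of its values at the nodes $x_0,...,x_n$ and a $1$-form is the list of its values at the edges $c_1,...,c_n$. The first three values of each form are the ones used earlier; the remaining values and the function names are hypothetical.

```python
def difference(f):
    """0-form (values at nodes x_0..x_n) -> 1-form (values at edges c_1..c_n)."""
    return [f[k] - f[k - 1] for k in range(1, len(f))]

def running_sum(g, start=0):
    """1-form (values at edges) -> 0-form whose value at x_0 is `start`."""
    f = [start]
    for value in g:
        f.append(f[-1] + value)
    return f

f = [2, 4, 3, 7, 1]          # a 0-form
g = [3, 0.5, 1, -2]          # a 1-form

total = running_sum(difference(f))               # sums of the difference of f
print(total == [value - f[0] for value in f])    # Part I:  f(x) - f(a)
print(difference(running_sum(g)) == g)           # Part II: back to g
```

Only subtraction and addition are involved, with no division anywhere.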

Next, *there are no compositions of forms*. In order to create a composition $q\circ p$ of a $0$- or $1$-form $q$ with another function or form $p$, the values of $p$ have to be $0$- and $1$-cells respectively.

**Definition.** A *cell function* $y=p(x)$ is a function that assigns

- a node to each node, and
- an edge or a node to each edge,

in such a way that the end-points of each edge remain end-points: $$p\big([u,v]\big)=\big[p(u),p(v) \big].$$

The function $p$ assigns a $k$- or $(k-1)$-cell to each $k$-cell.

Because of the property, the values of a cell function on the edges can be reconstructed from its values on the nodes. The former is then analogous to the *difference* of the cell function.

For convenience, we assume that $\Delta$ is zero when computed over any node $x$.

**Theorem (Chain Rule).** The difference of the composition of two functions is the composition of the difference of the latter with the former; i.e., for any cell function $x=p(t)$ from $[a,b]$ to $[c,d]$ and any $0$-form $y=g(x)$ on $[c,d]$, the differences satisfy:
$$\Delta (g\circ p)= \Delta g\, \circ p.$$

In other words, we have for each edge $s$: $$\Delta (g\circ p)(s)= \Delta g\, (p(s)).$$
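The identity can be tried out on a small example in Python. Everything specific below is a hypothetical illustration: the nodes are integers, a cell is a pair `(a, b)` with `a == b` standing for a node, and the cell function `p` and the $0$-form `g` are made up. Consecutive values of `p` differ by $0$ or $1$, so that each edge goes to an edge or a node.

```python
def delta(form, cell):
    """Difference of a 0-form over a cell (a, b); zero when the cell is a node (a == b)."""
    a, b = cell
    return form[b] - form[a]

# A hypothetical cell function from the nodes 0..4 of [0, 4] to the nodes 0..2 of [0, 2].
p = {0: 0, 1: 1, 2: 1, 3: 2, 4: 2}

def p_cell(cell):
    """Value of p on a cell, reconstructed from its end-points."""
    u, v = cell
    return (p[u], p[v])

g = {0: 5, 1: 7, 2: 4}                       # a hypothetical 0-form on [0, 2]
composition = {t: g[p[t]] for t in p}        # the 0-form "g after p" on [0, 4]

for s in [(0, 1), (1, 2), (2, 3), (3, 4)]:   # the edges of [0, 4]
    print(s, delta(composition, s), delta(g, p_cell(s)))
```

On the edges that $p$ collapses to a node, both sides are zero, in line with the convention that $\Delta$ is zero over any node.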

## 9 Differential forms

They are the continuous analogs of discrete forms.

Question: Is the derivative $\frac{dy}{dx}$ a fraction?

The answer that followed the definition was an emphatic No!

A more advanced answer we give here is: Yes, here's why.

Suppose we have a function $y=f(x)$ and we are to study its behavior around a point $x=a$. The derivative at $a$ is
$$\frac{dy}{dx}\bigg|_{x=a} = \text{ the slope of the tangent line through } (a,f(a)) = \frac{\text{rise}}{\text{run}}.$$
This *is* a fraction after all!

**Example.** Specifically, suppose $f(x)=x^2+2x$. At $a=0$, we have $f(0)=0$, so our interest is the point $(0,0)$. Then,
$$ \frac{dy}{dx}\bigg|_{x=0} = 2x+2\bigg|_{x=0}=2. $$
If this is a fraction, what would be the meaning of this:
$$dy = 2\cdot dx? $$

It is the equation of the tangent line written with respect to $dy$ and $dx$. $\square$

Thus, the equation
$$dy=f'(a)\cdot dx$$
refers to a specific location, $x=a$ and $y=f(a)$, on the $xy$-plane and it is a relation between the two *new variables* as the old ones have been specified.

Can we see $dx$, $dy$ on the graph?

Thus, we have:

- $dx$ is the run of the tangent line, and
- $dy$ is the rise of the tangent line.

They are called the *differentials* of $x$ and $y$ respectively.

Keep in mind that here $dx$ is just a certain variable related to $x$ (to emphasize this point, the formula can be re-written as $Y = f'(a)\cdot X$). The algebra may come from the example above:

- $y$ depends on $x$ via $y=f(x)$, and
- $dy$ depends on $x$ and $dx$ via $dy=f'(x)dx$.

**Example (linearization).** Given a function $f(x)=x^2$, find its best linear approximation at $a=1$. Since $f'(x)=2x$, we see that $f'(a) = f'(1) = 2$ and, therefore, the best linear approximation of $f$ at $a=1$ is
$$T(x)= f(a) + f'(a)(x-a)= 1 + 2(x-1).$$
Now we interpret $x-a$ as $dx$. Then, if we ignore the constant part, we can write $dy = 2dx$. The equation expresses our derivative in terms of these new variables, the differentials. We capture the relation between the increment of $x$ and that of $y$ -- *close to* $a$. Indeed, $y$ grows two times as fast as $x$. We acquire this information by introducing a new coordinate system $(dx,dy)$. In this coordinate system, the best linear approximation (given by the tangent line) becomes simply a linear function. $\square$
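How good is the approximation $dy=2\,dx$ close to $a=1$? The Python sketch below compares the actual increment of $y=x^2$ with the differential for a few steps $dx$; the step values are arbitrary assumptions.

```python
def f(x):
    return x ** 2

def dy(a, dx):
    """The differential dy = f'(a) * dx, with f'(x) = 2x."""
    return 2 * a * dx

a = 1
for dx in [0.5, 0.1, 0.01]:
    actual = f(a + dx) - f(a)                 # the true increment of y
    print(dx, actual, dy(a, dx), actual - dy(a, dx))
```

The gap in the last column is $dx^2$ (up to rounding), which shrinks much faster than $dx$ itself.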

**Definition.** A *differential form of degree* $1$, or simply a $1$-form, is defined as a function of two variables:
$$\varphi=\varphi (x,dx)=g(x) \cdot dx,$$
where $y=g(x)$ is a function of $x$. The form $\varphi$ is linear with respect to its second variable, $dx$.

Warning: the symbol “$\cdot$” stands for multiplication and it is often omitted.

The point of the new concept is to make a careful distinction between the location, $x$, and the direction, $dx$.

Let's plot this function:

- first, we plot the graph of $g$ (green), which is the restriction of our function $\varphi$ to a fixed value of $dx$, namely $dx=1$;
- then, we observe that $\varphi$ is $0$ when $dx=0$ and plot those points on the $x$-axis (blue);
- finally, we connect these dots to the curve with *straight lines* (purple).

The result is this surface:

Note that the sum and the difference of two $1$-forms are also $1$-forms, but their product and quotient are not.
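To make the last remark concrete, here is a small sketch in Python that treats a $1$-form literally as a function of two variables, $(x,dx)$. The names `one_form` and `add_forms` and the sample functions are our own assumptions, not notation from the text. The sum remains linear in $dx$, while the product is quadratic in $dx$ and, therefore, is not a $1$-form.

```python
def one_form(g):
    """Build the 1-form phi(x, dx) = g(x) * dx from a function g."""
    return lambda x, dx: g(x) * dx

def add_forms(phi, psi):
    """Pointwise sum of two 1-forms; again of the form (g1 + g2)(x) * dx."""
    return lambda x, dx: phi(x, dx) + psi(x, dx)

phi = one_form(lambda x: x**2)
psi = one_form(lambda x: 3 * x)
total = add_forms(phi, psi)
product = lambda x, dx: phi(x, dx) * psi(x, dx)   # not a 1-form

x, dx = 2.0, 0.5
print(total(x, dx), total(x, 2 * dx))       # doubling dx doubles the value
print(product(x, dx), product(x, 2 * dx))   # doubling dx quadruples the value
```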

Integration of forms is understood in the same sense as before.

**Definition.** The *integral of a $1$-form* $\varphi=g\, dx$ over an interval $[a,b]$ is defined to be
$$\int_a^b\varphi=\int_a^bg(x)\, dx.$$
Then the form $g\, dx$ is *integrable* whenever $g$ is integrable.
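The definition can be illustrated with a Riemann sum in which the form itself is evaluated at each location $x$ with the step $dx$. This is only a sketch: the function name `integrate_form`, the midpoint sampling, and the number of intervals are our choices.

```python
def integrate_form(phi, a, b, n=1000):
    """Approximate the integral of the 1-form phi(x, dx) over [a, b]
    by a midpoint Riemann sum with n equal steps."""
    dx = (b - a) / n
    return sum(phi(a + (i + 0.5) * dx, dx) for i in range(n))

phi = lambda x, dx: x**2 * dx         # the form x^2 dx
print(integrate_form(phi, 0.0, 1.0))  # close to 1/3
```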

**Definition.** A *differential form of degree* $0$, or simply a $0$-form, is any function $y=f(x)$ of $x$. Its *exterior derivative* $df$ is defined to be the $1$-form given by:
$$df=f'(x)\, dx.$$

When the name of the function that relates $x$ and $y$ is not provided, the following notation is also common: $$dy=f'(x)\, dx.$$

Thus, the exterior derivative of a function contains all information about its derivative and vice versa. However, the former provides a direct answer to this question:

- if we are at $x=a$ and make a step $dx$, what is the step $dy$ of $y$?

**Example.** Suppose $x$ is time and $y=f(x)$ is the location at time $x$. Let's re-state the above question:

- if at time $x=a$ we are at $y=f(a)$ and then we move for a short time $dx$ more, how far will we go?

The distance is velocity multiplied by time:
$$\text{displacement }=f'(a)\cdot dx,$$
but only when the velocity, $f'$, is constant! In the general case, that is just an *estimate*. $\square$

**Example.** We have used this algebra for *integration by substitution*. For the integral
$$\int x\sin x^2\, dx,$$
we introduce a new variable
$$u=x^2$$
and then compute the exterior derivative of this function:
$$du=2x\, dx.$$
Our definition of differential form treats both the integrand above and the last expression as simple cases of *multiplication* of two numbers. That is why we are at liberty to algebraically manipulate these expressions the way we have.
$\square$
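As a sanity check of this manipulation, we can compare the two integrals numerically. The sketch below assumes specific limits, $0$ and $b=1.5$, which are not part of the example above, and uses a midpoint rule of our own. Replacing $x\, dx$ with $\tfrac{1}{2}du$ turns the integral over $[0,b]$ into $\tfrac{1}{2}\int_0^{b^2}\sin u\, du$.

```python
import math

def midpoint(g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

b = 1.5                                            # assumed upper limit
lhs = midpoint(lambda x: x * math.sin(x**2), 0.0, b)       # in terms of x
rhs = midpoint(lambda u: 0.5 * math.sin(u), 0.0, b**2)     # in terms of u = x^2
exact = 0.5 * (1 - math.cos(b**2))                 # from the antiderivative of sin
print(lhs, rhs, exact)
```

The three numbers agree up to the error of the numerical rule.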

In Chapter 14, the main application of differential forms is via their discrete counterpart discussed in the next section. We will also see applications of differential forms in the multidimensional case in the following chapters.

The following is a simple re-statement with our new notation of a familiar theorem.

**Theorem (Fundamental Theorem of Calculus).** Suppose $\varphi$ is a $1$-form integrable on interval $[a,b]$. Then,
$$\int_a^b\varphi = F(b)-F(a),$$
for any $0$-form $F$ that satisfies:
$$dF=\varphi.$$

In order to study a real-valued function $y=f(x)$, we now keep track of *two kinds of variables*:

- the locations, $x$ vs. $y$, and
- the directions, $dx$ vs. $dy$,

as follows: $$\begin{array}{|c|} \hline \\ \quad (x,dx)\mapsto (y,dy)=(f(x),f'(x)dx). \quad \\ \\ \hline \end{array}$$

A function can be sampled at

- the nodes, producing a discrete $0$-form, or at
- the secondary nodes, producing a discrete $1$-form.

Alternatively, we express this idea as sampling of the corresponding differential forms.

**Theorem.**

- A differential $0$-form sampled at the nodes of the partition is a discrete $0$-form.
- A differential $1$-form sampled at the intervals of the partition is a discrete $1$-form; i.e., if $\varphi$ is a differential $1$-form, the corresponding discrete $1$-form is defined by:

$$s\Big([A,B] \Big)=\int_A^B \varphi.$$
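The following sketch in Python illustrates the theorem for one particular choice, which is ours: the $0$-form $F(x)=x^2$, its exterior derivative $\varphi=2x\, dx$, and a partition of $[0,2]$ into four intervals. The $0$-form is sampled at the nodes and the $1$-form is integrated over each interval; the integrals are computed with a midpoint rule, an assumption made for simplicity.

```python
def integrate(g, A, B, n=100):
    """Midpoint-rule approximation of the integral of g(x) dx over [A, B]."""
    h = (B - A) / n
    return sum(g(A + (i + 0.5) * h) for i in range(n)) * h

F = lambda x: x**2          # a 0-form
g = lambda x: 2 * x         # dF = g(x) dx, a 1-form
nodes = [0.0, 0.5, 1.0, 1.5, 2.0]

# discrete 0-form: one value per node
zero_form = [F(x) for x in nodes]
# discrete 1-form: one value per interval [A, B] of the partition
one_form = [integrate(g, A, B) for A, B in zip(nodes, nodes[1:])]

print(zero_form)
print(one_form)
print(sum(one_form), F(nodes[-1]) - F(nodes[0]))
```

The values of the discrete $1$-form add up to $F(2)-F(0)$, in agreement with the Fundamental Theorem of Calculus.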

**Example (motion).** To follow the idea from the last section, the exterior derivative provides a direct answer to this question:

- Suppose $x$ is time and $y=f(x)$ is the location at time $x$. If at time $x=a$ we are at $y=f(a)$ and then we move for a short time $dx$ more, how far will we go?

The distance is velocity multiplied by time: $$\text{displacement }=f'(a)\cdot dx,$$ and this time the velocity, $f'$, is assumed to be constant throughout the interval. $\square$