This site is devoted to mathematics and its applications. Created and run by Peter Saveliev.

Matrices as functions

From Mathematics Is A Science


1 Vector functions

Recall from the Calculus 3 course...

Consider $u=f(x,y)=2x-3y$, a function of two variables. It is also a scalar function, because the output $u$ is a number: $u \in {\bf R}$.

Notation for such a function is:

  • $f \colon {\bf R}^2 \rightarrow {\bf R}$ , meaning
  • $f \colon (x,y) \rightarrow u$.

Another: $v=g(x,y)=x+5y$ is also a scalar function of two variables:

  • $g \colon {\bf R}^2 \rightarrow {\bf R}$ , meaning
  • $g \colon (x,y) \rightarrow v$.

Let's combine them, so that we have one function

  • $F \colon {\bf R}^2 \rightarrow {\bf R}^2$, meaning
  • $F \colon (x,y) \rightarrow (u,v)$.

Note: these are vector spaces, ${\bf R}^2$. We just combined $u$ and $v$ into one vector $(u,v)$.

This gives the formula for $F$: $$F(x,y) = (u,v) = (2x-3y,x+5y).$$ Here $(x,y)$ is the input and $(u,v)$ is the output. Both are vectors!

That's what's behind the formula:

$$\stackrel{ {\rm vector} } {\rightarrow} F \stackrel{ {\rm vector} }{\rightarrow},$$

It's a vector function!

But the formula $F(x,y)=(2x-3y,x+5y)$ is written in terms of the components $x$ and $y$, not the vector $(x,y)$ itself.

This is better:

$$F(x,y) = \left[ \begin{array}{} 2 & -3 \\ 1 & 5 \end{array} \right] \left[ \begin{array}{} x \\ y \end{array} \right] = AX.$$

It turns out this is a matrix product! The matrix $A$ is called a matrix representation of the function $F$.
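This can be verified numerically; a minimal sketch in numpy (the test vector is arbitrary, chosen for illustration):

```python
import numpy as np

# The function F(x, y) = (2x - 3y, x + 5y), written component-wise.
def F(x, y):
    return np.array([2 * x - 3 * y, x + 5 * y])

# Its matrix representation: the rows hold the coefficients of x and y.
A = np.array([[2, -3],
              [1,  5]])

X = np.array([4, 7])                 # an arbitrary input vector (x, y)
assert np.array_equal(F(*X), A @ X)  # the two computations agree
```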

This works out, but note that $f$ and $g$ happen to be linear.

Does it work when the function isn't linear?

What if $F(x,y)=(e^x,\frac{1}{y})$? Try.
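A quick numerical check, sketched in numpy with arbitrary test vectors, shows why this $F$ has no matrix representation: it fails the additivity that a matrix product always satisfies.

```python
import numpy as np

# The nonlinear candidate F(x, y) = (e^x, 1/y).
def F(v):
    x, y = v
    return np.array([np.exp(x), 1 / y])

u = np.array([1.0, 2.0])   # arbitrary test vectors
w = np.array([0.5, 1.0])

# A function given by a matrix must satisfy F(u + w) = F(u) + F(w);
# this one does not.
assert not np.allclose(F(u + w), F(u) + F(w))
```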

But if we do have a matrix, we can always understand it as a function. For example:

$$\left[ \begin{array}{} a & b \\ c & d \end{array} \right] \left[ \begin{array}{} x \\ y \end{array} \right] = \left[ \begin{array}{} ax+by \\ cx+dy \end{array} \right]=(ax+by,cx+dy),$$

for some $a,b,c,d$ fixed.

If a function can be represented this way, it is called linear (here, in dimension $2$).

What about dimension $1$?

Very familiar: $f(x)=ax$ is a linear function.

Note: In calculus, $ax+b$ is called "linear"; here it no longer is, and "affine" is the more precise term.

2 Matrix as a linear function

Compare $A$ and $AX$.

While $AX$ is a formula of the function ($X$ is the variable),

$$A(X)=AX=\left[ \begin{array}{} a & b \\ c & d \end{array} \right] \left[ \begin{array}{} x \\ y \end{array} \right] = \left[ \begin{array}{} ax+by \\ cx+dy \end{array} \right]=(ax+by,cx+dy).$$ $A$ is just a table. It's the table, however, that contains all the information about the function.

So one can think of $A$ as an abbreviation of $AX$.

Example: Consider examples in dimension $2$: $$A \colon {\bf R}^2 \rightarrow {\bf R}^2$$

Note: we use the same letter for the function and the matrix.

Then $A$ is a transformation of the plane. It takes one point ($=$ vector) $(x,y)$ to another point ($=$ vector).


Points are vectors and vice versa.


Examples of linear functions in ${\bf R}^2$:

What about "abstract" vector spaces?

Consider $A \colon {\bf R}^2 \rightarrow {\bf R}^2$, as vector spaces. Elements in the first ${\bf R}^2$ are considered to be $\left[ \begin{array}{} x \\ y \end{array} \right]$ and in the second, $\left[ \begin{array}{} u \\ v \end{array} \right]$.

Now, consider the algebraic operations of these vector spaces. What happens to them under $A$?

We know:

$$A\left[ \begin{array}{} x \\ y \end{array} \right] = \left[ \begin{array}{} u \\ v \end{array} \right], A\left[ \begin{array}{} x' \\ y' \end{array} \right] = \left[ \begin{array}{} u' \\ v' \end{array} \right],$$

now what is

$A \left( \left[ \begin{array}{} x \\ y \end{array} \right] + \left[ \begin{array}{} x' \\ y' \end{array} \right] \right) = ?$

Let's recall some operations that behave well under addition of functions, from the Calculus 1 course:


$$\begin{array}{} (f+g)' &= f'+g' & ({\rm sum \hspace{3pt} rule}) \\ (cf)' &= cf' & ({\rm constant \hspace{3pt} multiple \hspace{3pt} rule}) \end{array}$$

These are, in fact, the two operations of a vector space.

So, differentiation respects

  • addition,
  • scalar multiplication.

They are the algebraic operations of $C^1({\bf R})$.

These two properties, taken together, are called the linearity of differentiation.

This example suggests what "should" happen to a vector: $$A \left( \left[ \begin{array}{} x \\ y \end{array} \right] + \left[ \begin{array}{} x' \\ y' \end{array} \right] \right) = A \left[ \begin{array}{} x \\ y \end{array} \right] + A\left[ \begin{array}{} x' \\ y' \end{array} \right]$$

Let's verify:

$$\begin{array}{} LHS &= \left[ \begin{array}{} a & b \\ c & d \end{array} \right] \left( \left[ \begin{array}{} x \\ y \end{array} \right] + \left[ \begin{array}{} x' \\ y' \end{array} \right] \right) \\ &= \left[ \begin{array}{} a & b \\ c & d \end{array} \right] \left[ \begin{array}{} x + x' \\ y + y' \end{array} \right] \\ &= \left[ \begin{array}{} a(x+x')+b(y+y') \\ c(x+x')+d(y+y') \end{array} \right] \in {\bf R}^2 \end{array}$$

$$\begin{array}{} RHS &= \left[ \begin{array}{} a & b \\ c & d \end{array} \right] \left[ \begin{array}{} x \\ y \end{array} \right] + \left[ \begin{array}{} a & b \\ c & d \end{array} \right] \left[ \begin{array}{} x' \\ y' \end{array} \right] \\ &= \left[ \begin{array}{} ax+by \\ cx+dy \end{array} \right] + \left[ \begin{array}{} ax'+by' \\ cx'+dy' \end{array} \right] \\ &=\left[ \begin{array}{} ax+by+ax'+by' \\ cx + dy + cx' + dy' \end{array} \right] \end{array}$$

They are the same, after factoring.

The same property holds in all dimensions.

Next, matrix multiplication also respects scalar multiples:

$$A \left( r\left[ \begin{array}{} x \\ y \end{array} \right] \right) = rA \left[ \begin{array}{} x \\ y \end{array} \right]$$

It's easy to verify for these spaces.
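Both properties can also be spot-checked numerically; a minimal sketch in numpy, with an arbitrary matrix, vectors, and scalar:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))                # an arbitrary 2x2 matrix
X, Y = rng.standard_normal(2), rng.standard_normal(2)
r = 3.7                                        # an arbitrary scalar

# Additivity: A(X + Y) = AX + AY
assert np.allclose(A @ (X + Y), A @ X + A @ Y)
# Scalar multiples: A(rX) = r(AX)
assert np.allclose(A @ (r * X), r * (A @ X))
```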

Now, here is the general case.

Definition: A function on vector spaces that satisfies these properties is called linear: $$\begin{array}{} A(rv) &= r(Av), \\ A(u+v) &= A(u)+A(v). \end{array}$$

Once again, this is supposed to remind you of:

$$\begin{array}{} (f+g)' &= f'+g', \\ (cf)' &= cf', \end{array}$$


$$\begin{array}{} \frac{d}{dx}(f+g) &= \frac{d}{dx}f + \frac{d}{dx} g ,\\ \frac{d}{dx}(cf) &= c \frac{d}{dx}f. \end{array}$$


  • 1. If $A \in {\bf M}(n,m)$, then $A$ is a function, $A \colon {\bf R}^m \rightarrow {\bf R}^n$.


  • 2. $\frac{d}{dx} \colon C^1({\bf R}) \rightarrow ?$, (from the differentiable functions to what?)

Also, we can narrow this down to polynomials: $$\begin{array}{} \frac{d}{dx} \colon & {\bf P} \rightarrow {\bf P} \\ & {\bf P}^n \rightarrow {\bf P}^n \\ & {\bf P}^n \rightarrow {\bf P}^{n-1} \end{array}$$

As in: $\frac{d}{dx}(x^2+x)=2x+1$

  • 3. $\displaystyle\int \colon C({\bf R}) \rightarrow C({\bf R})$

This makes sense because of the Sum Rule and the Constant Multiple Rule, considering the fact that antiderivatives are continuous.

Problem: $f \rightarrow \displaystyle\int f + C$ gives many answers.

It's an "ill-defined" function. How do we fix this?

Let $V = \{f \in C({\bf R}) \colon f(0)=0 \}$. Alternatively, use equivalence relations ($f'=g'$).

Exercise: Turn definite integration into a linear function.

Homework: Let $V$ be the set of convergent sequences: $V = \{ (x_1,\ldots,x_n,\ldots) \colon x_n \in {\bf R}, x_n \rightarrow a \}$.

  • 1. Define operations on $V$ that make it a vector space. Prove the axioms.
  • 2. Let $S$ be the set of all sequences of $0$'s with one $1$:

$$S = \{(1,0,0,\ldots), (0,1,0,\ldots),(0,0,1,0,\ldots), \ldots \}.$$ Is $S$ linearly independent, spanning, a basis of $V$?

  • 3. What is the dimension of $V$?

3 Linear functions on vector spaces, linear operators


Definition: Given two vector spaces $V$ and $U$, a function $A \colon V \rightarrow U$ is called a linear operator if it "respects the operations" of these spaces:

$$\begin{array}{} A(x+y) &= A(x) + A(y) \\ A(rx) &= rA(x) \end{array}$$

where $x,y \in V$ and $r \in {\bf R}$.

Note: on the left $+, \cdot$ are in $V$, on the right, $+, \cdot$ are in $U$.


Property 1: Linear operators take zero to zero: $A(0)=0$ (note: or $A(0_V)=0_U$, two different zeros!)

Proof: $A(0)=A(r \cdot 0) = rA(0)$ for any $r$. Now take $r=0$: then $rA(0)=0$, so $A(0)=0$. $\blacksquare$

Property 2: Linear operators take inverse to inverse: $A(-x)=-A(x)$ for all $x \in V$.

Proof: Same trick:

$$A(-x)=A((-1)x) = (-1)A(x) = -A(x)$$

Property 3: Linear operators take linear combinations to linear combinations with the same coefficients.

$$A(r_1x_1+r_2x_2+\ldots+r_nx_n) = r_1A(x_1)+r_2A(x_2)+\ldots+r_nA(x_n).$$

Proof: Use the two properties in the definition repeatedly ($n$ times each); formally, use induction on $n$. $\blacksquare$
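Property 3 can be illustrated numerically; a minimal sketch in numpy, with an arbitrary matrix, vectors, and coefficients:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])           # an arbitrary linear operator on R^2
xs = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([2.0, -1.0])]
rs = [0.5, -2.0, 3.0]                # arbitrary coefficients

# A linear combination of inputs...
combo = sum(r * x for r, x in zip(rs, xs))
# ...maps to the same linear combination of the outputs.
assert np.allclose(A @ combo, sum(r * (A @ x) for r, x in zip(rs, xs)))
```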

Example: What if $U=V$? Then $A \colon V \rightarrow V$ is a linear operator, a "self-function". In particular, if we choose $A(x)=x$ for all $x$, it is called the identity operator. Prove it's linear.

Notation: $Id$, $Id_V$, ${\bf 1}_V$.

Example: For any pair of spaces, we can define a linear operator $A \colon V \rightarrow U$ by $A(x)=0$ for every $x \in V$. This is the zero operator.

Theorem (Uniqueness): Suppose $S = \{v_1,\ldots,v_n\} \subset V$, vector space, $A, B \colon V \rightarrow U$ are linear operators. Suppose $S$ is a basis of $V$ and suppose the values of $A,B$ coincide on $S$: $$A(v_i) = B(v_i)$$ for all $i=1,\ldots,n$. Then $$A=B.$$

Note (illustration in dimension $1$): given $f(x)=ax$ and $g(x)=bx$, compare $f$ and $g$ at $x=1$: if $f(1)=g(1)$, then $a=b$, so $f=g$.

Conclusion: we can't drop the condition ${\rm span \hspace{3pt}} S = V$ from the theorem. We can, however, drop the linear independence of $S$ (exercise).

Proof: Recall that $A=B$ means $A(x)=B(x)$ for all $x \in V$. Pick an arbitrary $x$ and prove this equality.

Since $S$ spans $V$, there are $a_1,\ldots,a_n \in {\bf R}$ such that $$x = a_1v_1+\ldots+a_nv_n.$$ Then $$\begin{array}{} A(x) &= A(a_1v_1+\ldots+a_nv_n) \\ &= a_1A(v_1) + \ldots + a_nA(v_n) \\ &= a_1B(v_1) + \ldots + a_nB(v_n) \\ &= B(a_1v_1+\ldots+a_nv_n) \\ &= B(x). \end{array}$$

Above, we use Property 3 (linearity), then the theorem's assumption, and then Property 3 again. $\blacksquare$
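The idea of the proof, that a linear operator is fully determined by its values on a basis, can be illustrated in numpy: record only the values of $A$ on the standard basis of ${\bf R}^2$ and rebuild the operator from them (the matrix itself is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 2))       # a "secret" linear operator

# Record only its values on the standard basis...
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
columns = [A @ e1, A @ e2]

# ...and rebuild the operator from those values alone, as columns.
B = np.column_stack(columns)
assert np.allclose(A, B)              # A = B, as the theorem promises
```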

4 Examples of linear operators

Fact: Matrices are linear operators on ${\bf R}^n$.

Once a basis is given, that is; otherwise the standard basis is assumed.

Rotation through $90^{\circ}$


$$\begin{array}{} R \colon & (1,0) \rightarrow (0,1) \\ & (0,1) \rightarrow (-1,0) \end{array}$$

Turns out, we only have to look at the basis!

Claim: $R$ can be represented by a matrix made of these vectors:

$R = \left[ \begin{array}{} 0 & -1 \\ 1 & 0 \end{array} \right]$, i.e. we combine these vectors as columns.

Verify: $R(1,1) = \left[ \begin{array}{} 0 & -1 \\ 1 & 0 \end{array} \right] \left[ \begin{array}{} 1 \\ 1 \end{array} \right] = \left[ \begin{array}{} -1 \\ 1 \end{array} \right]$.
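The same check can be scripted; a sketch in numpy confirming the rotated vector, together with the two facts a $90^{\circ}$ rotation must satisfy:

```python
import numpy as np

R = np.array([[0, -1],
              [1,  0]])               # rotation through 90 degrees

v = np.array([1, 1])
w = R @ v                             # the rotated vector

assert np.array_equal(w, np.array([-1, 1]))
assert np.dot(v, w) == 0              # perpendicular: the angle is 90 degrees
assert np.dot(w, w) == np.dot(v, v)   # lengths are preserved
```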


Projection $P$ on the $x$-axis: $$(x,y) \rightarrow (x,0).$$


$$\left[ \begin{array}{} ? & ? \\ ? & ? \end{array} \right] \left[ \begin{array}{} x \\ y \end{array} \right] = \left[ \begin{array}{} x \\ 0 \end{array} \right]$$

What do we put here so that it works? We can guess:

$$\left[ \begin{array}{} 1 & 0 \\ 0 & 0 \end{array} \right] \left[ \begin{array}{} x \\ y \end{array} \right] = \left[ \begin{array}{} x \\ 0 \end{array} \right]$$

Differentiation on ${\bf P}^n$, ${\rm dim \hspace{3pt}} {\bf P}^n = n+1$:

$$\frac{d}{dx}(a_nx^n+\ldots+a_1x+a_0) = na_nx^{n-1}+(n-1)a_{n-1}x^{n-2}+\ldots+a_1$$

$D \colon {\bf P}^n \rightarrow {\bf P}^n$

$$D \colon (a_0,a_1,\ldots,a_n) \rightarrow (a_1,2a_2,\ldots,(n-1)a_{n-1},na_n,0)$$

$$\left[ \begin{array}{} 0 & 1 & 0 & 0 & \ldots & 0 \\ 0 & 0 & 2 & 0 & \ldots & 0 \\ \vdots \\ 0 & 0 & 0 & 0 & \ldots & n \\ 0 & 0 & 0 & 0 & \ldots & 0 \end{array} \right] \left[ \begin{array}{} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_{n-1} \\ a_n \end{array} \right] = \left[ \begin{array}{} a_1 \\ 2a_2 \\ \vdots \\ na_n \\ 0 \end{array} \right]$$

All zeros except right above the diagonal.

If $D=\{d_{ij}\}$, write a formula for $d_{ij}$. Exercise.
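The matrix $D$ can be built and tested in numpy; a sketch for $n=3$, checked against the example $\frac{d}{dx}(x^2+x)=2x+1$:

```python
import numpy as np

n = 3  # work in P^3, coefficient vectors (a_0, a_1, a_2, a_3)

# The (n+1)x(n+1) differentiation matrix: 1, 2, ..., n on the superdiagonal.
D = np.diag(np.arange(1, n + 1), k=1).astype(float)

# x^2 + x has coefficient vector (0, 1, 1, 0); its derivative 2x + 1
# has coefficient vector (1, 2, 0, 0).
p = np.array([0.0, 1.0, 1.0, 0.0])
assert np.array_equal(D @ p, np.array([1.0, 2.0, 0.0, 0.0]))
```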

The derivative $f'(a)$ at a fixed point $a$ gives a linear operator associated with $f \colon {\bf R} \rightarrow {\bf R}$: say $f'(a)=3$, or $\frac{dy}{dx}=3$, or $dy=3dx$ (the differential form).

Think of $dx$ and $dy$ as variables; then $f'(a)$ acts as multiplication by $3$.

So $dy = f'(a) \hspace{3pt} dx$ defines a linear function ${\bf R} \rightarrow {\bf R}$. The same applies to ${\rm grad} \hspace{3pt} f(a)$ for $f \colon {\bf R}^n \rightarrow {\bf R}$, and to other generalizations of the derivative.

What is this operator: $A \colon V \rightarrow V$ defined, for a given fixed $r \in {\bf R}$, by $A(x)=rx$?

What is it in ${\bf R}^2$? With $r=2$:

$$\left[ \begin{array}{} x \\ y \end{array} \right] \stackrel{A}{\rightarrow} \left[ \begin{array}{} 2x \\ 2y \end{array} \right] = 2\left[ \begin{array}{} x \\ y \end{array} \right]$$


Radial stretch $2X$:



Its matrix: $\left[ \begin{array}{} 2 & 0 \\ 0 & 2 \end{array} \right]$.


What about $\left[ \begin{array}{} 2 & 0 \\ 0 & 3 \end{array} \right]$? It's a stretch too, but non-uniform: by $2$ in the $x$-direction and by $3$ in the $y$-direction.