This site is devoted to mathematics and its applications. Created and run by Peter Saveliev.

Diagonalization of matrices

From Mathematics Is A Science

Jump to: navigation, search

1 Example

Examples: $A = \left[ \begin{array}{} 3 & 0 \\ 0 & 2 \end{array} \right]$.

This is diagonal already. In fact, $3$ and $2$ are the eigenvalues.

That's the final output of diagonalization: two eigenvalues are lined up on the diagonal.

So, if $A$ has complex eigenvalues, it's not diagonalizable: $\left[ \begin{array}{} 0 & 1 \\ -1 & 0 \end{array} \right]$ or $\lambda = \pm i$.

Example: $A = \left[ \begin{array}{} 0 & 1 \\ 1 & 0 \end{array} \right]$. Then the characteristic polynomial is

$$\chi_A(\lambda) = \det \left[ \begin{array}{} \lambda & 1 \\ 1 & \lambda \end{array} \right] = \lambda^2-1.$$

Its roots are $\pm 1$. So, $A \sim D = \left[ \begin{array}{} 1 & 0 \\ 0 & -1 \end{array} \right]$?

By hand: find a basis so that $D = P^{-1}AP$ is diagonal.

How?

What does $A$ do? Consider:

DiagonalizeMatrices.png

$$Ae_1 = \left[ \begin{array}{} 0 & 1 \\ 1 & 0 \end{array} \right] \left[ \begin{array}{} 1 \\ 0 \end{array} \right] = \left[ \begin{array}{} 0 \\ 1 \end{array} \right] = e_2$$

$$Ae_2 = \left[ \begin{array}{} 0 & 1 \\ 1 & 0 \end{array} \right] \left[ \begin{array}{} 0 \\ 1 \end{array} \right] = \left[ \begin{array}{} 1 \\ 0 \end{array} \right] = e_1.$$

So, $A$ reflects the plane about the diagonal $x=y$.

Then, what are the eigenvectors? Not $e_1,e_2$, as they shouldn't change $(\pm)$ directions under $A$.

Take $v_1=\left[ \begin{array}{} 1 \\ 1 \end{array} \right]$. For this one, $A(v_1)=v_1$, and for its multiples too. So, $\lambda = 1$.

Need one more.

ReflectionOfPlaneOverDiagonal.png

Try $v_2 = \left[ \begin{array}{} -1 \\ 1 \end{array} \right]$. Then $A(v_2) = -v_2$, and $\lambda=-1$.

So, the new basis is $\{v_1,v_2\}$.

And, the change of basis matrix is

$$P = \left[ \begin{array}{} 1 & -1 \\ 1 & 1 \end{array} \right],$$

find $P^{-1}$.

Compute $P^{-1}AP$.

Easier: matrix $D$ of the operator $A$ with respect to $\{v_1,v_2\}$: $D = $ made of columns, each $= A(v_1), A(v_2)$.

So $D = \left[ \begin{array}{} 1 & 0 \\ 0 & -1 \end{array} \right]$.

Written in terms $\{v_1,v_2\}$: $A(v_1)=v_1=\left[ \begin{array}{} 1 \\ 0 \end{array} \right]$, $A(v_2)=-v_2=\left[ \begin{array}{} 0 \\ -1 \end{array} \right]$.

That's the diagonalization of $A$.

The example suggests...

2 A general approach to diagonalization

Outline: Given matrix $A$.

Suppose

  • $\lambda_1,\ldots,\lambda_n$ are the eigenvalues, all real.
  • $v_1,\ldots, v_n$ are eigenvectors,

so that $$A(v_i)=\lambda_iv_i.$$

If $\{v_1,\ldots,v_n\}$ is a basis then, in terms of this basis, $v_1 = \left[ \begin{array}{} 1 \\ 0 \\ \vdots \\ 0 \end{array} \right], v_2 = \left[ \begin{array}{} 0 \\ 1 \\ \vdots \\ 0 \end{array} \right],$ etc.

So, we can write $A$ in terms of basis $\{v_1,\ldots,v_n\}$:

First column is $\lambda_1v_1 = \left[ \begin{array}{} \lambda_1 \\ 0 \\ \vdots \\ 0 \end{array} \right]$, second $\lambda_2v_2 = \left[ \begin{array}{} 0 \\ \lambda_2 \\ 0 \\ \vdots \\ 0 \end{array} \right]$, etc.

So, $D = \left[ \begin{array}{} \lambda_1 & 0 & 0 & \ldots \\ 0 & \lambda_2 & 0 & \ldots \\ \vdots \\ 0 & 0 & \ldots & \lambda_3 \\ \end{array} \right]$.

Problem: Is $\{v_1,\ldots,v_n\}$ really a basis?

  • Case 1: All $\lambda_i$ are real and distinct.

Question: Suppose $\lambda_1 \neq \lambda_2$ are eigenvalues with eigenvectors $v_1,v_2$. Is it possible that $v_1, v_2$ are multiples?

Answer: No (otherwise plan doesn't work).

To prove, suppose $v_1=\mu v_2$ (*).

We also know this: $Av_1 = \lambda_1v_1$, $Av_1=\lambda_2v_2$.

  • (1) Apply $A$ to (*), then

$$Av_1 = A\mu v_2 = \mu Av_2.$$

Then $\lambda_1v_1 = \mu \lambda_2v_2$.

  • (2) Multiply (*) by $\lambda_2$, then $\lambda_2 v_1 = \lambda_2 \mu v_n$.
  • (3) Match (1) and (2).

Then $\lambda_1v_1 = \lambda_2v_1$ implies $\lambda_1=\lambda_2$, contradiction.


That's the main idea of the proof. (3 steps)

This was only about $2$ vectors, and one being a multiple of the other.

Now $n$ vectors $v_1,\ldots,v_n$. Then this is about being linearly independent!


Theorem:

  • Suppose $\lambda_1, \ldots, \lambda_m$ are eigenvalues of $A \colon {\bf R}^n \rightarrow {\bf R}^n$, $\lambda_1,\ldots,\lambda_m$ are real and distinct.
  • Suppose $v_1,\ldots,v_m$ are eigenvectors corresponding to $\lambda_1,\ldots,\lambda_m$ respectively.

Then

  • $v_1,\ldots,v_m$ are linearly independent.

Proof: Given $Av_i = \lambda_iv_i$, $i=1,\ldots,n$.

Proof by induction. Base $n=1$, done. Assume proven for $m-1$.

Suppose $r_1v_1 + \ldots + r_mv_m = 0$. (*)

The plan is to follow the three steps.

Do this to (*)

  1. apply $A_1$, and
  2. multiply by $\lambda_m$, then
  3. compare the results.
  • (1)

$$A(r_1v_1+\ldots+r_mv_m)=0$$ or $$r_1Av_1 + \ldots + r_mAv_m = 0$$ or $$r_1\lambda_1v_1 + \ldots + r_m\lambda_mv_m = 0.$$

  • (2)

$$\lambda_m r_1v_1 + \ldots + \lambda_m r_mv_m = 0.$$ The last terms in (1) and (2) match, so $$r_1\lambda_1v_1 + \ldots + r_{m-1}\lambda_{m-1}v_{m-1} = \lambda_mr_1v_1 + \ldots + \lambda_mr_{m-1}v_{m-1},$$ hence $$(\lambda_1-\lambda_m)r_1v_1 + (\lambda_2-\lambda_m)r_2v_2 + \ldots + (\lambda_{m-1} - \lambda_m)r_{m-1}v_{m-1} = 0.$$

Here we have $v_1,\ldots,v_{m-1}$, $m-1$ eigenvectors corresponding to different eigenvalues, so by the inductive assumption they are linearly independent.

Therefore, $$(\lambda_1 - \lambda_m)r_1=0, \ldots, (\lambda_{m-1}-\lambda_m)r_m=0.$$ Here $\lambda_i-\lambda_m \neq 0$, because they were chosen distinct.

Hence, $r_1=0, \ldots, r_{m-1}=0$.

Back to (*). What's left is $r_mv_m=0$.

Since $v_m \neq 0$, so $r_m = 0$. So, all $r_i=0$. Hence $v_1,\ldots,v_m$ are linearly independent. $\blacksquare$


Given $A \colon {\bf R}^n \rightarrow {\bf R}^n$, $\lambda_1,\ldots,\lambda_n$ are distinct real eigenvalues.

Then, by the theorem, any eigenvectors $v_1,\ldots,v_n$ chosen for $\lambda_1,\ldots,\lambda_n$ are linearly independent. Hence $\{v_1,\ldots,v_n\}$ is a basis.

Then $A$ expressed in this basis will be diagonal (with eigenvalues on the diagonal).

$$D = \left[ \begin{array}{} \lambda_1 & 0 & 0 & \ldots \\ 0 & \lambda_2 & 0 & \ldots \\ 0 & 0 & \ldots & \lambda_n \end{array} \right]$$

To summarize...

Corollary: If $A \colon {\bf R}^n \rightarrow {\bf R}^n$ has $n$ distinct real eigenvalues, then $A$ is diagonalizable: $D = P^{-1}AP$, here $D$ is diagonal.

3 Eigenvalues with multiplicities

it is of course possible that eigenvalues have multiplicities. Then the problem is that we don't have enough linearly independent eigenvectors: we need $n$ to form a basis.

We can still make it work...

Theorem: Suppose $A \colon {\bf R}^n \rightarrow {\bf R}^n$ is given and $\lambda_1,\ldots,\lambda_n$ are distinct real eigenvalues of $A$. Suppose $\{v_{i_1},\ldots,v_{i_{p_i}}\}$ are linearly independent eigenvectors of $\lambda_i$, for each $i=1,\ldots,m$. Then the combined set of these eigenvectors $$\{v_{11},\ldots,v_{1p_1},v_{21},\ldots,v_{2p_2},\ldots,v_{m1},\ldots,v_{mp_m}\}$$ is linearly independent.

Proof: Start with definition of linear dependence. Suppose

$$(r_{11}v_{11}+\ldots+r_{1p_1}v_{1p_1})+ \ldots + (r_{m1}v_{m1}+\ldots+r_{mp_m}v_{mp_m}=0.$$

Let's name these. We call the first term (in parentheses) $v_1$, the second $v_2$, and so forth until the last one is called $v_m$.

Then $v_1 + v_2 + \ldots + v_m=0$, a linear combination of $m$ vectors, one from each eigenspace:

$$v_1 \in E_A(\lambda_i), i=1,\ldots,m.$$

Therefore, $v_1, i=1,\ldots,m$ cannot be eigenvectors because eigenvectors are linearly independent, by the theorem.

So $v_i=0$ for each $i$!

In other words,

$$r_{11}v_{11}+\ldots+r_{1p_1}v_{1p_1}=0,\ldots,r_{m1}v_{m1}+\ldots+r_{mp_m}v_{mp_m}=0.$$

Observe that in each of these equations, the vectors are linearly independent; by assumption. Hence

$$r_{11}=\ldots=r_{1p_1}=0, \ldots, r_{m1}=\ldots=r_{mp_m}=0.$$

Therefore, $v_{11},\ldots,v_{mp_m}$ are linearly independent. $\blacksquare$


4 How to diagonalize

Plan: take a basis of each $E_A(\lambda_i)$. Suppose $\lambda_i$ are the real (no others) eigenvalues of $A$. Put them together, they are linearly independent by the theorem.

This is a basis if

$${\rm dim \hspace{3pt}}E_A(\lambda_1)+\ldots+{\rm dim \hspace{3pt}}E_A(\lambda_m)=n.$$

In that case we can diagonalize $A$.


Theorem: If the sum of the dimensions of the eigenspaces of real eigenvalues of $A$ add up to $n$, then $A$ is diagonalizable.

Example: Let $A = \left[ \begin{array}{} 2 & 1 \\ 0 & 2 \end{array} \right]$, then

$$\chi_A(\lambda) = {\rm det}\left[ \begin{array}{} 2-\lambda & 1 \\ 0 & 2-\lambda \end{array} \right] = (2-\lambda)^2.$$

Then the single eigenvalue is $\lambda=2$. Find $v$'s such that $Av=2v$. Solve

$$\left[ \begin{array}{} 2 & 1 \\ 0 & 2 \end{array} \right] \left[ \begin{array}{} x \\ y \end{array} \right] = 2\left[ \begin{array}{} x \\ y \end{array} \right] \rightarrow \left\{ \begin{array}{} 2x+y &= 2x \\ 2y &= 2y \end{array} \right..$$

This implies $y=0$ and $x$ is any. So

$$E_A(2) = \left\{ \left[ \begin{array}{} x \\ 0 \end{array} \right], x \in {\bf R} \right\}.$$

So ${\rm dim \hspace{3pt}}E_A(2)=1$. (Try $v=\left[ \begin{array}{} 1 \\ 0 \end{array} \right]$. That's not a basis.)

Hence $A$ is not diagonalizable!

Further analysis...

Here $\left[ \begin{array}{} 2 & 1 \\ 0 & 2 \end{array} \right]$ is a Jordan block corresponding to $\lambda=2$.

In dimension three, $\left[ \begin{array}{} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{array} \right]$, etc.

$p \times p$: $\left[ \begin{array}{} \lambda & 1 & 0 & \ldots \\ \vdots \\ 0 & 0 & \ldots & \lambda \end{array} \right]$ is a Jordan block of dimension $p$.

Any matrix (with real eigenvalues) is similar to a Jordan standard form:

$n \times n$: JordanStandardForm.png

with Jordan blocks along the diagonal and $0$'s elsewhere.

Note: Jordan blocks of sizes = multiplicites of the eigenvalues.

Exercise:

  • 1.
    • Consider $\chi_A(A)$. What is it?
    • Prove $\chi_A(A)=0$. (Hamilton-Caley theorem)
  • 2.
    • Consider $V^{**}$. What is it?
    • Prove $V^{**} \simeq V$ (even for infinite dimensional spaces).