This site is devoted to mathematics and its applications. Created and run by Peter Saveliev.
PageRank as a solution of a matrix equation
From Intelligent Perception
Briefly, it is not: sometimes there is no solution to that equation.
Suppose $p_1,...,p_N$ are the $N$ pages/nodes in the graph (of the World Wide Web). Suppose also that $R(p_i)$ is the PageRank of node $p_i$.
PageRank is defined (and computed) iteratively, with an initial distribution of rankings usually given by: $$R_0(p_i) = \frac{1}{N},\quad i=1,2,...,N.$$ Then at time $t+1$, the rankings are given in terms of the rankings at time $t$: $$R_{t+1}(p_i) = d \sum_{p_j \in S(p_i)} \frac{R_t (p_j)}{L(p_j)} + \frac{1-d}{N},$$ where
- $S(p_i)$ is the set of nodes that (inbound) link to $p_i$,
- $L(p_j)$ is the number of (outbound) links from node $p_j$,
- $d$ is the damping factor, a constant with $0 < d < 1$ (commonly $d = 0.85$).
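As an illustration, here is the iteration carried out on a made-up three-page graph; the links (and the choice $d = 0.85$) are hypothetical, chosen only to keep the example small:

```python
# PageRank iteration on a hypothetical 3-page graph (pages 0, 1, 2).
d = 0.85                                  # damping factor
out_links = {0: [1, 2], 1: [2], 2: [0]}   # hypothetical outbound links
N = 3
L = {j: len(out_links[j]) for j in out_links}                           # outdegrees L(p_j)
S = {i: [j for j in out_links if i in out_links[j]] for i in range(N)}  # inbound sets S(p_i)

R = [1.0 / N] * N                         # R_0(p_i) = 1/N
for t in range(100):                      # R_{t+1}(p_i) = d * sum(R_t(p_j)/L(p_j)) + (1-d)/N
    R = [d * sum(R[j] / L[j] for j in S[i]) + (1 - d) / N for i in range(N)]

print([round(r, 4) for r in R])           # steady-state ranks; they sum to 1
```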
Let $$R_t=(R_t(p_1),...,R_t(p_N))$$ be the vector of ranks at time $t$. Then, in matrix notation: $$R_{t+1} = d MR_t + (1-d)R_0, $$ where the matrix $M$ is defined as $$M_{ij} = \begin{cases} 1 /L(p_j) & \mbox{if }p_j \in S(p_i),\ \\ 0 & \mbox{if }p_j \notin S(p_i). \end{cases}$$ In other words, $$M = (K^{-1} A)^T,$$ where $A$ is the adjacency matrix of the graph and $K$ is the diagonal matrix with the outdegrees $L(p_j)$ in the diagonal.
Now, the iteration is supposed to converge to a "steady state" $R$. In that case, iterations shouldn't change the value: $R_{t+1} =R_{t}.$ Hence the PageRank is a solution to this matrix equation: $$R = d MR + (1-d)R_0.$$
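The fixed-point condition can be checked numerically. Below is a sketch in matrix form on a hypothetical three-page graph: after enough iterations, applying one more step no longer changes $R$ (up to round-off):

```python
# The same iteration in matrix form, R_{t+1} = d*M*R_t + (1-d)*R_0,
# on a hypothetical 3-page graph (edges chosen only for illustration).
N, d = 3, 0.85
# M[i][j] = 1/L(p_j) if p_j links to p_i, else 0; here 0 -> 1,2; 1 -> 2; 2 -> 0.
M = [[0.0, 0.0, 1.0],
     [0.5, 0.0, 0.0],
     [0.5, 1.0, 0.0]]
R0 = [1.0 / N] * N

def step(R):
    # one iteration: d*M*R + (1-d)*R_0
    return [d * sum(M[i][j] * R[j] for j in range(N)) + (1 - d) * R0[i]
            for i in range(N)]

R = R0
for _ in range(200):
    R = step(R)

residual = max(abs(step(R)[i] - R[i]) for i in range(N))
print(residual)   # near 0: numerically, R satisfies R = d*M*R + (1-d)*R_0
```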
Now, the questions we'd like to ask:
- Why does a solution exist?
- Why is this solution unique?
Clearly, for what PageRank is used for, you need both.
We can rewrite: $$(I-dM)R = (1-d)R_0,$$ where $I$ is the $N \times N$ identity matrix. Then the solution to this equation is easy to find: $$R = \frac{1-d}{d} \left( \frac{1}{d}I-M \right) ^{-1}R_0,$$ provided, of course, the inverse matrix exists.
From linear algebra we know that it does exist -- if and only if $\mu = \frac{1}{d}$ is not an eigenvalue of $M$. In that case, the solution $R$ is also unique.
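Assuming the inverse exists, the solution can also be computed directly. Here is a sketch that solves $(I-dM)R=(1-d)R_0$ by plain Gaussian elimination, on the same kind of hypothetical three-page graph:

```python
# Solving (I - d*M)R = (1-d)*R_0 directly, on a hypothetical 3-page graph.
N, d = 3, 0.85
M = [[0.0, 0.0, 1.0],
     [0.5, 0.0, 0.0],
     [0.5, 1.0, 0.0]]
R0 = [1.0 / N] * N

# Build A = I - d*M and b = (1-d)*R_0.
A = [[(1.0 if i == j else 0.0) - d * M[i][j] for j in range(N)] for i in range(N)]
b = [(1 - d) * r for r in R0]

# Gaussian elimination with partial pivoting.
for col in range(N):
    piv = max(range(col, N), key=lambda r: abs(A[r][col]))
    A[col], A[piv] = A[piv], A[col]
    b[col], b[piv] = b[piv], b[col]
    for r in range(col + 1, N):
        f = A[r][col] / A[col][col]
        for c in range(col, N):
            A[r][c] -= f * A[col][c]
        b[r] -= f * b[col]

# Back substitution.
R = [0.0] * N
for i in reversed(range(N)):
    R[i] = (b[i] - sum(A[i][j] * R[j] for j in range(i + 1, N))) / A[i][i]

print([round(x, 4) for x in R])   # the unique solution; entries sum to 1
```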
Let's consider the eigenvalues of $M$.
Since $\mu = \frac{1}{d}>1$, it is not an eigenvalue of $M$ -- provided we know that the absolute values of all eigenvalues of $M$ are less than or equal to $1$. How do we know?
Perron–Frobenius theorem. Suppose $A = (a_{ij})$ is an $n × n$ irreducible non-negative matrix: $a_{ij} \geq 0$ for $1 ≤ i, j ≤ n$. Then it has a positive real eigenvalue $r>0$ such that
- the eigenspace of $r$ is 1-dimensional.
- there exists an eigenvector $v = (v_1,…,v_n)$ of $r$ such that all its components are positive: $v_i > 0$ for $ 1 ≤ i ≤ n$.
- every eigenvalue $λ$ satisfies $|λ| \leq r$.
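These conclusions can be observed numerically via power iteration. The matrix below is a hypothetical irreducible non-negative (in fact column stochastic) example, so the Perron root should come out as $r = 1$ with a positive eigenvector:

```python
# Checking the Perron-Frobenius conclusions for a small hypothetical
# irreducible non-negative matrix, via power iteration.
M = [[0.0, 0.0, 1.0],
     [0.5, 0.0, 0.0],
     [0.5, 1.0, 0.0]]
n = 3

def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

# Power iteration converges to the dominant eigenpair (r, v).
v = [1.0 / n] * n
for _ in range(200):
    w = matvec(M, v)
    s = sum(w)                # L1 normalization (entries stay non-negative)
    v = [x / s for x in w]

w = matvec(M, v)
r = sum(w) / sum(v)           # estimate of the Perron root
print(round(r, 6), [round(x, 3) for x in v])   # r = 1 and a positive eigenvector
```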
Let's check if the conditions of the theorem apply to $M$ defined above.
Matrix $M$ is non-negative.
A square matrix $M$ is called irreducible iff its associated directed graph $G_M$ (there is an edge from node $p_i$ to node $p_j$ iff $M_{ij} > 0$) is strongly connected (there is a path from each node to every other node). This isn't true for our $M$ because the web always has pages with no inbound links and possibly some with no outbound links...
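Strong connectivity is easy to test by checking reachability from every node. A sketch, with two hypothetical adjacency matrices (a cycle, and a graph with a node that has no outbound links):

```python
# Strong connectivity test for the directed graph G_M: every node must
# reach every other node. Both example graphs below are hypothetical.
def strongly_connected(adj):
    n = len(adj)
    for start in range(n):
        seen = {start}
        stack = [start]
        while stack:                      # depth-first search from `start`
            u = stack.pop()
            for v in range(n):
                if adj[u][v] and v not in seen:
                    seen.add(v)
                    stack.append(v)
        if len(seen) < n:                 # some node is unreachable
            return False
    return True

cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]     # 0 -> 1 -> 2 -> 0
dangling = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]  # node 2 has no outbound links
print(strongly_connected(cycle), strongly_connected(dangling))  # True False
```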
In this case, we have no reason to believe that any particular PageRank distribution exists for the given state of the web.
A common way of trying to deal with this problem is to add "missing" links. That's bad math.
Let's ignore this and try to continue anyway. Remember, we also need $r=1$ in order to ensure that $\mu = \frac{1}{d}$ is not an eigenvalue of $M$.
A column stochastic matrix is a square matrix each of whose columns consists of non-negative real numbers whose sum is equal to 1. Matrix $M$ is column stochastic, provided every page has at least one outbound link, so that $L(p_j) > 0$. Indeed, from its definition above the sum of all elements in column $j$ of $M$ is $$\sum\limits_{i:p_j \in S(p_i)} \frac{1}{L(p_j)} = \frac{1}{L(p_j)} \sum\limits_{i:p_j \in S(p_i)} 1 = \frac{1}{L(p_j)} L(p_j) =1.$$
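This computation is easy to verify numerically. A sketch that builds $M$ from hypothetical outbound-link lists and checks the column sums (again assuming every page has at least one outbound link):

```python
# Build M from hypothetical outbound links and check column sums.
out_links = {0: [1, 2], 1: [2], 2: [0]}       # hypothetical links; no dangling pages
N = len(out_links)
L = {j: len(out_links[j]) for j in out_links}  # outdegrees, all > 0 here

# M[i][j] = 1/L(p_j) if p_j links to p_i, else 0.
M = [[(1.0 / L[j] if i in out_links[j] else 0.0) for j in range(N)]
     for i in range(N)]

col_sums = [sum(M[i][j] for i in range(N)) for j in range(N)]
print(col_sums)   # [1.0, 1.0, 1.0]
```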
The point is, the Perron–Frobenius theorem ensures that every stochastic (and still irreducible) matrix has an eigenvalue equal to $1$, and that this is the largest absolute value among its eigenvalues.
Even if $R$ exists, questions remain:
- Why does this solution $R$ have only non-negative entries?
- Item 2 above doesn't apply since $R$ isn't an eigenvector.
- Why does the iteration $R_t$ converge to the "steady state" $R$?
- It might diverge away from $R$ or it may be cyclical. "Observing" convergence doesn't prove anything (case in point: harmonic series).
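For instance, the partial sums of the harmonic series look like they are settling down, while the series in fact diverges; a quick sketch:

```python
# Why "observing" convergence proves nothing: partial sums of the harmonic
# series 1 + 1/2 + 1/3 + ... change very slowly, yet the series diverges.
s, prev = 0.0, 0.0
for k in range(1, 1_000_001):
    prev, s = s, s + 1.0 / k

# After a million terms the sum is still modest and the last increment
# is only 1e-6, but the sum grows without bound as k increases.
print(round(s, 4), round(s - prev, 10))
```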
In light of PageRank's dependence on the damping factor and other problems, this result isn't surprising.