

Linear algebra review

We briefly review some necessary background in linear algebra. Let $C$ be an $M\times N$ matrix with real-valued entries; for a term-document matrix, all entries are in fact non-negative.

The rank of a matrix is the number of linearly independent rows (or columns) in it; thus, $\mbox{rank}(C)\leq \min\{M,N\}$. A square $r\times r$ matrix all of whose off-diagonal entries are zero is called a diagonal matrix; its rank is equal to the number of non-zero diagonal entries. If all $r$ diagonal entries of such a diagonal matrix are $1$, it is called the identity matrix of dimension $r$ and represented by $I_r$.
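The following short NumPy sketch illustrates these definitions numerically; the particular diagonal matrix used here is an arbitrary illustrative choice.

import numpy as np

# A 3x3 diagonal matrix with two non-zero diagonal entries.
D = np.diag([5.0, 2.0, 0.0])

# Its rank equals the number of non-zero diagonal entries (here, 2).
print(np.linalg.matrix_rank(D))           # 2

# The identity matrix I_3 has full rank 3.
print(np.linalg.matrix_rank(np.eye(3)))   # 3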

For a square $M\times M$ matrix $C$ and a vector $\vec{x}$ that is not all zeros, the values of $\lambda$ satisfying

\begin{displaymath}
C\vec{x} = \lambda \vec{x}
\end{displaymath} (213)

are called the eigenvalues of $C$. The $M$-vector $\vec{x}$ satisfying Equation 213 for an eigenvalue $\lambda$ is the corresponding right eigenvector. The eigenvector corresponding to the eigenvalue of largest magnitude is called the principal eigenvector. In a similar fashion, the left eigenvectors of $C$ are the $M$-vectors $\vec{y}$ such that
\begin{displaymath}
\vec{y}^T C = \lambda \vec{y}^T.
\end{displaymath} (214)

The number of non-zero eigenvalues of $C$ is at most $\mbox{rank}(C)$.

The eigenvalues of a matrix are found by solving the characteristic equation, which is obtained by rewriting Equation 213 in the form $(C-\lambda I_M)\vec{x}=0$. The eigenvalues of $C$ are then the solutions of $\vert(C-\lambda I_M)\vert=0$, where $\vert S\vert$ denotes the determinant of a square matrix $S$. The equation $\vert(C-\lambda I_M)\vert=0$ is an $M$th order polynomial equation in $\lambda$ and can have at most $M$ roots, which are the eigenvalues of $C$. These eigenvalues can in general be complex, even if all entries of $C$ are real.
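Eigenvalues can also be computed numerically rather than by solving the characteristic polynomial by hand; the short NumPy sketch below (using an arbitrarily chosen real matrix, a 90-degree rotation) illustrates that a matrix with only real entries can nonetheless have complex eigenvalues.

import numpy as np

# An arbitrary real 2x2 matrix: a 90-degree rotation.
# Its characteristic equation is lambda^2 + 1 = 0, so its
# eigenvalues are the complex pair i and -i.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])

# np.linalg.eig returns the eigenvalues and the corresponding
# right eigenvectors (as the columns of the second array).
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)   # the complex pair +1j and -1j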

We now examine some further properties of eigenvalues and eigenvectors, to set up the central idea of singular value decompositions in Section 18.2 below. First, we look at the relationship between matrix-vector multiplication and eigenvalues.

Worked example. Consider the matrix

\begin{displaymath}
S=\left(
\begin{array}{ccc}
30 & 0 & 0 \\
0 & 20 & 0 \\
0 & 0 & 1\\
\end{array} \right).
\end{displaymath} (215)

Clearly the matrix has rank 3, and has 3 non-zero eigenvalues $\lambda_1=30$, $\lambda_2=20$ and $\lambda_3=1$, with the three corresponding eigenvectors
\begin{displaymath}
\vec{x_1}=\left(
\begin{array}{c}
1 \\ 0 \\ 0 \\
\end{array} \right),
\quad
\vec{x_2}=\left(
\begin{array}{c}
0 \\ 1 \\ 0 \\
\end{array} \right),
\quad
\vec{x_3}=\left(
\begin{array}{c}
0 \\ 0 \\ 1 \\
\end{array} \right).
\end{displaymath} (216)

For each of the eigenvectors, multiplication by $S$ acts as if we were multiplying the eigenvector by a multiple of the identity matrix; the multiple is different for each eigenvector. Now, consider an arbitrary vector, such as $\vec{v}=\left(\begin{array}{c} 2 \\ 4 \\ 6 \\ \end{array}\right)$. We can always express $\vec{v}$ as a linear combination of the three eigenvectors of $S$; in the current example we have
\begin{displaymath}
\vec{v}=\left(
\begin{array}{c}
2 \\
4 \\
6 \\
\end{array} \right)=2\vec{x_1}+4\vec{x_2}+6\vec{x_3}.
\end{displaymath} (217)

Suppose we multiply $\vec{v}$ by $S$:
\begin{displaymath}
\begin{array}{rcll}
S\vec{v} & = & S(2\vec{x_1}+4\vec{x_2}+6\vec{x_3}) & \mbox{(218)}\\
 & = & 2S\vec{x_1}+4S\vec{x_2}+6S\vec{x_3} & \mbox{(219)}\\
 & = & 2\lambda_1 \vec{x_1}+4\lambda_2\vec{x_2}+6\lambda_3\vec{x_3} & \mbox{(220)}\\
 & = & 60\vec{x_1}+80\vec{x_2}+6\vec{x_3}. & \mbox{(221)}
\end{array}
\end{displaymath}

End worked example.
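As a numerical check of this worked example, the following NumPy sketch (an illustrative addition) verifies that computing $S\vec{v}$ directly agrees with scaling each eigenvector by its eigenvalue as in Equation 221.

import numpy as np

# The diagonal matrix S from the worked example and its eigenvectors.
S = np.diag([30.0, 20.0, 1.0])
x1 = np.array([1.0, 0.0, 0.0])
x2 = np.array([0.0, 1.0, 0.0])
x3 = np.array([0.0, 0.0, 1.0])

# An arbitrary vector expressed as v = 2*x1 + 4*x2 + 6*x3.
v = np.array([2.0, 4.0, 6.0])

# Multiplying by S scales each eigenvector component by its eigenvalue.
direct = S @ v
via_eigen = 2 * 30.0 * x1 + 4 * 20.0 * x2 + 6 * 1.0 * x3

print(direct)                          # [60. 80.  6.]
print(np.allclose(direct, via_eigen))  # True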

Example 18.1 shows that even though $\vec{v}$ is an arbitrary vector, the effect of multiplication by $S$ is determined by the eigenvalues and eigenvectors of $S$. Furthermore, it is intuitively apparent from Equation 221 that the product $S\vec{v}$ is relatively unaffected by terms arising from the small eigenvalues of $S$; in our example, since $\lambda_3=1$, the contribution of the third term on the right hand side of Equation 221 is small. In fact, if we were to completely ignore the contribution in Equation 221 from the third eigenvector corresponding to $\lambda_3=1$, then the product $S\vec{v}$ would be computed to be $\left(\begin{array}{c} 60 \\ 80 \\ 0 \\ \end{array}\right)$ rather than the correct product, which is $\left(\begin{array}{c} 60 \\ 80 \\ 6 \\ \end{array}\right)$; these two vectors are relatively close to each other by any of various metrics one could apply (such as the length of their vector difference).
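The same closeness can be checked numerically; in this illustrative NumPy sketch, the truncated product differs from the exact one by a vector of length 6, only about 6% of the length of the exact product.

import numpy as np

exact = np.array([60.0, 80.0, 6.0])      # S @ v using all three eigenvalue terms
truncated = np.array([60.0, 80.0, 0.0])  # contribution of lambda_3 = 1 dropped

# The length (Euclidean norm) of the vector difference is small
# relative to the length of the exact product.
print(np.linalg.norm(exact - truncated))                          # 6.0
print(np.linalg.norm(exact - truncated) / np.linalg.norm(exact))  # roughly 0.06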

This suggests that the effect of small eigenvalues (and their eigenvectors) on a matrix-vector product is small. We will carry forward this intuition when studying matrix decompositions and low-rank approximations in Section 18.2. Before doing so, we examine the eigenvectors and eigenvalues of special forms of matrices that will be of particular interest to us.

For a symmetric matrix $S$, the eigenvectors corresponding to distinct eigenvalues are orthogonal. Further, if $S$ is both real and symmetric, the eigenvalues are all real.

Worked example. Consider the real, symmetric matrix

\begin{displaymath}
S=\left(
\begin{array}{cc}
2 & 1 \\
1 & 2 \\
\end{array} \right).
\end{displaymath} (222)

From the characteristic equation $\vert S-\lambda I\vert=0$, we have the quadratic $(2-\lambda)^2-1=0$, whose solutions yield the eigenvalues $3$ and $1$. The corresponding eigenvectors $\left(\begin{array}{c} 1 \\ 1 \\ \end{array}\right)$ and $\left(\begin{array}{c} 1 \\ -1 \\ \end{array}\right)$ are orthogonal. End worked example.
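As an illustrative check, the following NumPy sketch recovers these eigenvalues and confirms the orthogonality of the eigenvectors; np.linalg.eigh is NumPy's routine for real symmetric (Hermitian) matrices and returns real eigenvalues.

import numpy as np

# The real, symmetric matrix from the worked example.
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh returns real eigenvalues in ascending order, with the
# corresponding eigenvectors as the columns of the second array.
eigenvalues, eigenvectors = np.linalg.eigh(S)
print(eigenvalues)   # [1. 3.]

# Eigenvectors belonging to distinct eigenvalues are orthogonal:
# their dot product is (numerically) zero.
print(np.dot(eigenvectors[:, 0], eigenvectors[:, 1]))   # approximately 0.0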


