Spectral clustering

Spectral clustering is a useful trick for separating clusters of points by connectedness, not by distance from centers as in k-means, as in this example:

Spectral clustering uses the spectrum (eigenvalues) of a matrix to cluster the points.  First, you define an N x N adjacency matrix W.  One such matrix could set $w_{i,j} = \exp(-dist_{i,j}^2 / c)$.  $dist_{i,j}$ is the distance between point i and point j, and c is a scale parameter.  So $w_{i,j}$ decreases as the distance between points i and j increases.  To make this matrix sparse and reduce computation, you could set $w_{i,j}$ to 0 if point j is not within the K-nearest-neighbors of point i.

Then sum the rows of W to get the degrees of each point.  Let G be a diagonal matrix with the degrees down the diagonal and zeros elsewhere.  Then define a matrix called the graph Laplacian by L = G – W.  Clustering the points in the space of the eigenvectors of L associated with the smallest eigenvalues will group points that are closely connected in the adjacency matrix.  (The eigenvector associated with the first smallest eigenvalue should be thrown out, as it is constant.)  Here the colors reflect the true assignments from generating the three shapes.

Clustering the points using k-means in the space of these 2 eigenvalues, then using those labels in the original coordinates shows we have grouped by connectedness this time:

Here is the R script used to produce this example.

In the 2nd edition of The Elements of Statistical Learning (available as pdf), the authors show why spectral clustering produces these ‘connected’ clusters (book page 544):

Given an eigenvector f with eigenvalue $\alpha$ of the graph Laplacian L we have:

$f^T L f = \alpha |f|^2$

$f^T L f = \sum g_i f_i^2 - \sum \sum f_i f_j w_{i,j}$

$= \frac{1}{2} \sum \sum w_{i,j} (f_i - f_j)^2$

So small values of $f^T L f$ (small eigenvalues $\alpha$) are achieved when points i and j with large $w_{i,j}$ have coordinates $f_i$ and $f_j$ close together.  Large $w_{i,j}$ corresponds to points that are close together in the original space.