Spectral Graph Theory and Segmentation
Motivation
Spectral graph theory is a powerful tool because it sits at the intersection of multiple representations.
- It connects graphs, manifolds, and linear algebra.
- It is related to dynamics on graphs: Markov chains, random walks (diffusion).
- It can be applied to any point cloud; images and meshes are particularly well suited.
- It can be used to perform clustering, segmentation, etc.
Linear Algebra Review
There are several ways to look at an eigenvalue problem:
- $Ax = \lambda x$: finding the invariant directions of a linear transformation and their scaling factors.
- $\det(A - \lambda I) = 0$: solving the characteristic (eigen) polynomial. This describes the algebraic distribution of the eigenvalues, and many properties can be derived from it.
- $\max_x \frac{x^T A x}{x^T x}$: solving an optimization problem over the Rayleigh quotient. Its stationary points are the eigenvectors, with values equal to the corresponding eigenvalues; the maximum and minimum correspond to the largest and smallest eigenvalues.
Spectral graph theory exploits this third view: the hard combinatorial segmentation problem is relaxed into an easy-to-solve continuous optimization problem, which turns out to be an eigenvalue problem (see the sketch below).
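To make the third view concrete, here is a minimal numerical sketch (assuming numpy is available): the Rayleigh quotient of a symmetric matrix is stationary at its eigenvectors and attains its extremes at the largest and smallest eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                      # make it symmetric

def rayleigh(A, x):
    return (x @ A @ x) / (x @ x)

evals, evecs = np.linalg.eigh(A)       # eigenvalues sorted ascending
print(rayleigh(A, evecs[:, -1]), evals[-1])   # max of the quotient = largest eigenvalue
print(rayleigh(A, evecs[:, 0]),  evals[0])    # min of the quotient = smallest eigenvalue

# Any other direction gives a value strictly between the two extremes.
x = rng.standard_normal(5)
assert evals[0] - 1e-9 <= rayleigh(A, x) <= evals[-1] + 1e-9
```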
Graph Theory
Given a graph $(V, E)$, an adjacency matrix $A$ can be defined. If the graph is undirected, the matrix is real and symmetric, so an $n$-node graph has $n$ real eigenvalues and $n$ orthogonal eigenvectors.
Below I'll mainly discuss the Laplacian matrix, but here I'll point out a few applications of the adjacency matrix $A$.
- The leading eigenvector of $A$ is used in the PageRank algorithm. Since only the eigenvector of the largest eigenvalue can have all-positive entries, those entries define the eigenvector centrality of the nodes (the importance of a node to the network).
- $A$ can be normalized (row-wise, $P = D^{-1}A$) to give the transition matrix of a random walk on the graph.
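A small sketch of these two uses of $A$, again assuming numpy; the toy graph and the normalization $P = D^{-1}A$ are illustrative choices, not a prescribed recipe.

```python
import numpy as np

# Undirected toy graph: edges 0-1, 1-2, 2-0, 2-3 (node 2 is the most connected).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

evals, evecs = np.linalg.eigh(A)
centrality = np.abs(evecs[:, -1])      # leading eigenvector; entries share one sign
print("eigenvector centrality:", centrality / centrality.sum())

d = A.sum(axis=1)
P = A / d[:, None]                     # P = D^{-1} A, each row sums to 1
print("row sums of transition matrix:", P.sum(axis=1))
```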
Graph Laplacian and Physics
The graph Laplacian can then be defined as $L = D - A$, where $D = \operatorname{diag}(d_i) = \operatorname{diag}(\sum_j A_{ij})$ is the degree matrix.
- It is symmetric and positive semi-definite.
- It has $n$ non-negative eigenvalues.
$L$ can be understood as an operator on functions defined on the graph. For a value vector $x$, the action of $L$ is $Lx = Dx - Ax$, i.e.

$$Lx(i) = \sum_j L_{ij} x_j = d_i x_i - \sum_j A_{ij} x_j = \sum_j A_{ij}(x_i - x_j)$$

In the binary (unweighted) case this is simply $Lx(i) = \sum_{j \in N(i)} (x_i - x_j)$.

Physically, if you think of $A_{ij}$ as the conductance in an electric or water-pipe network, then $Lx(i)$ is the net outward flow from node $i$ when the potential $x$ is applied at each node. Pictorially, we can understand $Lx$ as a diffusivity vector, and the diffusion dynamics of flow on the network is

$$\frac{dv}{dt} = -Lv = -(D - A)v$$

Besides, from a static viewpoint the quadratic form $x^T L x$ defines a kind of strain or potential energy $E(x)$ for a network $A$ given the potential $x$, and the flow above is just the gradient descent flow of this energy. Mechanically, this embeds the graph on a 1D line, giving each node a coordinate $x(i)$, and $E(x)$ is the energy of this configuration, with the weights $A_{ij}$ acting as spring constants $k_{ij}$.[^1]

$$x^T L x = \sum_{ij} x_i L_{ij} x_j = \sum_{ij} A_{ij}\, x_i (x_i - x_j) = \sum_{ij} A_{ij}\, x_j (x_j - x_i) = \frac{1}{2} \sum_{ij} A_{ij} (x_i - x_j)^2 = E(x)$$
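These identities are easy to check numerically; here is a sketch (numpy assumed) on a toy 4-cycle that verifies both the flow form of $Lx$ and the spring-energy form of $x^T L x$.

```python
import numpy as np

A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)     # a 4-cycle
D = np.diag(A.sum(axis=1))
L = D - A

x = np.array([0.0, 1.0, 3.0, -2.0])           # a "potential" on the nodes

# (Lx)_i = sum_j A_ij (x_i - x_j): net outward flow at each node.
flow = np.array([np.sum(A[i] * (x[i] - x)) for i in range(len(x))])
print(np.allclose(L @ x, flow))

# x^T L x = 1/2 * sum_ij A_ij (x_i - x_j)^2: the spring / strain energy.
energy = 0.5 * np.sum(A * (x[:, None] - x[None, :]) ** 2)
print(np.isclose(x @ L @ x, energy))
```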
Spectrum of Laplacian
Using these physical models, we can understand the structure of the Laplacian's spectrum.
Zero Eigenvalue
- A zero eigenvalue corresponds to zero strain, i.e. an equipotential configuration on each connected component.
- Thus the dimension of the zero eigenspace is the number of connected components in the graph.
- A basis of that eigenspace is given by the indicator vectors of the components.
- The Laplacian of a graph with several components is block diagonal, one block per component.
- So the spectrum of a disconnected network is the union of the spectra of its components (see the sketch below).
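A quick numerical check of these claims (numpy assumed), using two disjoint triangles as the example graph:

```python
import numpy as np

# Two disjoint triangles: nodes {0,1,2} and {3,4,5}.
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

evals, evecs = np.linalg.eigh(L)
n_components = np.sum(evals < 1e-8)
print("number of connected components:", n_components)   # -> 2

# The corresponding eigenvectors are constant on each component
# (a rotation of the component indicator vectors).
print(np.round(evecs[:, :n_components], 3))
```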
Fiedler Vector (2nd smallest)
- A small eigenvalue corresponds to small strain and small flow.
- For a connected graph, the Fiedler value and Fiedler vector correspond to the smallest non-zero (second smallest) eigenvalue.
- Thus it describes the (non-uniform) configuration with the least strain $E(x)$.
- It is the solution to minimizing $x^T L x$ subject to $\|x\| = 1$ and $x \perp \mathbf{1}$, which reduces to an eigenvalue problem.
- Since eigenvectors minimize the spring energy, they are really useful for visualizing graphs / networks:
  - Use $v_2, v_3$ as 2D coordinates to lay out a graph (see the sketch after this list).
  - More eigenvectors can be used to embed higher-dimensional structure.
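Here is a sketch of spectral graph drawing along these lines (numpy assumed); `spectral_layout` is an illustrative helper, not a standard API.

```python
import numpy as np

def spectral_layout(A):
    """Return an (n, 2) array of node coordinates from the graph Laplacian of A."""
    L = np.diag(A.sum(axis=1)) - A
    evals, evecs = np.linalg.eigh(L)            # ascending eigenvalues
    return evecs[:, 1:3]                        # v2, v3; skip the constant vector v1

# A 6-cycle: the spectral layout places the nodes roughly on a circle.
n = 6
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0

coords = spectral_layout(A)
print(np.round(coords, 3))
```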
Spectral Clustering
Applications
- A point cloud can be represented as a network: connect nearest neighbors (kNN), then run spectral graph analysis on the resulting graph.
- A 3D mesh can be represented as a network as well. Edge weights can encode the distance between nodes, e.g. $w_{ij} \propto e^{-\|v_i - v_j\|^2 / \sigma^2}$ (see the sketch below).
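A sketch of the point-cloud-to-graph step, assuming numpy and scipy; the helper `knn_affinity` and its parameters are illustrative choices.

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_affinity(points, k=10, sigma=1.0):
    """Weighted kNN graph with Gaussian weights w_ij = exp(-||v_i - v_j||^2 / sigma^2)."""
    tree = cKDTree(points)
    dist, idx = tree.query(points, k=k + 1)      # nearest neighbor of a point is itself
    n = len(points)
    W = np.zeros((n, n))
    for i in range(n):
        for d, j in zip(dist[i, 1:], idx[i, 1:]):
            w = np.exp(-d**2 / sigma**2)
            W[i, j] = W[j, i] = max(W[i, j], w)  # symmetrize the kNN graph
    return W

points = np.random.default_rng(0).standard_normal((100, 3))
W = knn_affinity(points, k=8, sigma=0.5)
```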
Embedding
Many embeddings can be thought of as this energy minimization problem: stronger springs for points that are close in the cloud, weaker or no springs for points that are far apart. The only tricky part is how to regularize it.
A typical regularization enforces each embedding coordinate to have unit variance and to be orthogonal to the others (no covariance):

$$\arg\min_F \sum_{ij} w_{ij} \|f_i - f_j\|^2, \quad \text{s.t. } F^T F = I$$

This can be optimized by computing the eigenvectors of the $d$ smallest non-zero eigenvalues of the Laplacian $L$ built from the weights ($A_{ij} = w_{ij}$), i.e. solving $LF = F\Lambda$. Note that by the properties of the Laplacian, the columns of $F$ are automatically zero-mean (they are orthogonal to the constant eigenvector). Then $F_{i,:}$ is the embedded coordinate of point $i$.
Another choice is to use $F\Lambda^{-1/2}$ as coordinates, rescaling each embedded axis according to its eigenvalue; the path-length properties of the embedding seem to be better preserved this way (see the sketch below).
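A sketch of this embedding (Laplacian eigenmaps), assuming numpy and a connected graph; `laplacian_eigenmaps` is an illustrative helper covering both the plain and the $F\Lambda^{-1/2}$ variants.

```python
import numpy as np

def laplacian_eigenmaps(W, d=2, rescale=False):
    """Embed points into R^d from a symmetric weight matrix W (connected graph assumed)."""
    L = np.diag(W.sum(axis=1)) - W
    evals, evecs = np.linalg.eigh(L)
    F = evecs[:, 1:d + 1]                        # skip the trivial constant eigenvector
    if rescale:                                  # the F @ Lambda^{-1/2} variant
        F = F / np.sqrt(evals[1:d + 1])
    return F                                     # row i = embedded coordinate of point i

# e.g. with the kNN affinity W from the previous sketch:
# coords = laplacian_eigenmaps(W, d=2)
```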
Clustering
One way of clustering is to compute the spectral embedding above and then run a simple clustering algorithm like k-means in the $d$-dimensional embedding space.
So a general spectral clustering algorithm consists of:
- Defining a distance (similarity) metric among nodes.
- Using that metric to define edge weights and build a graph.
- Computing the spectrum of the graph Laplacian and its few smallest eigenvectors.
- Clustering the embedded coordinates, e.g. with k-means (a sketch follows this list).
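Putting the steps together, here is a sketch of the whole pipeline assuming numpy and scipy; the blob data and the choice of using the non-trivial eigenvectors as features are illustrative, not canonical.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def spectral_clustering(W, n_clusters=2):
    """Cluster nodes of a weighted graph W via the Laplacian spectrum + k-means."""
    L = np.diag(W.sum(axis=1)) - W
    evals, evecs = np.linalg.eigh(L)
    F = evecs[:, 1:n_clusters]                   # skip the constant eigenvector
    _, labels = kmeans2(F, n_clusters, seed=0, minit="++")
    return labels

# Two well-separated Gaussian blobs should come out as two clusters.
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(5, 0.3, (50, 2))])
d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
W = np.exp(-d2 / 0.5)                            # Gaussian affinity
print(spectral_clustering(W, n_clusters=2))
```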
Segmentation
The most famous work using graph cuts in computer vision is Shi & Malik 2000 (Normalized Cuts). Their major contribution is modifying the objective of the original graph cut by adding a normalization factor, which prevents the cut from splitting off a piece that is too small.
For a 2-segment partitioning problem, the Ncut loss is defined as

$$\mathrm{Ncut}(A,B) = \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(A,V)} + \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(B,V)}$$
They then show that this objective can be relaxed and solved through a generalized eigenvalue problem, $(D - W)y = \lambda D y$, as sketched below.
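A sketch of the relaxed Ncut bipartition, assuming numpy/scipy; thresholding the second generalized eigenvector at its median is one common heuristic (the original paper also searches over splitting points).

```python
import numpy as np
from scipy.linalg import eigh

def ncut_bipartition(W):
    """Approximate 2-way Ncut: threshold the 2nd generalized eigenvector of (D - W) y = l D y."""
    d = W.sum(axis=1)
    D = np.diag(d)                      # assumes no isolated nodes, so D is positive definite
    L = D - W
    evals, evecs = eigh(L, D)           # generalized symmetric eigenproblem, ascending eigenvalues
    y = evecs[:, 1]                     # second smallest generalized eigenvector
    return y > np.median(y)             # boolean segment labels

# e.g. W could be a pixel affinity matrix combining intensity and position.
```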
Computation
These spectral methods all require computing eigenvectors corresponding to small eigenvalues. This can be done with the Lanczos algorithm; with preconditioning and related techniques, the best-performing algorithms obtain such a vector in roughly $O(n)$ time.
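For example, with scipy's Lanczos-based sparse solver (an assumption about tooling, not something prescribed by the notes above), the smallest eigenpairs of a large Laplacian can be obtained via shift-invert:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

# Path graph on n nodes as a sparse adjacency matrix.
n = 2000
rows = np.arange(n - 1)
A = sp.coo_matrix((np.ones(n - 1), (rows, rows + 1)), shape=(n, n))
A = A + A.T
L = sp.diags(np.asarray(A.sum(axis=1)).ravel()) - A

# 5 smallest eigenpairs via Lanczos; shift-invert around a point just below 0
# (L is PSD, so we never factor at an exact eigenvalue) converges much faster
# than asking for which="SM" directly.
evals, evecs = eigsh(L.tocsc(), k=5, sigma=-1e-2, which="LM")
print(np.round(evals, 6))
```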
References
- See the Graph Cut Note for more info.
- Spectral Graph Theory Tutorial
[^1]: Thus the minimal-energy state of a spring network can be found by solving the smallest-eigenvalue problem of the Laplacian (with a norm constraint on $x$).