How to separate a graph? A first look at spectral method.

Planted partition can be intuitively regarded as, literally, separating a graph according to your plan. The problems such as bisection, k-coloring and maximum clique are known to be hard, and actually, most problems are hard unless proven easy.

In this post, we firstly try to classify the spectral graph model, and then given out the matlab code for partitioning a bipartite graph.

A general model for structural graph is:

\mathcal{G}(\psi,P): let \psi: \{ 1,\ldots,n\}\rightarrow \{ 1,\ldots,k\} which meaning the mapping from vertices to classes, and P be its generating matrix of size k\times k, where P_{i,j}\in\{0,1\}. So the connecting probability of any edge (u,v) is P_{\psi(u),\psi(v)}.

So the weighted adjacency matrix of \mathcal{G}(\psi,P) looks like

\[ \left( \begin{array}{cccc}<br /><br /><br /> P_{1,1}& P_{1,1}& P_{1,2}& P_{1,2}\\<br /><br /><br /> P_{1,1}& P_{1,1}& P_{1,2}& P_{1,2}\\<br /><br /><br /> P_{1,2}& P_{1,2}& P_{2,2}& P_{2,2}\\<br /><br /><br /> P_{1,2}& P_{1,2}& P_{2,2}& P_{2,2}<br /><br /><br /> \end{array} \right)\] ,

where the symmetric property of this matrix descents from that it is an in-directed graph.

We further specialize the bisection model hereby

  • Planted Bisection (\psi,p_1,p_2,q): the connecting probability of connecting to its own section is p_1 and p_2 respectively, while that of crossing is q.

Suppose we are now given a graph of this form, and want to simulate the process of partitioning all these nodes into two sections, with least possible crossing edges. Before delving into those technical details, we briefly describe a spectral partitioning principle for our simulation in this post:

The eigenvector of the second largest eigenvalue of the adjacency matrix \mathbf{A} is a slightly perturbed version of its generator matrix \mathbf{M}, which is of the form

\mathbf{M}=\[ \left( \begin{array}{cccc}<br /><br /> p_1 & p_1  & q & q  \\<br /><br /> p_1 & p_1  & q & q \\<br /><br /> q & q &    p_2 & p_2  \\<br /><br /> q & q &    p_2 & p_2 \\ \end{array} \right)\] .

Now we run a MATLAB simulation to simulate how that spectral partitioning method works.

  1. Generate a graph with n vertexes, and n_1 vertexes are connecting each other with probability p_1, while the other n-n_1 vertexes are self connecting with probability p_2. And that of their crossing is q.
clc;clear all;
n=800;
x=randperm(n);%put the sets among random positions
n_1=round(n/4);
ind_set1=x(1:n_1);
ind_set2=x(n_1+1:end);
p_1=0.8;
p_2=0.8;
q=0.001;
A(ind_set1,ind_set1)=rand(n_1,n_1)<p_1;
A(ind_set2,ind_set2)=rand(n-n_1,n-n_1)<p_2;
A(ind_set1,ind_set2)=rand(n_1,n-n_1)<q;
A(ind_set2,ind_set1)=rand(n-n_1,n_1)<q;
A=triu(A,1);%to make the adjacency matrix symmetric
A=A+A';
figure(1)
spy(A);

When this graph is generated, we can hardly tell any relation among its connection according to figure 1.

 

Again, we should bear in mind that the second eigenvector \mathbf{v}_2 of \mathbf{A} is a perturbed version of \mathbf{w}_2 of the generator matrix \mathbf{M}. (means the values of \mathbf{v}_2 are just a little deviated from \mathbf{w}_2) And why is \mathbf{w}_2 this important? because it is of the form [k_1p_1,k_1p_1,k_2p_2,k_2p_2,k_2p_2,k_1p_1], which reflects the relation of all these vertexes. So our second step of the code is

  1. Get the second eigenvector of \mathbf{A}, and re-sort the positions of these n vertexes.
[V D] = eigs(A, 2);
[useless ind_recover] = sort(V(:,2));
A_hat=A(ind_recover,ind_recover);
figure(2)
spy(A_hat);

Then we can see from figure 2 below that the relation it reflects is really a bisection.

 

 

References:

1. Daniel A. Spielman, “Spectral partitioning in the planted partition model”, Lecture notes(Lecture 21), 2009.

2. Frank McSherry, “Spectral partitioning of random graphs”, STOC, 2001.

3. David Gleich, “Spectral Graph Partitioning and the Laplacian with Matlab”, 2006. (Online resources)

Comments are closed.