On the Power of SVD in the Stochastic Block Model

Published: 21 Sept 2023, Last Modified: 02 Nov 2023. NeurIPS 2023 poster.
Keywords: Clustering Algorithms, Stochastic Block Model, Spectral Algorithms
TL;DR: We show that, in the (symmetric) stochastic block model, PCA/SVD is by itself a good clustering algorithm; this explains why applying PCA/SVD before running clustering algorithms improves clustering results in practice.
Abstract: A popular heuristic for improving clustering results is to apply dimensionality reduction before running a clustering algorithm. It has been observed that spectral dimensionality reduction tools, such as PCA or SVD, improve the performance of clustering algorithms in many applications. This phenomenon indicates that the spectral step not only serves as a dimensionality reduction tool, but also contributes to the clustering procedure itself. It is therefore an interesting question to understand the behavior of spectral steps in clustering problems. As an initial step in this direction, this paper studies the power of the vanilla-SVD algorithm in the stochastic block model (SBM). We show that, in the symmetric setting, the vanilla-SVD algorithm recovers all clusters correctly. This result answers, in the symmetric setting, an open question posed by Van Vu (Combinatorics Probability and Computing, 2018).
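To make the setting concrete, here is a minimal NumPy sketch of SVD-based clustering on a symmetric SBM. It is an illustration under stated assumptions, not the paper's exact algorithm or analysis: the function names `sample_symmetric_sbm` and `svd_cluster`, the parameter values, and the farthest-point rounding step that turns projected columns into labels are all choices made here for illustration. The abstract itself only says that a vanilla SVD step is applied to the SBM.

```python
import numpy as np


def sample_symmetric_sbm(n, k, p, q, rng):
    """Sample an adjacency matrix with k equal-size clusters.

    Assumes k divides n. Edges appear with probability p inside a
    cluster and q across clusters (hypothetical helper, not from the paper).
    """
    labels = np.repeat(np.arange(k), n // k)
    probs = np.where(labels[:, None] == labels[None, :], p, q)
    upper = np.triu(rng.random((n, n)) < probs, 1)  # strict upper triangle
    adj = (upper | upper.T).astype(float)           # symmetric, no self-loops
    return adj, labels


def svd_cluster(adj, k):
    """Project the columns of adj onto the top-k singular subspace,
    then group the projected columns by distance.

    The grouping rule (farthest-point centers, nearest-center assignment)
    is an assumption of this sketch.
    """
    u, s, vt = np.linalg.svd(adj)
    low_rank = (u[:, :k] * s[:k]) @ vt[:k]  # best rank-k approximation of adj
    cols = low_rank.T                       # row i = projected column i

    # Pick k centers greedily: each new center is the point farthest
    # from all centers chosen so far.
    centers = [0]
    dist = np.linalg.norm(cols - cols[0], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))
        centers.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(cols - cols[nxt], axis=1))

    # Assign every vertex to its nearest center.
    to_centers = np.linalg.norm(cols[:, None, :] - cols[centers][None, :, :], axis=2)
    return np.argmin(to_centers, axis=1)


rng = np.random.default_rng(0)
adj, labels = sample_symmetric_sbm(n=300, k=3, p=0.7, q=0.1, rng=rng)
pred = svd_cluster(adj, k=3)
# Two vertices should share a predicted label exactly when they share a true one.
same_pred = pred[:, None] == pred[None, :]
same_true = labels[:, None] == labels[None, :]
print("fraction of vertex pairs grouped consistently:", (same_pred == same_true).mean())
```

The comparison is made on pairs of vertices because cluster labels are only defined up to a permutation. The gap between `p` and `q` here is deliberately large; the regime in which exact recovery is guaranteed is the subject of the paper and is not captured by this toy example.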
Supplementary Material: pdf
Submission Number: 7462