\section{Introduction}

Spectral Clustering \citep{ng2001spectral} is one of the most popular algorithms for clustering a graph. It exploits the eigenvectors of a matrix representing the graph to compute a low-dimensional Euclidean embedding of the vertices, which is then partitioned using geometric clustering algorithms, such as $k$-means.

While Spectral Clustering has been shown to work well in practice in a wide variety of settings, it still lacks a complete theoretical understanding. Typically, analyses of Spectral Clustering work by showing that the chosen eigenvectors are close to linear combinations of the indicator vectors of the clusters. For example, the structure theorem of \citet{peng2015partitioning} informally says that the bottom $k$ eigenvectors of the graph Laplacian are close to linear combinations of the indicator vectors of the $k$ ``best'' clusters whenever two conditions are satisfied: (1) the $k$-way expansion constant\footnote{The $k$-way expansion constant is a measure of the quality of a partition~\citep{lee2014multiway} similar to the normalised cut \citep{shimalik}.} of the optimal partition is \emph{small}; (2) the $(k+1)$-th smallest eigenvalue of the Laplacian is \emph{large}. However, the first condition is often not satisfied in practice and, we argue, is actually not necessary for Spectral Clustering to perform well.


Our main contribution is a strengthening of the structure theorem: we show that Spectral Clustering works well whenever the informative eigenvalues are well-separated from the rest of the spectrum. This not only allows us to recover known bounds achieved by either the structure theorem or by classical perturbation arguments such as the Davis-Kahan theorem \citep{davis1970rotation}, but also to obtain new, improved bounds in many scenarios. In particular, our techniques are well-suited to situations where there exists a hierarchy of clusters at different scales. We believe our results are the first to correctly predict the excellent performance of Spectral Clustering for a large class of real-world networks.

An additional advantage of our analysis is that it applies to any Hermitian positive semidefinite representation of a graph, beyond the standard graph Laplacian. We showcase this strength by analysing recently proposed Hermitian representations of digraphs \citep{cucuringu2020hermitian}. In particular, we consider a clustering problem in which we aim to partition the vertices of a digraph into subsets $S_1,\dots,S_k$ so that most of the edges go from some $S_i$ to $S_{i+1}$, with $i \in \{1,\dots,k-1\}$. In other words, we would like most of the edges to follow a directed path between the clusters. This has applications, for example, in uncovering trophic levels in food chains or detecting patterns in trade networks \citep{laenen2020higher}.
We provide a new cost function for this task and apply our structure theorem to obtain bounds that accurately predict the performance of Spectral Clustering on both synthetic and real-world data sets.

\subsection{Related work}
There is a wealth of literature on Spectral Clustering; in this section we discuss only the most relevant work on the subject and refer the reader to the classical surveys \citep{fortunatosurvey,vonlux} for additional background.

In the case of random graphs, and specifically stochastic block models, analyses of Spectral Clustering typically rely on the Davis-Kahan theorem \citep{davis1970rotation} or similar tools \citep{rohe11}. Classical perturbation arguments, however, are poorly suited to non-random graphs in which ``noise'' might be localised in relatively small regions of the graph~\citep{phdzanetti}.
For this reason, \citet{peng2015partitioning} introduced their structure theorem, which states that the bottom $k$ eigenvectors of the normalised Laplacian of an undirected graph are close to linear combinations of the indicator vectors of the ``optimal'' $k$ clusters whenever the ratio $k^2 \cdot \rho(k)/\lambda_{k+1}$ is small, where $\rho(k)$ is the $k$-way expansion constant \citep{lee2014multiway} and $\lambda_{k+1}$ is the $(k+1)$-th smallest eigenvalue of the Laplacian. A series of recent results \citep{kolev,mizutani}, culminating with \citet{macgregor2022tighter}, further simplified the proof of the structure theorem while improving the dependency on the number of clusters $k$.

All of these results, however, require $\rho(k)$ to be small: this is often not the case in real-world networks or in stochastic block models, even for a range of parameters where Spectral Clustering is known to work well.
Consider, for example, a graph consisting of two cliques of 50 vertices connected by a perfect matching in which each edge has weight $20$. Spectral Clustering perfectly partitions the graph by separating the two cliques. If we apply the structure theorem of \citet{macgregor2022tighter}, however, we obtain a bound of $0.28$ on the distance between the eigenvector of the Laplacian associated with $\lambda_2$ and the closest linear combination of indicator vectors of the clusters. This is because, even though there exists a large gap in the Laplacian spectrum, the value of $\rho(2)$ is relatively large ($\approx 0.28$). In contrast, our Corollary~\ref{cor:RemoveFirstEvec} will show that this distance is actually zero, correctly predicting the exact recovery of the clusters by Spectral Clustering.
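
To make the two-clique example concrete, the following calculation sketches its spectrum under two assumptions made here purely for illustration: the edges inside each clique have unit weight, and the Laplacian is the normalised one, $\mathcal{L} = I - D^{-1/2} A D^{-1/2}$. The symbols $J$, $\mathbf{1}$ and $v$ are introduced only for this sketch. Ordering the vertices so that matched vertices occupy the same position in the two cliques, the weighted adjacency matrix is
\[
A = \left(\begin{array}{cc} J - I & 20\, I \\ 20\, I & J - I \end{array}\right),
\]
where $J$ is the $50 \times 50$ all-ones matrix and $I$ is the $50 \times 50$ identity. Every vertex has degree $49 + 20 = 69$, so $\mathcal{L} = I - A/69$ (with $I$ now the $100 \times 100$ identity). Writing $\mathbf{1}$ for the all-ones vector of length $50$ and $v$ for any vector of length $50$ orthogonal to $\mathbf{1}$, the eigenvectors and eigenvalues of $\mathcal{L}$ are
\[
\begin{array}{lll}
(\mathbf{1}, \mathbf{1}): & \lambda = 1 - 69/69 = 0 & \mbox{(multiplicity $1$)}, \\
(\mathbf{1}, -\mathbf{1}): & \lambda = 1 - 29/69 = 40/69 \approx 0.58 & \mbox{(multiplicity $1$)}, \\
(v, v): & \lambda = 1 - 19/69 = 50/69 \approx 0.72 & \mbox{(multiplicity $49$)}, \\
(v, -v): & \lambda = 1 + 21/69 = 90/69 \approx 1.30 & \mbox{(multiplicity $49$)}.
\end{array}
\]
Under these assumptions, $\lambda_2 = 40/69$ is a simple eigenvalue whose eigenvector $(\mathbf{1}, -\mathbf{1})$ is exactly the difference of the indicator vectors of the two cliques, which is consistent with the distance being zero. Taking the conductance of a cluster to be its cut weight divided by its volume, each clique has conductance $(50 \cdot 20)/(50 \cdot 69) = 20/69 \approx 0.29$, in line with the value of $\rho(2)$ quoted above up to rounding.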

Furthermore, our analysis is not restricted to the traditional (normalised) graph Laplacian and undirected graphs, but can handle any Hermitian positive semidefinite representation of undirected or directed graphs. Hermitian representations of digraphs were recently investigated for clustering purposes by \citet{cucuringu2020hermitian}, who proposed their use to recover clusters characterised by a strong imbalance in the direction of the inter-cluster edges. They define a directed stochastic block model that captures this problem and show that Spectral Clustering is able to recover its communities. \citet{laenen2020higher} consider the specific instance of this clustering task in which the direction of most edges is required to follow a directed path between the clusters. They apply the techniques of \citet{peng2015partitioning} to obtain a structure theorem and an analysis of Spectral Clustering for suitably constructed Hermitian representations of digraphs. In our work, we revisit this task and argue that the results of Laenen and Sun do not capture the practical performance of Spectral Clustering. Indeed, we demonstrate simple examples where Spectral Clustering works very well, but Laenen and Sun's results are completely uninformative. Applying our improved structure theorem together with a new cost function allows us to prove bounds that correctly predict the practical performance of Spectral Clustering, vastly improving on Laenen and Sun's results.

\subsection{Organisation}
The paper is organised as follows: in Section~\ref{sec:background}, we cover the necessary background. In Section~\ref{sec:general}, we state our main result, which is a generalised and improved structure theorem. We show how our results allow us to obtain meaningful bounds for Spectral Clustering, even for graphs where the $k$-way expansion constant is relatively large. In Section~\ref{sec:digraphs}, we apply our generalised structure theorem to analyse Spectral Clustering on Hermitian Laplacians for digraphs. Proofs, together with additional experiments, are included in the Appendix. Code is available in a GitHub repository\footnote{\url{https://github.com/GeorgeRLTyler/Improved-and-Generalised-Analysis-for-Spectral-Clustering}}.