\section{Digraph clustering} \label{sec:digraphs}
We now apply our techniques to Hermitian representations of digraphs, where the adjacency matrix is specified as in Equation~\ref{eq:adjdig}. \citet{cucuringu2020hermitian} have shown experimentally that Spectral Clustering on these matrix representations is able to recover clusters characterised by large imbalances in the direction of inter-cluster edges, in the sense that most edges between two clusters $S_i$ and $S_j$ follow the same direction. This is a \emph{higher-order} clustering problem, since clusters are defined according to their inter-cluster relations~\cite{martin_comnet}. This is in contrast with traditional undirected graph clustering, in which  a cluster is usually defined only according to its inner and outer density.

\citet{cucuringu2020hermitian} have proposed an analysis of Spectral Clustering on a directed analogue of the classical stochastic block model, while \citet{laenen2020higher} have attempted to provide an analysis for more general graphs. In Section~\ref{sec:laenen}, however, we argue that Laenen and Sun's results fail to explain the practical performance of Spectral Clustering on digraphs.
Here we attempt to remedy this gap in the literature. We begin by defining a cost function for the task.

\begin{definition}
Let $\mathcal{G}=(V,E,w)$ be a digraph and let $k \ge 2$. Let $S_1,\dots,S_k$ be a $k$-way partition of $V$. We define the \emph{cyclic expansion} of $S_1,\dots,S_k$ as
\[
\Psi(S_1,\dots,S_k) = \frac{1}{\vol(V)} \sum_{(i,j) \notin C_k} w(S_i,S_j),
\]
where \(C_k = \{ (i,j) \mid j \equiv i+1 \pmod{k}, \ 1 \leq i,j \leq k \}\), and the \emph{cyclic $k$-way expansion} of $\mathcal{G}$ as
\[
\Psi_k(\mathcal{G}) = \min_{\Part\{S_i\}_{i=1}^k} \Psi(S_1,\dots,S_k).
\]
\end{definition}
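
As a small illustration of the definition (our own example, reading $w(S_i,S_j)$ as the total weight of the edges directed from $S_i$ to $S_j$), take $k=3$. Then $C_3 = \{(1,2),(2,3),(3,1)\}$ and
\[
\Psi(S_1,S_2,S_3) = \frac{1}{\text{vol}(V)} \Bigl( w(S_1,S_1) + w(S_2,S_2) + w(S_3,S_3) + w(S_1,S_3) + w(S_2,S_1) + w(S_3,S_2) \Bigr).
\]
In other words, edges inside a cluster and edges running against the cyclic order both contribute to the cost, while edges from $S_i$ to $S_{i+1 \bmod 3}$ contribute nothing.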

We want to find clusters $S_1,\dots,S_k$ such that most of the outgoing edges from $S_i$ point to vertices in $S_{i+1 \text{ mod } k}$. %, for $i=1,\dots,k$. 
When $k=2$, our problem simply becomes finding \emph{disassortative} clusters. Indeed, the Hermitian Laplacian for $k=2$ is just the signless Laplacian, whose bottom eigenvectors are known to contain information about disassortative clusters \citep{maxcut,shipingliu}.
The next lemma clarifies the connection between this cost function and Hermitian Laplacians for digraphs.
\begin{lem} \label{lem:EquivalenceLambda}
    Let $\mathcal{G}$ be a connected digraph. Then, $\lambda_1(\mathcal{L})=0$ if and only if $\Psi_k(\mathcal{G}) = 0$.
\end{lem}

As observed in \cite{lisunzanetti}, the bottom eigenvector of a Hermitian Laplacian already contains information about all $k$ clusters. For this reason, given a $k$-way partition $\mathcal{S} = \{S_1,\dots,S_k\}$, we define $\chi_{\mathcal{S}} \in \mathbb{C}^N$ as follows. For any $j =1,\dots,k$, 
\[
\chi_{\mathcal{S}}(u) = \sqrt{\frac{d(u)}{\vol(V)}} \cdot \mathrm{e}^{\frac{2\pi \mathrm{i} j}{k}} \text{ for } u \in S_j,
\]
where $\mathrm{i}$ is the imaginary unit. Fundamentally, $\chi_{\mathcal{S}}$ maps each cluster to a power of the $k$-th root of unity.
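
As a concrete illustration (our own example), for $k=4$ the clusters $S_1,\dots,S_4$ receive the phases
\[
\mathrm{e}^{\frac{2\pi \mathrm{i}}{4}} = \mathrm{i}, \qquad \mathrm{e}^{\frac{4\pi \mathrm{i}}{4}} = -1, \qquad \mathrm{e}^{\frac{6\pi \mathrm{i}}{4}} = -\mathrm{i}, \qquad \mathrm{e}^{\frac{8\pi \mathrm{i}}{4}} = 1,
\]
so that moving from $S_j$ to $S_{j+1 \bmod 4}$ always multiplies the phase by $\mathrm{i}$. Moreover, assuming the usual convention $\text{vol}(V) = \sum_{u \in V} d(u)$, the vector is normalised:
\[
\|\chi_{\mathcal{S}}\|^2 = \sum_{u \in V} \frac{d(u)}{\text{vol}(V)} = 1.
\]
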
The following simple but crucial lemma relates the Rayleigh quotient of $\chi_{\mathcal{S}}$ to $\Psi(S_1,\dots,S_k)$.


\begin{lem} \label{lem:PsiBounds}
Let $k \ge 2$. It holds that 
$16k^{-2} \Psi(S_1,\dots,S_k) \leq \chi_{\mathcal{S}}^* \mathcal{L}\chi_{\mathcal{S}} \le 4 \Psi(S_1,\dots,S_k).$
\end{lem}


We are now ready to state our structure theorem for digraphs.

\begin{thm}[Structure Theorem for Digraphs] \label{thm:digraph}
Let $\mathcal{G}$ be a digraph with Hermitian Laplacian $\mathcal{L}$. Assume $\lambda_2 > \lambda_1$. Given any $k$-way partition $\mathcal{S} = \{S_1,\dots,S_k\}$, there exists \(\alpha \in \mathbb{C}\) such that 
\begin{equation}
\label{eq:ours_ray}
\| f_1 - \alpha\chi_{\mathcal{S}}\|^2 \leq  \frac{\chi_{\mathcal{S}}^* \mathcal{L}\chi_{\mathcal{S}} - \lambda_1}{\lambda_2 - \lambda_1}.
\end{equation}
Furthermore, if $\mathcal{S}$ achieves $\Psi_k(\mathcal{G})$, then
\vspace{-0.1cm}
\begin{equation}
\label{eq:ours_psi}
\| f_1 - \alpha\chi_{\mathcal{S}}\|^2 \leq  \frac{4 \Psi_k(\mathcal{G}) - \lambda_1}{\lambda_2 - \lambda_1}.
\end{equation}
\end{thm} \vspace{-0.2cm}
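
The following is a sketch of one standard way to obtain bounds of this form; it is offered as an illustration under the stated assumptions and is not necessarily the argument used in the full proof. Let $f_1,\dots,f_N$ be orthonormal eigenvectors of $\mathcal{L}$ with eigenvalues $\lambda_1 \le \dots \le \lambda_N$, assume $\|\chi_{\mathcal{S}}\| = 1$, and write $c_i = f_i^* \chi_{\mathcal{S}}$, so that $\sum_{i} |c_i|^2 = 1$. Then
\[
\chi_{\mathcal{S}}^* \mathcal{L} \chi_{\mathcal{S}} = \sum_{i=1}^N \lambda_i |c_i|^2 \ge \lambda_1 |c_1|^2 + \lambda_2 \left(1 - |c_1|^2\right),
\qquad \text{hence} \qquad
1 - |c_1|^2 \le \frac{\chi_{\mathcal{S}}^* \mathcal{L}\chi_{\mathcal{S}} - \lambda_1}{\lambda_2 - \lambda_1}.
\]
Choosing $\alpha = \overline{c_1}$ gives $\| f_1 - \alpha \chi_{\mathcal{S}} \|^2 = 1 - |c_1|^2$, which is (\ref{eq:ours_ray}). If, in addition, $\mathcal{S}$ achieves $\Psi_k(\mathcal{G})$, then the upper bound of Lemma~\ref{lem:PsiBounds} yields
\[
\chi_{\mathcal{S}}^* \mathcal{L}\chi_{\mathcal{S}} \le 4\,\Psi(S_1,\dots,S_k) = 4\,\Psi_k(\mathcal{G}),
\]
and substituting this into (\ref{eq:ours_ray}), using $\lambda_2 - \lambda_1 > 0$, gives (\ref{eq:ours_psi}).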

We observe that 
$\displaystyle
\chi_{\mathcal{S}}^* \mathcal{L}\chi_{\mathcal{S}} = \frac{4}{\text{vol}(V)} \sum_{i=1}^k \sum_{j=1}^k w(S_i,S_j) \sin^2 \left( \frac{(j-(i+1)) \pi}{k} \right).$
Therefore, inequality (\ref{eq:ours_ray}) is generally stronger than (\ref{eq:ours_psi}): rather than charging every edge outside $C_k$ the same cost, it assigns a smaller penalty to edges from $S_i$ to $S_j$ when $j-(i+1)$ is close to a multiple of $k$.
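
The constants in Lemma~\ref{lem:PsiBounds} can be recovered from this identity by the following short calculation, given here as a sketch. It assumes, as the lemma requires, that the sine factor vanishes exactly for the pairs $(i,j) \in C_k$, so that for every other pair its argument is $m\pi/k$ for some integer $m$ with $1 \le m \le k-1$ (modulo $k$). Bounding $\sin^2 \le 1$ on these pairs gives the upper bound $\chi_{\mathcal{S}}^* \mathcal{L}\chi_{\mathcal{S}} \le 4\Psi(S_1,\dots,S_k)$. For the lower bound, we use $\sin x \ge 2x/\pi$ for $0 \le x \le \pi/2$ to get
\[
\sin^2\left(\frac{m\pi}{k}\right) \ge \sin^2\left(\frac{\pi}{k}\right) \ge \frac{4}{k^2}
\qquad \text{for } 1 \le m \le k-1,
\]
and therefore
\[
\chi_{\mathcal{S}}^* \mathcal{L}\chi_{\mathcal{S}} \ge \frac{4}{\text{vol}(V)} \cdot \frac{4}{k^2} \sum_{(i,j) \notin C_k} w(S_i,S_j) = 16 k^{-2}\, \Psi(S_1,\dots,S_k).
\]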

\subsection{Comparison with previous work}
\label{sec:laenen}
We now compare our structure theorem for digraphs to the one of Laenen and Sun \cite{laenen2020higher}. They consider the following  alternative to our cyclic expansion:
\[
\theta_k(\mathcal{G}) \triangleq \max_{\{S_i\}_{i=1}^k \text{ partition}}\sum_{i=1}^{k-1} \frac{w(S_i,S_{i+1})}{\text{vol}(S_i) + \text{vol}(S_{i+1})}.
\]
There are two differences compared with our definition of cyclic expansion. First, $\theta_k$ penalises edges from $S_k$ to $S_1$, i.e.\ it tries to fit a path rather than a directed cycle between clusters. While reasonable, we argue that this is not what Spectral Clustering on Hermitian Laplacians is actually doing. Secondly, and more importantly, the two cost functions are normalised differently: cyclic expansion is normalised by the volume of the whole graph, whereas each term of $\theta_k$ is normalised by the volume of the two clusters involved. We believe the normalisation chosen by \cite{laenen2020higher} is poorly suited to analysing Spectral Clustering. 

To motivate our assertions, let us delve deeper into their results. We first remark that they choose a Hermitian Laplacian constructed with the $\lceil 2 \pi k \rceil$-th root of unity. They also construct a different ``indicator'' vector $\tilde{\chi}_{\mathcal{S}}$ for a $k$-way partition $\mathcal{S}$. This choice is not overly important, so we refer to their paper for further details. Their main result is as follows.

\begin{thm}[\cite{laenen2020higher}] \label{thm:laenenDigraphST}
    Let $f_1$ be the bottom eigenvector of the Hermitian Laplacian constructed with the \(\lceil 2 \pi k\rceil\)-th root of unity. Let $\eta_k(\mathcal{G}) \triangleq \frac{\lambda_2}{1 - (4/k)\theta_k(\mathcal{G})}$ and assume $\eta_k(\mathcal{G}) > 1$. Let $\mathcal{S}$ be a partition maximising $\theta_k(\mathcal{G})$.
    There exists \(\beta \in \mathbb{C}\) such that 
$
\|f_1 - \beta \tilde{\chi}_{\mathcal{S}} \|^2 \leq (\eta_k(\mathcal{G}) - 1)^{-1}.
$
\end{thm}

We now compare Theorem~\ref{thm:digraph} with Theorem~\ref{thm:laenenDigraphST} on the two very simple digraphs of Figure~\ref{fig:cyclepath},

\begin{figure}[h!]
    \centering

    % Left minipage: TikZ subfigures
    \begin{minipage}{0.36\textwidth}
        \centering
        \vspace{0.1cm}
        \begin{adjustbox}{valign=t}  % bottom-align this block
            
            \begin{subfigure}[b]{0.48\textwidth}
                %\centering 
                \hspace*{-0.5cm}{
                % Cyclic cluster TikZ code
                \begin{tikzpicture}[scale=0.3,
                    node/.style={circle, draw, fill=blue!30, minimum size=0.1cm, inner sep=0pt},
                    edge/.style={->, thin, line width=0.4pt, opacity=0.5}]
                    % cluster colors
                    \definecolor{cluster1color}{RGB}{240,128,128}
                    \definecolor{cluster2color}{RGB}{144,238,144}
                    \definecolor{cluster3color}{RGB}{135,206,250}
                    \definecolor{cluster4color}{RGB}{221,160,221}
                    \definecolor{cluster5color}{RGB}{255,182,193}
                    % coordinates
                    \foreach \i in {1,...,5} { \coordinate (C\i) at ({72*(\i-1)}:4cm); }
                    % nodes
                    \foreach \i/\color in {1/cluster1color,2/cluster2color,3/cluster3color,4/cluster4color,5/cluster5color} {
                        \node[minimum size=2cm] (cluster\i) at (C\i) {};
                        \foreach \j in {1,...,5} {
                            \node[node, fill=\color] (v\i\j) at ($(cluster\i) + ({360/5*(\j-1)}:0.8cm)$) {};
                        }
                    }
                    % edges cyclic
                    \foreach \i in {1,...,4} {
                        \foreach \j in {1,...,5} {
                            \foreach \k in {1,...,5} {
                                \draw[edge] (v\i\j) -- (v\the\numexpr\i+1\relax\k);
                            }
                        }
                    }
                    \foreach \j in {1,...,5} {
                        \foreach \k in {1,...,5} {
                            \draw[edge] (v5\j) -- (v1\k);
                        }
                    }
                \end{tikzpicture}}
                \caption{Cycle.}
                \label{fig:Perfect5Cycle}
            \end{subfigure}%
            \hfill
            \begin{subfigure}[b]{0.48\textwidth}
                \centering
                % Path cluster TikZ code
                \begin{tikzpicture}[scale=0.3,
                    node/.style={circle, draw, fill=blue!30, minimum size=0.1cm, inner sep=0pt},
                    edge/.style={->, thin, line width=0.4pt, opacity=0.5}]
                    \definecolor{cluster1color}{RGB}{240,128,128}
                    \definecolor{cluster2color}{RGB}{144,238,144}
                    \definecolor{cluster3color}{RGB}{135,206,250}
                    \definecolor{cluster4color}{RGB}{221,160,221}
                    \definecolor{cluster5color}{RGB}{255,182,193}
                    % coordinates
                    \coordinate (C1) at (0,0);  
                    \coordinate (C2) at (5,-2); 
                    \coordinate (C3) at (0,-4); 
                    \coordinate (C4) at (5,-6); 
                    \coordinate (C5) at (0,-8);
                    \foreach \i/\color in {1/cluster1color,2/cluster2color,3/cluster3color,4/cluster4color,5/cluster5color} {
                        \node[minimum size=2cm] (cluster\i) at (C\i) {};
                        \foreach \j in {1,...,5} {
                            \node[node, fill=\color] (v\i\j) at ($(cluster\i) + ({360/5*(\j-1)}:0.8cm)$) {};
                        }
                    }
                    \foreach \i in {1,...,4} {
                        \foreach \j in {1,...,5} {
                            \foreach \k in {1,...,5} {
                                \draw[edge] (v\i\j) -- (v\the\numexpr\i+1\relax\k);
                            }
                        }
                    }
                \end{tikzpicture}
                \caption{Path.}
                \label{fig:DirectedPath5}
            \end{subfigure}
            \end{adjustbox}
            \caption{Examples of directed cluster structures.}
        \label{fig:cyclepath}
        
    \end{minipage}%
    \hspace{0.3cm}%
    % Right minipage: PNG image
    \begin{minipage}{0.6\textwidth}
        \centering
         % bottom-align with left minipage
            \includegraphics[width=0.8\textwidth]{Figures/4_cluster_path_noise.png}
        \caption{Comparison of the bounds given by Theorem~\ref{thm:digraph} (green for (\ref{eq:ours_ray}) and orange for (\ref{eq:ours_psi})) and by \cite{laenen2020higher} (red) for a DSBM with a path cluster structure at varying levels of noise. The actual values are reported in blue. Results are averaged over 10 realisations.}% The standard deviation is included as filled error bars (although they are not visible as the standard deviation is very small).}
        \label{fig:5PathImage}
    \end{minipage}

\end{figure}




in which five clusters are arranged perfectly on a directed cycle (Figure~\ref{fig:Perfect5Cycle}) and on a directed path (Figure~\ref{fig:DirectedPath5}). Clearly, in both cases, $\Psi_k = 0$. Applying our structure theorem, this implies that, in both cases, the first eigenvector is exactly a multiple of the indicator vector $\chi_{\mathcal{S}}$. Therefore, the clusters are embedded in $k$ distinct and well-separated points, which are just the $k$ powers of the $k$-th root of unity rotated by some $\alpha \in \mathbb{C}$. Our structure theorem thus correctly predicts that Spectral Clustering will recover the clusters perfectly.

If we apply Theorem~\ref{thm:laenenDigraphST} instead, we obtain, for the digraph of Figure~\ref{fig:Perfect5Cycle}, $\|f_1 - \beta \tilde{\chi}_{\mathcal{S}} \|^2 \le (\eta_k(\mathcal{G}) - 1)^{-1} \approx 0.642$, which is not informative at all. For the digraph of Figure~\ref{fig:DirectedPath5}, we obtain a somewhat better bound: $\|f_1 - \beta \tilde{\chi}_{\mathcal{S}} \|^2 \le (\eta_k(\mathcal{G}) - 1)^{-1} \approx 0.294$. This is still far from the true value, which is equal to zero.
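
To illustrate where this slack comes from, the following back-of-the-envelope calculation evaluates the objective of $\theta_k$ at the planted partition of the two toy digraphs. It is a sketch under assumptions that are ours: unit edge weights, $d(u)$ equal to in-degree plus out-degree, $\text{vol}(S) = \sum_{u \in S} d(u)$, and the planted partition being the maximiser. In the cycle, every vertex has degree $10$, so $\text{vol}(S_i) = 50$ and $w(S_i,S_{i+1}) = 25$, which gives
\[
\sum_{i=1}^{4} \frac{25}{50+50} = 1, \qquad 1 - \frac{4}{5} \cdot 1 = \frac{1}{5}.
\]
In the path, $\text{vol}(S_1) = \text{vol}(S_5) = 25$ and $\text{vol}(S_2) = \text{vol}(S_3) = \text{vol}(S_4) = 50$, which gives
\[
\frac{25}{75} + \frac{25}{100} + \frac{25}{100} + \frac{25}{75} = \frac{7}{6}, \qquad 1 - \frac{4}{5} \cdot \frac{7}{6} = \frac{1}{15}.
\]
Writing $c = 1 - (4/k)\theta_k(\mathcal{G})$, the bound of Theorem~\ref{thm:laenenDigraphST} equals $(\eta_k(\mathcal{G}) - 1)^{-1} = c/(\lambda_2 - c)$. Under these assumptions $c > 0$ in both examples, so the bound stays strictly positive even though the cluster structure is perfect.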

\subsection{Experimental results}



\paragraph{Directed stochastic block models} %\label{subsec:DSBMs}
We now apply our results to the Directed Stochastic Block Model of \cite{cucuringu2020hermitian}.
Given parameters $k\ge 2,n \ge 1, P \in [0,1]^{k \times k}, F \in [0,1]^{k \times k}$, a directed stochastic block model \( \mathcal{G} \sim \text{DSBM}(k, n, P, F) \) is a random graph on $N=kn$ vertices constructed as follows: each vertex belongs to one of $k$ communities $S_1,\dots,S_k$ of $n$ vertices each. We place an edge independently at random between any two vertices $u \in S_i,v \in S_j$ with probability $P_{ij} = P_{ji}$. Furthermore, we orient the edge between $u$ and $v$ from $u$ to $v$ with probability $F_{ij}$, and from $v$ to $u$ with probability $F_{ji} = 1 - F_{ij}$.

We consider a model $\mathcal{G} \sim \text{DSBM}(k, n, P, F)$ for $k=4$ and $n=100$ with $P,F$  specified as follows. \vspace{-0.2cm}

\small
\[F = \begin{pmatrix}
    &\cellcolor{yellow!30}.5 & \cellcolor{yellow!60}1 & \cellcolor{yellow!30}.5 & \cellcolor{yellow!30}.5 &\\
    &\cellcolor{yellow!0}0 & \cellcolor{yellow!30}.5 & \cellcolor{yellow!60}1 & \cellcolor{yellow!30}.5 &\\
    &\cellcolor{yellow!30}.5 & \cellcolor{yellow!0}0 & \cellcolor{yellow!30}.5 & \cellcolor{yellow!60}1  &\\
    &\cellcolor{yellow!30}.5 & \cellcolor{yellow!30}.5 & \cellcolor{yellow!0}0 & \cellcolor{yellow!30}.5 &\\
\end{pmatrix}, \
P = \begin{pmatrix}
    &\epsilon & \cellcolor{yellow!60}1 & \epsilon & \epsilon &\\
    &\cellcolor{yellow!60}1 & \epsilon & \cellcolor{yellow!60}1 & \epsilon &\\
    &\epsilon & \cellcolor{yellow!60}1 & \epsilon & \cellcolor{yellow!60}1 &\\
    &\epsilon & \epsilon & \cellcolor{yellow!60}1 & \epsilon & \\
\end{pmatrix}\]
\normalsize

The matrix $F$ encodes a path structure, while this choice of $P$ makes the path structure very pronounced. The parameter \(\epsilon\) controls the noise: the smaller $\epsilon$ is, the closer the graph is to having a perfect path cluster structure.
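
To make the role of $\epsilon$ concrete, here is a short calculation of expected edge counts (our own illustration, assuming unit edge weights). For $i \neq j$ there are $n^2$ vertex pairs between $S_i$ and $S_j$, each joined with probability $P_{ij}$ and oriented from $S_i$ to $S_j$ with probability $F_{ij}$, so
\[
\mathbb{E}\left[ w(S_i,S_j) \right] = n^2 P_{ij} F_{ij}.
\]
With $n = 100$ and the matrices above, this gives $\mathbb{E}[w(S_i,S_{i+1})] = 10^4$ and $\mathbb{E}[w(S_{i+1},S_i)] = 0$ for $i=1,2,3$, while for every pair of non-consecutive clusters the expected number of edges in each direction is $10^4 \cdot \epsilon \cdot 0.5 = 5000\,\epsilon$.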



In Figure~\ref{fig:5PathImage}, we compare the bounds of Theorem~\ref{thm:digraph} with the results of \cite{laenen2020higher}, and with the true distance between the bottom eigenvector of the Hermitian Laplacian and the indicator vector of the clusters. Our bounds track the true values closely and correctly imply that Spectral Clustering will work almost perfectly for all noise levels considered. In contrast, Laenen and Sun's result becomes uninformative even for small values of the noise parameter. We provide a similar experiment for a DSBM with a cyclic cluster structure in the Appendix.




\begin{table}[h]
    \centering
    \captionof{table}{Comparison of the bounds on $\|f_1 -\alpha \chi_{\mathcal{S}}\|^2$ given by inequality (\ref{eq:ours_ray}) of Theorem~\ref{thm:digraph} (Ours) and by \citet{laenen2020higher} (LS) on real-world networks.}\label{tab:digraphs}
    \begin{tabular}{lcccccc}
        \toprule
        Network & $k$ & $N$ & $M$ & $\Psi$ & Ours & LS \\
        \midrule
        Yellowstone & 4 & 15 & 37 & 0.027 & 0.086 & 0.662 \\
        COVID-19    & 4 & 67 & 66 & 0.000 & 0.000 & N/D   \\
        St. Marks   & 5 & 49 & 226 & 0.154 & 0.324 & N/D   \\
        St. Martin  & 4 & 45 & 224 & 0.118 & 0.352 & N/D   \\
        Ythan       & 4 & 135 & 601 & 0.101 & 0.399 & N/D   \\
        \bottomrule
    \end{tabular}

\end{table}


\paragraph{Real-world directed networks} We apply our results to real-world directed networks and summarise our findings in Table~\ref{tab:digraphs}: for each network considered, we compare our bounds from Theorem~\ref{thm:digraph} with Laenen and Sun's Theorem~\ref{thm:laenenDigraphST}.

We first consider the directed graph analysed in \cite{laenen2020higher} made from the Data Science for COVID-19 Dataset \citep{kcdc2020}, where an edge from \(u\) to \(v\) exists if \(u\) has infected \(v\). This graph has many disconnected components, so we consider its largest weakly connected component. Spectral Clustering finds clusters fitting perfectly to a directed $4$-cycle; therefore, \(\Psi_4 = 0\) and our bound in Theorem~\ref{thm:digraph} gives $\|f_1 - \alpha \chi_{\mathcal{S}} \|^2 = 0$. On the other hand, since $\eta_4 < 1$, Laenen and Sun's result is not applicable. Both this and the previous data set are characterised by unbalanced clusters, which results in a small $\theta_4$ and makes Laenen and Sun's results uninformative. 

%The Yellowstone food web \citep{yellowstone_foodweb} is a small digraph that adheres to a very clear path structure with only one vertex's edges not consistent with this structure. We delve into more detail on the performance of the bounds on this graph in the appendix.  ,

The Yellowstone data set \citep{yellowstone_foodweb} is a small digraph representing a food web for Yellowstone National Park, where vertices represent animal species and there is an edge from $u$ to $v$ if $u$ is preyed upon by $v$. As shown in Figure~\ref{fig:YellowstoneTrophicCascade}, this network can be partitioned into four clusters exhibiting an almost perfect directed path structure: only the outgoing edges for \emph{Mule deer} are not consistent with this structure. Indeed, $\Psi_4 = 0.027$ and Spectral Clustering using the Hermitian Laplacian perfectly recovers these clusters. Theorem~\ref{thm:digraph} yields an upper bound of $0.086$ on the error, which is of the same order as the actual value \(\|f_1 - \alpha \chi_{\mathcal{S}} \|^2 = 0.039\). On the other hand, Laenen and Sun's result can only provide an upper bound of $0.662$, which is not indicative of the actual performance of Spectral Clustering.

The St. Marks Seagrass, St. Martin Island, and Ythan Estuary data sets \citep{cosin_foodwebs_dataset} are networks representing other food webs. % of, respectively, $15$, $49$, $45$,  and $135$ vertices. 
While these data sets do not present a cluster structure as obvious as that of the COVID-19 or Yellowstone data sets, our results imply that there exists a nontrivial correlation between the indicator vector of the clusters and the bottom eigenvector of the Hermitian Laplacian. Notice that in all three cases \(\eta_k < 1\), so Laenen and Sun's bound does not apply.

\begin{figure}[h]
    \centering
    \includegraphics[width=0.7\textwidth]{Figures/YellowstoneTrophicCascade2.png}
    \caption{A directed graph illustrating a food web for Yellowstone National Park, United States.}
    \label{fig:YellowstoneTrophicCascade}
\end{figure}

