\vspace{-0.3cm}
\section{Experiments}\label{sec:floats}
This section presents experiments validating our proposed LAGC algorithm, beginning with experimental settings, followed by a comparative analysis against key baselines, and concluding with a concise demonstration of LAGC's advantages.
\vspace{-0.4cm}
\subsection{Experimental Setup}
\noindent{\textbf{Datasets:}} We perform experiments on the benchmark datasets summarized in Table~\ref{dataset description}.

\begin{table}
 \centering
	\begin{tabular}{lllll}
	\toprule
\textbf{Dataset}&\textbf{Nodes}&\textbf{Edges}&\textbf{Features}&\textbf{Classes} \\
		\midrule
		CORA&2,708&5,429 &1,433&7 \\
 CITESEER&3,327&9,104&3,703&6\\
 DBLP&17,716&52,867&1,639&4 \\
		CO-CS &18,333&163,788&6,805&15 \\
		PUBMED&19,717&44,338&500&3 \\
		CO-PHYSICS&34,493&247,962&8,415&5 \\
		\bottomrule
	\end{tabular}
 \caption{Overview of the datasets employed for node classification.}
 \label{dataset description}
\end{table}
\noindent{\textbf{Baseline Techniques:}} 
We validate our algorithm through extensive experiments on real datasets, benchmarking against state-of-the-art methods: GCOND \citep{jin2021graph}, SCAL \citep{huang2021scaling}, and FGC \citep{pmlr-v202-kumar23a}. We chose these baselines because they are recent and are among the best-performing graph coarsening approaches.

Next, we evaluate our algorithm in terms of node classification accuracy and the time $\tau$ taken for coarsening and classification. On real datasets, LAGC achieves the highest node classification accuracy among the compared coarsening methods, is substantially faster than GCOND and SCAL, and has a run time comparable to that of FGC.


% We have verified our model on real world dataset. Detailed description of the data is given in the following table \ref{dataset description} . In this section we provide the experimental settings, quantitative results to demonstrating the effectiveness of the algorithm. We also evaluated the proposed algorithm on node classification tasks with various GNN architectures, such as GCN \citep{kipf2016semi}, GAT \citep{velivckovic2017graph}, APPNP \citep{gasteiger2018predict} to show the generalizability of the proposed algorithm.
\vspace{-0.4cm}
\subsection{Node Classification}
\vspace{-0.2cm}
For node classification on a coarsened graph, we apply the proposed LAGC algorithm to the original graph $\mathcal{G}(\Theta, X, Y)$, using $80\%$ of the node labels in a semi-supervised manner. After obtaining the coarsened graph $\mathcal{G}_c(\Theta_c, \tilde{X})$, we infer the supernode labels as $\tilde{Y} = \arg\max(PY)$, where $P = C^{\dagger}$ denotes the pseudo-inverse of the mapping matrix $C$ and the $\arg\max$ is taken row-wise over the classes. A Graph Convolutional Network (GCN) is then trained on $\mathcal{G}_c(\Theta_c, \tilde{X}, \tilde{Y})$ and tested on the remaining $20\%$ of nodes, whose labels were not used during coarsening. For reference, we also train a GCN directly on the original graph with the identical split, using $80\%$ of the node labels for training and the remaining $20\%$ for testing.
The coarsened-graph pipeline is summarized in Algorithm~\ref{Algorithm2}.
% \begin{itemize}
%     \item \textbf{Coarsening Phase:} Employ the FGC algorithm with a sparsity regularizer to learn a coarsened graph $\mathcal{G}_c=(\tilde{V},\tilde{E},\tilde{X}, \tilde{Y})$. incorporating super-node labels $\tilde{Y}$. We determine super-node labels by selecting by $\tilde{Y} = \text{argmax}(CY) $ \citep{huang2021scaling}, where $C$ denotes the coarsening matrix.
%     \item \textbf{GNN Training:} Train a Graph Neural Network (GNN) using the data from the coarsened graph $\mathcal{G}_c$
%     \item \textbf{Assignment of Node Labels:} Employ the trained GNN model to assign labels to individual nodes within the original graph.
% \end{itemize}
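\par\noindent{\textbf{Worked example:}} As an illustration of the label rule $\tilde{Y} = \arg\max(PY)$, consider a toy graph (hypothetical, not one of the benchmark datasets) with $p=4$ nodes, $k=2$ supernodes, and two classes. Assume, purely for illustration, a hard assignment in which nodes $1$ to $3$ form the first supernode and node $4$ forms the second, so that each row of $P$ averages over the members of one supernode, and assume that $Y$ is one-hot:
\[
P = \left(\begin{array}{cccc} \frac{1}{3} & \frac{1}{3} & \frac{1}{3} & 0 \\ 0 & 0 & 0 & 1 \end{array}\right), \quad
Y = \left(\begin{array}{cc} 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{array}\right), \quad
PY = \left(\begin{array}{cc} \frac{2}{3} & \frac{1}{3} \\ 0 & 1 \end{array}\right).
\]
Taking the row-wise $\arg\max$ assigns class $1$ to the first supernode and class $2$ to the second; under these assumptions, each supernode inherits the majority label of its members.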


% We applied these identical settings to all other coarsening algorithms while finding the node classification accuracy. Subsequently, we assessed the accuracy of our node classification by testing it on the original graph $\mathcal{G}=(V, E, X, Y)$, using the actual labels to compute accuracy (ACC). All results were derived through a 10-fold cross-validation process.





\begin{table*}[ht!]
\centering
\begin{tabular}{llllllll}
\toprule
\textbf{Dataset} & $\boldsymbol{r=k/p}$ & \textbf{GCOND} & \textbf{SCAL} & \textbf{FGC} & \textbf{LAGC} &  \textbf{Whole Data}\\
\midrule
 & 0.3 & 81.56 $\pm$ 0.62 & 79.42 $\pm$ 1.71 & 84.03 $\pm$ 0.08 & \textbf{87.62 $\pm$ 0.01} \\
  
 CORA & 0.1 & 81.37 $\pm$ 0.40 & 71.38 $\pm$ 3.62 & 79.96 $\pm$ 0.18 & \textbf{86.10 $\pm$ 0.03} & 89.50 $\pm$ 1.20 \\
 
 & 0.05 & 78.93 $\pm$ 0.44 & 55.32 $\pm$ 7.03 & 77.31 $\pm$ 0.65 & \textbf{82.85 $\pm$ 0.02} \\
\hline

 & 0.3 & 72.43 $\pm$ 0.49 & 68.87 $\pm$ 1.37 & 72.85 $\pm$ 0.10 & \textbf{78.51 $\pm$ 1.25}  \\
 
 CITESEER & 0.1 & 70.46 $\pm$ 0.49 & 71.38 $\pm$ 3.62 & 69.46 $\pm$ 0.22 & \textbf{76.00 $\pm$ 0.50}& 78.09 $\pm$ 1.95 \\
 
  & 0.05 & 64.03 $\pm$ 2.40 & 55.32 $\pm$ 7.03 & 69.02 $\pm$ 0.24  & \textbf{75.70 $\pm$ 0.31} \\
 \hline

 & 0.05 & 93.05 $\pm$ 0.26 & 73.09 $\pm$ 7.41 & 93.31 $\pm$ 0.11 &   \textbf{94.46 $\pm$ 0.58}  \\
 
 CO-PHYSICS & 0.03 & 92.81 $\pm$ 0.31 & 63.65 $\pm$ 9.65 & 92.00 $\pm$ 1.78 & \textbf{94.28 $\pm$ 0.21}& 96.22 $\pm$ 0.74 \\ 
  & 0.01 & 92.81 $\pm$ 0.31 & 63.65 $\pm$ 9.65 & 91.08 $\pm$ 0.78  &    \textbf{93.26 $\pm$  0.89} \\ 
 \hline
 % & 0.3 & 77.77 $\pm$ 0.63 & 75.67 $\pm$ 2.57 & 82.95 $\pm$ 8.69 & \textbf{84.17 $\pm $ 0.56}\\
 & 0.05 & 78.16 $\pm$ 0.30 & 72.82 $\pm$ 2.62 & 78.14 $\pm$ 0.29 &    \textbf{82.85 $\pm$ 0.32}\\
 PubMed & 0.03 & 78.04 $\pm$ 0.47 & 70.24 $\pm$ 2.63 & 77.60 $\pm$ 0.16&          \textbf{82.10 $\pm$ 0.21}& 88.89 $\pm$ 0.57 \\
  & 0.01 & 77.20 $\pm$ 0.02 & 50.49 $\pm$ 10.5 & 76.10 $\pm$ 1.91 & \textbf{81.27 $\pm$ 0.91} \\
 \hline
 & 0.05 & 86.29 $\pm$ 0.63 & 34.45 $\pm$ 10.0 & 89.12 $\pm$ 0.08 &   \textbf{91.36 $\pm$ 0.48}   \\
  CO-CS & 0.03 & 86.32 $\pm$ 0.45 & 26.06 $\pm$ 9.29 & 86.32 $\pm$ 0.43 & \textbf{90.32 $\pm$ 0.97}& 93.32 $\pm$ 0.62\\
& 0.01 & 84.01 $\pm$ 0.02 & 14.42 $\pm$ 8.51 & 85.41 $\pm$ 0.24  & \textbf{88.27 $\pm$ 0.34} \\ 
 \hline
 & 0.05 & 79.15 $\pm$ 0.20 & 76.52 $\pm$ 2.88 & 80.08 $\pm$ 0.01 & \textbf{81.64 $\pm$ 0.42} \\
 DBLP & 0.03 & 78.42 $\pm$ 1.26 & 75.49 $\pm$ 2.84 & 79.92 $\pm$ 0.48 & \textbf{80.93 $\pm$ 0.12} & 85.35 $\pm$ 0.86 \\
 & 0.01 & 74.29 $\pm$ 0.57 & 72.01 $\pm$ 1.83 & 77.47 $\pm$ 0.33 & \textbf{79.49 $\pm$ 0.53} \\
\hline
 % \multirow{2}{*}{DBLP} 
 % & 0.05 & 79.15 $\pm$ 0.20 & 76.52 $\pm$ 2.88 & 78.09 $\pm$ 1.88 &              \textbf{79.20 $\pm$ 0.07} & 85.35 $\pm$ 0.86 \\
 % & 0.03 & 78.42 $\pm$ 1.26 & 75.49 $\pm$ 2.84 &74.81 $\pm$ 1.57 & \textbf{78.99 $\pm$ 0.71}\\ 
\bottomrule
\end{tabular}
\caption{Node classification accuracy (\%) on real benchmark datasets for the proposed LAGC algorithm compared with GCOND \citep{jin2021graph}, SCAL \citep{huang2021scaling}, and FGC \citep{kumar2023unified}. Coarsening ratios $r \in \{0.3, 0.1, 0.05\}$ are used for the small datasets and $r \in \{0.05, 0.03, 0.01\}$ for the large datasets. LAGC achieves the highest accuracy among the coarsening methods for every dataset and ratio. Notably, on Citeseer at $r=0.3$, training on the LAGC-coarsened graph yields a slightly higher mean accuracy ($78.51$) than training on the original graph ($78.09$), although the difference lies within the reported standard deviations.}
\label{nodeclassificationtable0.1}
\end{table*}




\begin{algorithm}
\caption{\textsf{Node Classification using proposed LAGC}}
\label{Algorithm2}
\SetAlgoLined
\SetAlCapFnt{\footnotesize}
\SetAlCapNameFnt{\footnotesize}

\KwIn{$\mathcal{G}(\Theta, X, Y)$}
\KwOut{Trained weight matrix $W^*$}

Apply LAGC on $\mathcal{G}$ to learn $P = C^{\dagger}$\;

Compute the feature matrix of the coarsened graph: $\tilde{X} = PX$\;

Compute the labels of the coarsened graph: $\tilde{Y} = \arg\max(PY)$\;

Learn $W^* = \arg\min_{W} \ell(\mathrm{GCN}_{\mathcal{G}_c}(W), \tilde{Y})$\;
\end{algorithm}
\vspace{-0.4cm}
\subsection{Link Prediction}
\vspace{-0.2cm}
We further evaluate the proposed LAGC algorithm on link prediction, where the task is to predict whether an edge exists between two nodes. We use three citation networks, Cora, Citeseer, and PubMed, and follow the approach of SEAL \citep{10.5555/3327345.3327423}. We split the original graph into training and testing sets such that both retain the same number of nodes: the training graph comprises $80\%$ of the original edges, and the remaining $20\%$ are used for testing.
We use the training graph to learn the coarsened graph and train the graph neural network on this representation.
The trained model is then evaluated on the testing graph, with performance measured by the area under the ROC curve (AUC). We compare against the state-of-the-art FGC algorithm \citep{pmlr-v202-kumar23a} and against the baseline that uses the entire graph for prediction. GCOND \citep{jin2021graph} is excluded because it is a deep learning framework designed specifically for node classification and is not suitable for link prediction.
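\par\noindent For reference, the AUC admits a simple pairwise form. The expression below is the standard rank-based definition, stated only as an illustration; the score function $s(\cdot)$ and the symbols $\mathbf{P}_{\mathrm{te}}$ and $\mathbf{N}_{\mathrm{te}}$ (the held-out positive and negative examples) are notation introduced here and are not part of the algorithm description:
\[
\mathrm{AUC} = \frac{1}{|\mathbf{P}_{\mathrm{te}}|\,|\mathbf{N}_{\mathrm{te}}|}
\sum_{e^{+} \in \mathbf{P}_{\mathrm{te}}} \sum_{e^{-} \in \mathbf{N}_{\mathrm{te}}}
\left( \mathbf{1}\!\left[s(e^{+}) > s(e^{-})\right] + \frac{1}{2}\,\mathbf{1}\!\left[s(e^{+}) = s(e^{-})\right] \right).
\]
Under this definition, a value of $0.5$ corresponds to ranking positive and negative pairs at random, and a value of $1$ corresponds to scoring every true edge above every sampled non-edge.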

\begin{algorithm}
  \caption{Link Prediction using proposed LAGC}
  \KwIn{$\mathcal{G}(\Theta, X, Y)$}
  \KwOut{Trained GNN model $\mathcal{G}_{\theta}$}
  Randomly initialize the model parameters $W^*$\;
  Treat the existing edges as positive examples $\mathbf{P}$\;
  Randomly sample a set of node pairs with no edge to serve as negative examples $\mathbf{N}$\;
  Divide $\mathbf{P}$ and $\mathbf{N}$ into training and test sets\;
  
  Apply LAGC on the training set to learn $P = C^{\dagger}$\;
  Update $W^*$ by minimizing the binary cross-entropy loss $\ell(\mathrm{GNN}_{\mathcal{G}_{c}}(W^*), Y'_{u \sim v})$\;
  % Train the Link prediction model using $\mathcal{G_{C}}$ \;
  

\end{algorithm}



\begin{table}
    \centering
    \begin{tabular}{lcccr}
    \toprule
\textbf{Dataset}&$\boldsymbol{r=k/p}$ & \textbf{LAGC}&\textbf{FGC}& \textbf{Whole Data}\\
\midrule
         & 0.3 & 0.78  & 0.77 & \\
         Cora & 0.1 & 0.77 & 0.75 & 0.84\\
         & 0.05 & 0.75 & 0.72 & \\
         \hline
         &0.3 & 0.75 & 0.73 & \\
         CITESEER & 0.1 &  0.74 & 0.70 & 0.78\\
         & 0.05 & 0.72 & 0.68 & \\
         \hline
         & 0.05 & 0.77 & 0.67 & \\
         PubMed& 0.03 & 0.72 &  0.70 & 0.83\\
         &  0.01 & 0.68 & 0.66 & \\
\bottomrule
    \end{tabular}
    \caption{Area under the ROC curve (AUC) for link prediction using the proposed LAGC algorithm and the state-of-the-art FGC algorithm \citep{kumar2023unified}. Performance is reported at coarsening ratios \(r=0.3\), \(0.1\), and \(0.05\) for the small datasets and \(r=0.05\), \(0.03\), and \(0.01\) for the large dataset, together with a baseline that uses the entire graph. LAGC attains a higher AUC than FGC at every coarsening ratio.
    }
    \label{tab:my_label}
\end{table}
\subsection{Generalizability of the Proposed LAGC Algorithm}
To demonstrate that the coarsened graph learned by the proposed algorithm generalizes across models, we train several GNN architectures on it for node classification: GCN \citep{kipf2016semi}, APPNP \citep{gasteiger2018predict}, and GAT \citep{velivckovic2017graph}. Table~\ref{differentGNN} shows that the learned coarsened graph is compatible with all three widely used architectures. GCN and APPNP differ by at most two percentage points on every dataset, while GAT is roughly $3$ to $7$ points below GCN.



\begin{table}[ht!]
\centering
\begin{tabular}{lccr}
\toprule
\textbf{Data set}  & \textbf{GCN} & \textbf{GAT} & \textbf{APPNP} \\
\midrule
Cora & 84.45 $\pm$ 0.1 & 80.23 $\pm$ 0.2 & 86.05 $\pm$ 0.4 \\
Citeseer & 75.61 $\pm$ 0.6 & 72.72 $\pm$ 0.9 & 76.40 $\pm$ 0.2 \\
Pubmed & 80.91 $\pm$ 0.1 & 73.92 $\pm$ 0.2 & 79.62 $\pm$ 0.6 \\
Co-CS & 88.27 $\pm$ 0.3 & 84.49 $\pm$ 0.0 & 90.27 $\pm$ 0.2 \\
\bottomrule
\end{tabular}
\caption{Node classification accuracy (\%) obtained with different GNN architectures: GCN \citep{kipf2016semi}, GAT \citep{velivckovic2017graph}, and APPNP \citep{gasteiger2018predict}. The coarsened graphs are learned by LAGC with a coarsening ratio of $0.1$ for the Cora and Citeseer datasets and $0.01$ for the PubMed and Coauthor CS datasets. The coarsened graph learned by LAGC can be used with all three architectures.}
\label{differentGNN}
\end{table}
\vspace{-0.4cm}
\subsection{Node Profile Matrix}
 To compare the quality of the coarsened graphs produced by the proposed LAGC algorithm and by state-of-the-art methods, we computed the mapping matrix $C$ and derived the corresponding node profile matrix $\phi=C^\top Y$. Each row of $\phi$ describes how the labels of the nodes merged into one supernode are distributed over the classes, so a sparser row indicates a supernode dominated by a single class. Comparing the heat maps of the $\phi$ matrices with those of the recent FGC algorithm \citep{kumar2023unified}, we observe that the LAGC-generated $\phi$ matrix is noticeably sparser, which indicates a higher-quality coarsened graph. We compare only with FGC because GCOND \citep{jin2021graph} does not learn a mapping matrix during the coarsening process.
 
 \noindent{\textbf{Misclassified labels:}} The coarsened graph labels, denoted as $\tilde{Y}$, are determined by selecting, for each supernode, the class index that maximises the corresponding row of the product $C^{\dagger}Y$, where $Y$ is the node label matrix of the original graph. The number of misclassified labels for each supernode $i$ (where $i = 1, 2, \ldots, k$) is computed as the sum of all non-zero entries in the $i^{th}$ row of the $\phi$ matrix, excluding the maximum entry. The total number of misclassified labels, denoted by $q$, is the sum over all supernodes.
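\par\noindent In symbols, the count described above can be sketched as follows, where $c$ denotes the number of classes (a symbol introduced here for illustration) and $\phi_{ij}$ is the entry of $\phi$ for supernode $i$ and class $j$:
\[
q_i = \sum_{j=1}^{c} \phi_{ij} - \max_{1 \le j \le c} \phi_{ij}, \qquad q = \sum_{i=1}^{k} q_i .
\]
As a hypothetical example, assuming a one-hot $Y$ and a $0/1$ assignment so that $\phi_{ij}$ counts the nodes of class $j$ in supernode $i$, a row $(5, 1, 0)$ gives $q_i = 6 - 5 = 1$, whereas a row $(3, 3, 0)$ gives $q_i = 6 - 3 = 3$. A supernode that contains a single class contributes $q_i = 0$.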

In our comparative analysis, we quantify coarsening quality by the number of misclassified labels ($q$). For the Cora dataset at coarsening ratios of $0.05$ and $0.1$, the state-of-the-art FGC algorithm \citep{kumar2023unified} yields $250$ and $458$ misclassified labels, whereas the proposed LAGC algorithm yields $180$ and $338$, respectively, a reduction of about $28\%$ and $26\%$.

Moreover, the heat maps in Figure~\ref{phimatrices}, which depict the $\phi$ matrices of the proposed LAGC and the state-of-the-art FGC \citep{kumar2023unified} algorithms, visually confirm this result: the LAGC matrix shows markedly fewer misclassified labels than the FGC matrix.
\begin{figure}
  \centering
  \begin{minipage}[b]{0.23\textwidth}
    \includegraphics[width=\textwidth,height=3.5 cm]{LAGC_Dataset_Cora_Coarsening Ratio_0.1_phi.png}
    \caption{$\phi$  matrix (LAGC)}
    \label{our_algo_phi}
  \end{minipage}
  \hfill
  \begin{minipage}[b]{0.23\textwidth}
    \includegraphics[width=\textwidth,height=3.5 cm]{FGC_Dataset_Cora_Coarsening Ratio_0.1_phi.png}
    \caption{$\phi$ matrix (FGC)}
    \label{fgc_algo_phi}
  \end{minipage}
  \caption{Heat maps of the $\phi$ matrix obtained from the proposed LAGC algorithm (Figure~\ref{our_algo_phi}) and the state-of-the-art FGC algorithm \citep{kumar2023unified} (Figure~\ref{fgc_algo_phi}). The $\phi$ matrix derived from our algorithm is sparser than that of FGC. At a coarsening ratio of $0.1$, the number of misclassified labels ($q$) is $338$ for LAGC and $458$ for FGC, indicating that the coarsened graph learned by the proposed algorithm is of higher quality.}
  \label{phimatrices}
\end{figure}  
\vspace{-1cm}
\subsection{Run-time Complexity}
Given an input graph with $p$ nodes, $E_1$ edges, and a feature vector of size $n$ for each node, the time complexity for node classification using a Graph Convolutional Network (GCN) with $l$ layers is $\mathcal{O}(lp^2n+lpE_1n)$ \citep{blakely2021time}.

The worst-case per-iteration computational complexity of the proposed LAGC algorithm is $\mathcal{O}(p^2k)$ for learning a coarsened graph. When both coarsening and node classification are performed, the overall time complexity is $\mathcal{O}(p^2k+lk^2n+lkE_2n)$, where $k$ and $E_2$ denote the number of nodes and edges in the coarsened graph, respectively. Since $p \gg k$ and $E_1 \gg E_2$, and $k$ is chosen such that $k < n$, the cost of coarsening followed by node classification is significantly lower than that of node classification on the original graph. This is reflected in Table~\ref{timingcomplexity}: the proposed LAGC algorithm is substantially faster than GCOND and SCAL, and its run time is comparable to that of FGC \citep{kumar2023unified, pmlr-v202-kumar23a}.
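\par\noindent To make the comparison concrete, the following back-of-the-envelope calculation compares the leading terms of the two bounds. It is only an order-of-magnitude sketch, since the bounds hide constants and the coarsening cost is stated per iteration. Writing $k = rp$,
\[
\frac{p^2 k}{l p^2 n} = \frac{k}{l n}, \qquad
\frac{l k^2 n}{l p^2 n} = r^2, \qquad
\frac{l k E_2 n}{l p E_1 n} = r\,\frac{E_2}{E_1}.
\]
For Cora ($p = 2{,}708$, $n = 1{,}433$) at $r = 0.05$, i.e., $k \approx 135$, and assuming a two-layer GCN ($l = 2$, an illustrative choice), the first two ratios are $135/2866 \approx 0.047$ and $0.05^2 = 0.0025$, respectively.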

\begin{table}[ht!]
\begin{center}
 \begin{tabular}{lccccr}
 \hline
  Dataset($\tau$) & GCOND & SCAL & FGC & LAGC \\ \hline
   
    Cora & 329.8  &  27.7 & 1.71 & 1.55 \\ 
    
    Citeseer& 331.3 & 56.2 & 2.15 & 2.03   \\   

    Pubmed & 202.0 & 54.0 & 19.81 &  20.35  \\   
    
    Co-CS & 1600 & 180 &  34.4 & 49.87   \\   
  \hline
 
 \end{tabular}
 \end{center}
 \caption{Run time $\tau$ (in seconds) required to perform coarsening and classification at a coarsening ratio of $r=0.05$, comparing the proposed LAGC algorithm with the baselines GCOND \citep{jin2021graph}, SCAL \citep{huang2021scaling}, and FGC \citep{kumar2023unified}. LAGC is substantially faster than GCOND and SCAL and comparable to FGC: it is slightly faster on Cora and Citeseer and slower on Pubmed and Co-CS.}
 \label{timingcomplexity}
 \end{table}
 
