Abstract: Graphs are a foundational abstraction in data-intensive domains, from recommender systems to biological networks, yet the scale of modern datasets strains the computation and memory available for downstream learning. As graphs grow, the cost of training and inference becomes prohibitive, which motivates compact surrogates that retain spectral properties and feature semantics.
We propose an entropy-regularized, semi-supervised framework for attributed graph coarsening that jointly leverages the original graph’s Laplacian, node features, and partially observed labels. Central to our approach is an information-theoretic regularizer that minimizes the per-supernode Shannon entropy of the node-profile matrix $\phi = C^\top Y$, encouraging label-coherent aggregations. We formulate a principled objective that balances structural fidelity and feature alignment, and we solve it with an efficient block MM/BSUM algorithm. We establish learning guarantees and structural guarantees that control Dirichlet-energy distortion, preserve low-order spectral moments, and bound deviations in cut costs and effective resistances. Experiments on standard benchmarks (e.g., Cora, Citeseer, PubMed, Coauthor-CS) show that our method produces a node-profile matrix with low row-wise entropy, meaning that nodes with the same label are grouped into the same supernode. The method achieves competitive or superior node-classification accuracy and link-prediction performance across multiple GNN backbones (GCN, GAT, APPNP) while remaining computationally efficient.
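As a rough illustration of the quantity the regularizer targets, the sketch below computes the node-profile matrix $\phi = C^\top Y$ and its row-wise Shannon entropy on a toy example. It is not the paper's implementation. It assumes that $C$ is a hard $n \times k$ node-to-supernode assignment matrix, that $Y$ is an $n \times c$ one-hot label matrix (with all-zero rows for unlabeled nodes), and that each row of $\phi$ is normalized to a probability distribution before taking the entropy. The function names `node_profile` and `row_entropies` are hypothetical.

```python
import math


def node_profile(C, Y):
    """Return phi = C^T Y as a k x c nested list (assumed shapes: C is n x k, Y is n x c)."""
    n, k, c = len(C), len(C[0]), len(Y[0])
    phi = [[0.0] * c for _ in range(k)]
    for i in range(n):
        for j in range(k):
            if C[i][j]:
                for l in range(c):
                    phi[j][l] += C[i][j] * Y[i][l]
    return phi


def row_entropies(phi):
    """Shannon entropy (in nats) of each row of phi after row normalization.

    Rows with no labeled mass are assigned entropy 0 (an assumption of this sketch).
    """
    entropies = []
    for row in phi:
        total = sum(row)
        if total == 0:
            entropies.append(0.0)
            continue
        h = 0.0
        for v in row:
            if v > 0:
                p = v / total
                h -= p * math.log(p)
        entropies.append(h)
    return entropies


# Toy example: 4 nodes, 2 supernodes, 2 classes.
C = [[1, 0], [1, 0], [0, 1], [0, 1]]  # nodes 0,1 -> supernode 0; nodes 2,3 -> supernode 1
Y = [[1, 0], [1, 0], [0, 1], [1, 0]]  # one-hot labels
phi = node_profile(C, Y)
H = row_entropies(phi)
```

In this toy case the first supernode contains only class-0 nodes, so its row of $\phi$ has zero entropy, while the second supernode mixes both classes evenly and attains the maximum entropy $\log 2$. A label-coherent coarsening in the sense described above would drive every row toward the first case.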
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Qitian_Wu1
Submission Number: 7903