Abstract: Graph condensation (GC) improves the efficiency of GNN training by condensing a large-scale graph into a compact synthetic graph. However, existing GC methods suffer from time-consuming optimization processes, and the underlying mechanisms driving their effectiveness remain unexplored. In this paper, we provide novel insights into the optimization strategies of GC, demonstrating that various methods ultimately converge to class-level feature matching between the original and condensed graphs. Building on this understanding, we further refine the unified class-to-class matching paradigm into a fine-grained class-to-node paradigm, unveiling that the core mechanism of GC is a class-wise clustering problem in the latent space. Accordingly, we propose Deep Clustering-based Graph Condensation (DeepCGC), an efficient GC framework that integrates a clustering-based optimization objective with an invertible relay model. Extensive experiments show that DeepCGC achieves state-of-the-art efficiency and accuracy. Notably, it condenses the million-scale Ogbn-products graph in around 40 seconds—a $10^{2} \times$ to $10^{4} \times$ speedup over existing methods—while boosting accuracy by up to 4.6%. The code is available at https://github.com/XYGaoG/DeepCGC.
External IDs: doi:10.1109/tkde.2026.3655841