CR-Graphormer: From Cascades to Tokens via Mesoscopic Graph Rewiring

Meher Chaitanya; My Le; Luana Ruiz

Published: 23 Sept 2025, Last Modified: 21 Oct 2025, NPGML Poster, CC BY 4.0

Keywords: Graph Transformers, Tokenization

TL;DR: CR‑Graphormer uses cascade-based mesoscopic tokenization to create compact, higher-order sequences that preserve structure, keep edge computation linear, and boost node-classification accuracy.

Abstract: Graph transformers (GTs) match or surpass GNN performance by applying global self-attention, yet their quadratic memory requirement makes them impractical, and their receptive field is often limited by neighborhood aggregation, which loses fine-grained structural signals, especially in heterophilous graphs. We propose the Cascade-Rewired Graph Transformer (CR-Graphormer), which balances computational efficiency with rich structural awareness. It does so by constructing an auxiliary network in which each node is assigned a token based on a “mesoscopic edge rewiring” process generated through deterministic contagion cascades initiated from its ego-network. Replacing long multi-hop paths with direct edges in the auxiliary network yields a backbone that captures higher-order structures while retaining sparsity. The rewiring amplifies homophilous ties and preserves critical heterophilous connections present in the extended neighborhood of each node, providing every vertex with a compact, information-rich context that reflects both local motifs and global reach. Each node retains a fixed-length token list drawn from its mesoscopic neighbors; because self-attention is confined to these constant-size sequences, CR-Graphormer achieves $\mathcal{O}(E)$ complexity in graph tokenization, producing an expressive yet scalable model. We evaluate the proposed approach on 14 benchmark datasets spanning both homophilous and heterophilous settings and observe improvements in node classification accuracy on 10 of them. These results demonstrate that tokenizing over the “mesoscopic rewired graph” introduces a strong inductive bias that enhances graph learning.
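The fixed-length token lists described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the cascade rewiring procedure, so the mesoscopic neighbor lists are taken here as a given input, and the names `build_token_lists`, `attend`, and the parameter `k` are hypothetical. The attention is a plain single-head scaled dot-product with no learned weights, chosen only to show that each node attends over a constant-size sequence.

```python
import math


def build_token_lists(neighbors, features, k):
    """Give every node exactly k tokens: itself first, then up to k - 1
    of its (assumed precomputed) mesoscopic neighbors, zero-padded."""
    dim = len(features[0])
    pad = [0.0] * dim
    tokens, masks = [], []
    for node, nbrs in enumerate(neighbors):
        ids = [node] + list(nbrs)[: k - 1]   # truncate long neighbor lists
        seq = [features[i] for i in ids]
        mask = [True] * len(ids)
        while len(seq) < k:                  # pad short neighbor lists
            seq.append(pad)
            mask.append(False)
        tokens.append(seq)
        masks.append(mask)
    return tokens, masks


def attend(seq, mask):
    """Scaled dot-product attention of the node's own token (position 0)
    over its k tokens; padded positions receive zero weight."""
    query = seq[0]
    scale = math.sqrt(len(query))
    scores = [
        sum(a * b for a, b in zip(query, tok)) / scale if valid else float("-inf")
        for tok, valid in zip(seq, mask)
    ]
    top = max(scores)                        # position 0 is always valid
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [
        sum(w / total * tok[j] for w, tok in zip(weights, seq))
        for j in range(len(query))
    ]


# Toy graph with 3 nodes; the neighbor lists stand in for the rewired graph.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
neighbors = [[1], [0, 2], []]
tokens, masks = build_token_lists(neighbors, features, k=3)
outputs = [attend(seq, mask) for seq, mask in zip(tokens, masks)]
```

Because every sequence has length `k`, the attention cost per node in this sketch is independent of graph size, so the total attention cost grows linearly with the number of nodes.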

Submission Number: 56
