Keywords: Node Classification, Graph Transformer, Cross-attention Mechanism
Abstract: Graph Transformers (GTs), which effectively capture long-range dependencies and structural biases simultaneously, have recently emerged as promising alternatives to traditional Graph Neural Networks (GNNs). Prevailing strategies for incorporating topology information into GTs either integrate GNN modules or modulate node attributes with positional encodings. Unfortunately, the mechanism underlying their effectiveness remains insufficiently understood. In this paper, we revisit these strategies and uncover a shared underlying mechanism, Cross Aggregation, which effectively captures the interaction between graph topology and node attributes. Building on this insight, we propose the Universal Graph Cross-attention Transformer (UGCFormer), a universal GT framework with linear computational complexity. The core idea is to learn the representations of graph topology and node attributes interactively through a linearized Dual Cross-attention (DCA) module. In theory, this module adaptively captures interactions between these two types of graph information, thereby achieving effective aggregation. To alleviate the overfitting that arises from the dual-channel design, we introduce a consistency constraint that enforces representational alignment. Extensive evaluations on multiple benchmark datasets demonstrate the effectiveness and efficiency of UGCFormer.
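The abstract gives only a high-level description of the DCA module; the following is a minimal, hypothetical PyTorch sketch of one way a linearized dual cross-attention block with a consistency constraint could be realized. The kernelized (elu + 1) feature map, the cosine-based alignment loss, and all names (linear_attention, DualCrossAttention, consistency_loss) are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a linearized dual cross-attention block.
# Two channels (node attributes and graph topology) each form queries
# against the other channel's keys/values, using linear-complexity
# kernelized attention; a consistency loss aligns the two outputs.
import torch
import torch.nn as nn
import torch.nn.functional as F


def linear_attention(q, k, v):
    """Kernelized attention: phi(q) (phi(k)^T v), normalized per node (O(n) in nodes)."""
    q, k = F.elu(q) + 1, F.elu(k) + 1                      # positive feature map
    kv = torch.einsum("nd,ne->de", k, v)                   # d x e key/value summary
    z = 1.0 / (q @ k.sum(dim=0).unsqueeze(-1) + 1e-6)      # per-node normalizer, [n, 1]
    return (q @ kv) * z


class DualCrossAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.q_attr, self.k_attr, self.v_attr = (nn.Linear(dim, dim) for _ in range(3))
        self.q_topo, self.k_topo, self.v_topo = (nn.Linear(dim, dim) for _ in range(3))

    def forward(self, x_attr, x_topo):
        # Attribute channel queries the topology channel, and vice versa.
        h_attr = linear_attention(self.q_attr(x_attr), self.k_topo(x_topo), self.v_topo(x_topo))
        h_topo = linear_attention(self.q_topo(x_topo), self.k_attr(x_attr), self.v_attr(x_attr))
        return x_attr + h_attr, x_topo + h_topo


def consistency_loss(h_attr, h_topo):
    """One plausible alignment constraint: maximize cosine similarity between channels."""
    return (1 - F.cosine_similarity(h_attr, h_topo, dim=-1)).mean()
```

In such a sketch, x_attr would hold node-feature embeddings and x_topo a structural or positional encoding of the graph, both of shape [num_nodes, dim]; the block exchanges information between the two channels while keeping attention cost linear in the number of nodes.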
Supplementary Material: zip
Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)
Submission Number: 11440