Size Transferability of Graph Transformers with Convolutional Positional Encodings

19 Jan 2026 (modified: 24 Jun 2026), Submitted to ICML 2026, CC BY 4.0
TL;DR: Graph Transformers with GNN-based positional encodings trained on small graphs provably generalize to larger graphs.
Abstract: Transformers have achieved remarkable success across domains, motivating the rise of Graph Transformers (GTs) as attention-based architectures for graph-structured data. A key design choice in GTs is the use of Graph Neural Network (GNN)-based positional encodings to incorporate structural information. In this work, we establish a theoretical connection between GTs with GNN positional encodings and Manifold Neural Networks (MNNs). Building on transferability results for GNNs, we prove that such GTs inherit these transferability guarantees. In particular, GTs trained on small graphs provably generalize to larger graphs under mild assumptions. We complement our theory with extensive experiments on standard graph benchmarks, demonstrating that GTs exhibit scalable generalization behavior on par with GNNs. Our results provide new insight into GTs and suggest practical directions for training them efficiently in large-scale settings.
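As a rough illustration of the setting the abstract describes, the sketch below builds one attention layer whose positional encodings come from a small polynomial graph filter, then applies the same parameters to graphs of two different sizes. Everything here is an assumption made for illustration and is not taken from the submission: the function names, the three-tap filter on the normalized Laplacian, the concatenation of features with encodings, the single attention head, and the random graphs. It only shows that the parameter shapes do not depend on the number of nodes, which is what makes training on small graphs and evaluating on larger ones possible in the first place; it does not demonstrate the transferability guarantee itself.

```python
import numpy as np

def normalized_laplacian(adj):
    """Symmetric normalized Laplacian I - D^{-1/2} A D^{-1/2}."""
    deg = adj.sum(axis=1)
    # Isolated nodes get a zero scaling factor instead of a division by zero.
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    return np.eye(adj.shape[0]) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

def gnn_positional_encoding(adj, x, filter_weights):
    """One graph-convolution layer: ReLU(sum_k L^k x H_k)."""
    lap = normalized_laplacian(adj)
    out = np.zeros((x.shape[0], filter_weights[0].shape[1]))
    z = x
    for h_k in filter_weights:
        out += z @ h_k   # contribution of the k-th filter tap
        z = lap @ z      # advance to the next power of the Laplacian
    return np.maximum(out, 0.0)

def self_attention(h, w_q, w_k, w_v):
    """Single-head dense softmax attention over all nodes."""
    q, k, v = h @ w_q, h @ w_k, h @ w_v
    scores = q @ k.T / np.sqrt(q.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ v

def graph_transformer_layer(adj, x, params):
    """Concatenate node features with GNN encodings, then attend."""
    pe = gnn_positional_encoding(adj, x, params["filters"])
    h = np.concatenate([x, pe], axis=1)
    return self_attention(h, params["w_q"], params["w_k"], params["w_v"])

def random_graph(n, p, rng):
    """Undirected Erdos-Renyi graph without self-loops (illustrative data)."""
    upper = np.triu(rng.random((n, n)) < p, k=1).astype(float)
    return upper + upper.T

rng = np.random.default_rng(0)
d_in, d_pe, d_out = 4, 8, 8  # hypothetical feature sizes

# No parameter shape below depends on the number of nodes.
params = {
    "filters": [0.1 * rng.normal(size=(d_in, d_pe)) for _ in range(3)],
    "w_q": 0.1 * rng.normal(size=(d_in + d_pe, d_out)),
    "w_k": 0.1 * rng.normal(size=(d_in + d_pe, d_out)),
    "w_v": 0.1 * rng.normal(size=(d_in + d_pe, d_out)),
}

outputs = {}
for n in (50, 200):
    adj = random_graph(n, 0.1, rng)
    x = rng.normal(size=(n, d_in))
    outputs[n] = graph_transformer_layer(adj, x, params)
    print(n, outputs[n].shape)
```

In this sketch the graph enters only through the positional encodings, since attention is dense over all node pairs. That is one plausible reading of how convolutional positional encodings "incorporate structural information", not a statement about the submission's exact architecture.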
Originally Submitted Supplementary Material: zip
Primary Area: Deep Learning->Graph Neural Networks
Keywords: Graph Transformers, Transferability, Positional Encodings, Graph Neural Networks
Submission Number: 8539