GIST: Gauge-Invariant Spectral Transformers for Scalable Graph Neural Operators

ICLR 2026 Conference Submission 20905 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Graph Transformers, Neural Operators, Graph Neural Networks
TL;DR: GIST introduces gauge-invariant graph transformers that achieve linear scalability while maintaining spectral representation invariances for both graph learning and mesh-based neural operators
Abstract: Adapting transformers to meshes and graph-structured data poses significant computational challenges, particularly for spectral methods that require an eigendecomposition of the graph Laplacian: a process with cubic complexity for dense matrices or quadratic complexity for sparse graphs, compounded by the quadratic cost of standard self-attention mechanisms. Conventional approximate spectral methods compromise the gauge symmetry inherent in spectral basis selection, risking spurious features tied to the gauge choice that can undermine generalization. In this paper, we propose a transformer architecture that preserves gauge symmetry through distance-based operations on approximate, randomly projected spectral embeddings, achieving linear complexity while maintaining gauge invariance. By integrating this design within a linear transformer framework, we obtain end-to-end memory and computational costs that scale linearly with the number of nodes in the graph. Unlike approximate methods that sacrifice gauge symmetry for computational efficiency, our approach maintains both scalability and the principled inductive biases necessary for effective generalization to unseen graph structures in inductive graph learning tasks. We demonstrate the method's flexibility by benchmarking on standard transductive and inductive node classification tasks, matching state-of-the-art results on multiple datasets. We further demonstrate scalability by deploying the architecture as a discretization-free neural operator for large-scale computational fluid dynamics mesh regression, surpassing state-of-the-art performance on aerodynamic coefficient prediction reformulated as a graph node regression task.
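The core mechanism described in the abstract, distance-based operations on randomly projected spectral embeddings, can be illustrated with a minimal sketch. The NumPy toy below is not the authors' implementation; it assumes a small random graph and a sign-flip gauge, and shows that pairwise distances between Laplacian-eigenvector embeddings are unchanged under a gauge transformation, while a Johnson-Lindenstrauss-style random projection preserves those distances approximately, which is what makes a linear-cost approximation possible without breaking the invariance.

```python
# Minimal sketch (not the paper's code): gauge invariance of
# distance-based features on Laplacian spectral embeddings.
import numpy as np

rng = np.random.default_rng(0)

# Toy undirected graph: random symmetric adjacency on n nodes.
n, k, d = 200, 16, 32                  # nodes, eigenvectors kept, projection dim
A = (rng.random((n, n)) < 0.05).astype(float)
A = np.triu(A, 1)
A = A + A.T
L = np.diag(A.sum(1)) - A              # combinatorial graph Laplacian

# Spectral embedding: node i -> row i of the first k eigenvectors.
_, U = np.linalg.eigh(L)
Phi = U[:, :k]                         # defined only up to a gauge choice

# A gauge transformation: per-eigenvector sign flips (more generally,
# any orthogonal mixing within a degenerate eigenspace) give an
# equally valid spectral basis.
S = np.diag(rng.choice([-1.0, 1.0], size=k))
Phi_gauged = Phi @ S

# Pairwise distances between embeddings are gauge-invariant...
D = np.linalg.norm(Phi[:, None] - Phi[None, :], axis=-1)
D_gauged = np.linalg.norm(Phi_gauged[:, None] - Phi_gauged[None, :], axis=-1)
assert np.allclose(D, D_gauged)

# ...and approximately preserved by a random projection, so distance-based
# attention features can be computed on the cheap projected embeddings.
R = rng.normal(size=(k, d)) / np.sqrt(d)
P = Phi @ R
D_proj = np.linalg.norm(P[:, None] - P[None, :], axis=-1)
print("max relative distance distortion:",
      np.max(np.abs(D_proj - D) / (D + 1e-12)))
```

Because the sketch relies only on distances, any feature built from them is invariant to every admissible gauge choice; raw eigenvector coordinates, by contrast, would change under the sign flip and introduce the spurious gauge-tied features the abstract warns about.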
Primary Area: learning on graphs and other geometries & topologies
Submission Number: 20905