Supercharging Graph Transformers with Advective Diffusion

Qitian Wu; Chenxiao Yang; Kaipeng Zeng; Michael M. Bronstein

Published: 01 May 2025, Last Modified: 23 Jul 2025, Venue: ICML 2025 poster, License: CC BY 4.0

TL;DR: We propose a physics-principled graph Transformer that improves generalization under topological/structural distribution shifts.

Abstract: The capability of generalization is a cornerstone for the success of modern learning systems. For non-Euclidean data that involve topological structures, e.g., graphs, one important aspect neglected by prior studies is how machine learning models generalize under topological shifts. This paper proposes AdvDIFFormer, a physics-inspired graph Transformer model designed to address this challenge. The model is derived from advective diffusion equations, which describe a class of continuous message passing processes with observed and latent topological structures. We show that AdvDIFFormer has a provable capability for controlling generalization error under topological shifts, which, in contrast, cannot be guaranteed by graph diffusion models, i.e., the generalization of common graph neural networks to continuous space. Empirically, the model demonstrates superiority in various predictive tasks across information networks, molecular screening and protein interactions.

Lay Summary: The ability of machine learning models to perform well on new data is very important. For data that comes in the form of networks (like social networks or molecules), this involves understanding the underlying connections and patterns, which we call "topological structures". Previous work has not established a good understanding of how machine learning models perform when the network patterns change, which we call "topological shifts" in this paper. We propose AdvDIFFormer, a new model inspired by advective diffusion equations, a PDE model describing a class of natural processes that spread and mix physical quantities (like how fluids flow and mix). This model can control the errors that arise when the network patterns change, which previous models cannot do. Experimental results show that this new model works better on many prediction tasks on real-world graph data, such as social networks, molecular screening and protein interactions.
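To make the idea of message passing over both observed and latent structure more concrete, here is a minimal sketch of one explicit Euler step of a graph equation of the form dZ/dt = (C + beta * V - I) Z. This form is an assumption chosen for illustration and is not stated on this page: C is taken to be a row-normalized, attention-like coupling computed from node features (the latent structure), V the symmetrically normalized adjacency matrix (the observed structure), and beta a hypothetical weight. All function names are hypothetical, and the sketch is not the AdvDIFFormer implementation.

```python
import math


def row_softmax(scores):
    """Apply a numerically stable softmax to each row, so each row sums to 1."""
    out = []
    for row in scores:
        m = max(row)
        exps = [math.exp(s - m) for s in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def normalized_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2} of an undirected graph."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    # Entries are only divided where an edge exists, so degrees are nonzero there.
    return [
        [adj[i][j] / math.sqrt(deg[i] * deg[j]) if adj[i][j] else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def advective_diffusion_step(z, adj, beta=1.0, dt=0.1):
    """One explicit Euler step of dZ/dt = (C + beta * V - I) Z (assumed form)."""
    n, d = len(z), len(z[0])
    # Latent coupling C: softmax over dot-product similarity of node features.
    scores = [[sum(z[i][k] * z[j][k] for k in range(d)) for j in range(n)]
              for i in range(n)]
    c = row_softmax(scores)
    # Observed coupling V: normalized adjacency of the input graph.
    v = normalized_adjacency(adj)
    new_z = []
    for i in range(n):
        row = []
        for k in range(d):
            mixed = sum((c[i][j] + beta * v[i][j]) * z[j][k] for j in range(n))
            row.append(z[i][k] + dt * (mixed - z[i][k]))
        new_z.append(row)
    return new_z


# Usage: a 3-node path graph with 2-dimensional node features.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
z0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
z1 = advective_diffusion_step(z0, adj, beta=0.5, dt=0.1)
```

Setting beta to 0 in this sketch leaves only the feature-driven term, which ignores the observed graph entirely; a larger beta gives the observed edges more influence on how node states evolve.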

Link To Code: https://github.com/qitianwu/AdvDIFFormer

Primary Area: Deep Learning->Everything Else

Keywords: geometric deep learning, graph machine learning, topological shifts, transformers, graph neural networks

Submission Number: 4926
