NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification

Qitian Wu; Wentao Zhao; Zenan Li; David Wipf; Junchi Yan

Published: 31 Oct 2022, Last Modified: 06 Apr 2025 | NeurIPS 2022 Accept | Readers: Everyone

Keywords: Graph Neural Networks, Graph Transformers, Large Graphs, Node Classification, Scalability, Graph Structure Learning

Abstract: Graph neural networks have been extensively studied for learning with inter-connected data. Despite this, recent evidence has revealed GNNs' deficiencies related to over-squashing, heterophily, handling long-range dependencies, edge incompleteness and, in particular, the absence of graphs altogether. While a plausible solution is to learn a new adaptive topology for message passing, issues concerning quadratic complexity hinder simultaneous guarantees for scalability and precision in large networks. In this paper, we introduce a novel all-pair message passing scheme for efficiently propagating node signals between arbitrary nodes, as an important building block for a new class of Transformer networks for node classification on large graphs, dubbed NodeFormer. Specifically, the efficient computation is enabled by a kernelized Gumbel-Softmax operator that reduces the algorithmic complexity to linear in the number of nodes for learning latent graph structures from large, potentially fully-connected graphs in a differentiable manner. We also provide accompanying theory as justification for our design. Extensive experiments demonstrate the promising efficacy of the method in various tasks including node classification on graphs (with up to 2M nodes) and graph-enhanced applications (e.g., image classification) where input graphs are missing. The code is available at https://github.com/qitianwu/NodeFormer.
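
The linear-complexity claim in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository above for that). It assumes a positive random-feature approximation of the softmax kernel, and it assumes the Gumbel perturbation is applied once per source node rather than once per node pair. The function names `positive_random_features` and `linear_all_pair_propagation`, the temperature `tau`, and the toy sizes are all hypothetical choices made for illustration. The point is only the order of operations: the key-side features are aggregated with the values first, so no N x N attention matrix is ever formed.

```python
import numpy as np

def positive_random_features(x, w):
    """Map rows of x (n, d) to positive features (n, m) using projections w (m, d).

    phi(x) = exp(w x - ||x||^2 / 2) / sqrt(m), so that phi(q) . phi(k)
    approximates exp(q . k) when the rows of w are drawn from N(0, I).
    Note: no max-subtraction is done here, so large inputs may overflow.
    """
    m = w.shape[0]
    proj = x @ w.T                                         # (n, m)
    half_sq_norm = 0.5 * np.sum(x * x, axis=1, keepdims=True)
    return np.exp(proj - half_sq_norm) / np.sqrt(m)

def linear_all_pair_propagation(q, k, v, w, tau=1.0, gumbel=None):
    """All-pair propagation in O(n * m * d) time, without an (n, n) matrix."""
    scale = np.sqrt(tau)
    phi_q = positive_random_features(q / scale, w)         # (n, m)
    phi_k = positive_random_features(k / scale, w)         # (n, m)
    if gumbel is not None:
        # Assumption of this sketch: one Gumbel sample per source node,
        # folded into the key-side features as exp(g / tau).
        phi_k = phi_k * np.exp(gumbel / tau)[:, None]
    kv = phi_k.T @ v                                       # (m, d_v), summed over nodes
    k_sum = phi_k.sum(axis=0)                              # (m,), for the normalizer
    numerator = phi_q @ kv                                 # (n, d_v)
    denominator = phi_q @ k_sum                            # (n,)
    return numerator / denominator[:, None]

# Toy usage with hypothetical sizes: 6 nodes, 4 features, 16 random projections.
rng = np.random.default_rng(0)
n, d, m, tau = 6, 4, 16, 0.5
q = 0.1 * rng.normal(size=(n, d))
k = 0.1 * rng.normal(size=(n, d))
v = rng.normal(size=(n, d))
w = rng.normal(size=(m, d))
gumbel = rng.gumbel(size=n)
out = linear_all_pair_propagation(q, k, v, w, tau=tau, gumbel=gumbel)
```

Each output row is a convex combination of the rows of `v`, with weights that would otherwise require a dense N x N matrix to write down. The cost grows with the number of nodes times the number of random features, which is where the linear scaling comes from in this sketch.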

Supplementary Material: pdf

TL;DR: A scalable graph Transformer for large-scale graphs, which achieves all-pair message passing with linear complexity w.r.t. node numbers

Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/nodeformer-a-scalable-graph-structure/code)
