HubGT: Fast Graph Transformer with Decoupled Hierarchy Labeling

Published: 18 Sept 2025 · Last Modified: 29 Oct 2025 · NeurIPS 2025 poster · CC BY-NC-ND 4.0
Keywords: Graph neural networks; Graph Transformer; Shortest Path Distance
TL;DR: We propose HubGT, a scalable Graph Transformer that enables fast, hierarchical graph embeddings.
Abstract: The Graph Transformer (GT) has recently emerged as a promising neural network architecture for learning on graph-structured data. However, its global attention mechanism, whose complexity is quadratic in the graph size, prevents wider application to large graphs. Effectively representing graph information while ensuring learning efficiency remains challenging, as our analysis reveals that current GT designs targeting scalability still suffer from computational bottlenecks tied to graph-scale operations. In this work, we tackle the GT scalability issue by proposing HubGT, a scalable Graph Transformer built on fully decoupled graph processing and simplified learning. HubGT represents the graph by a novel hierarchical scheme exploiting hub labels, which is shown to be more informative than plain adjacency by offering global connections while promoting locality, and is particularly suitable for handling complex graph patterns such as heterophily. We also design algorithms for efficiently constructing and querying the hub label hierarchy, tailored to GT attention training in scalable deployments. Notably, the precomputation and training processes of HubGT achieve complexity linear in the number of graph edges and nodes, respectively, while the training stage completely removes graph-related computations, leading to favorable mini-batch capability and GPU utilization. Extensive experiments demonstrate that HubGT improves computational efficiency and mini-batch capability over existing GT designs on large-scale benchmarks, while achieving top-tier effectiveness on both homophilous and heterophilous graphs.
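As background for the abstract's reference to hub labels and shortest-path distances, the sketch below illustrates how a classic 2-hop hub labeling answers a distance query: each node stores a small label of (hub, distance) pairs, and the distance between two nodes is the minimum combined distance over their shared hubs. This is a generic illustration of the underlying primitive, not the paper's specific hierarchy construction; the function and variable names are hypothetical.

```python
# Minimal sketch of a 2-hop hub-label distance query (illustrative only;
# names are assumptions, not taken from the HubGT paper).

def query_distance(label_u, label_v):
    """Return the shortest-path distance between u and v from their hub labels.

    label_u / label_v: dicts mapping hub node id -> distance from u (resp. v)
    to that hub. A correct 2-hop labeling guarantees that some shortest u-v
    path passes through at least one hub present in both labels.
    """
    best = float("inf")
    # Intersect the two hub sets and take the minimum combined distance.
    for hub, d_u in label_u.items():
        d_v = label_v.get(hub)
        if d_v is not None:
            best = min(best, d_u + d_v)
    return best


# Example usage on a tiny path graph 0 - 1 - 2, with node 1 chosen as a hub:
labels = {
    0: {0: 0, 1: 1},
    1: {1: 0},
    2: {1: 1, 2: 0},
}
print(query_distance(labels[0], labels[2]))  # -> 2
```

Because each query only intersects two small label sets, distances can be looked up without touching the full graph at training time, which is consistent with the abstract's claim that the training stage removes graph-related computations.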
Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)
Submission Number: 10337