Keywords: Graph Neural Networks, Foundation Models, Zero-shot Generalization
Abstract: Graph foundation models (GFMs) aim to pretrain graph neural networks (GNNs) that generalize to new graph datasets in a zero-shot manner, requiring little or no additional training. This goal is challenging because graph data collected from diverse domains often exhibit significantly different node features and topological structures, and standard GNNs are sensitive to such variations. Prior efforts either restrict learning to a single domain (e.g., molecular graphs) or rely on heavy auxiliary pipelines that transform raw features into surrogates using large language models (LLMs) or feature-graph constructions; both strategies are computationally expensive or confined to specific domains.
In this work, we find that principal component analysis (PCA) is a simple and efficient feature-alignment method for GFMs. We show that PCA-aligned features satisfy two properties central to zero-shot GFM generalization: (i) equivalence on identical datasets (datasets that differ only by a permutation of feature dimensions or node order always yield invariant or equivalent graph representations) and (ii) bounded representations on latently identical datasets (non-identical datasets sharing the same latent space always produce graph representations within a bounded distance and prediction error). Building on this alignment method, we develop a Mini-GFM framework that is trained once across multiple datasets; at test time, it only requires PCA alignment of the new dataset and optimization of a shallow, task-specific linear head. Across diverse node- and graph-classification benchmarks, this approach delivers zero-shot performance competitive with existing baselines at substantially lower preprocessing cost. Together, these theoretical and empirical results validate the sufficiency of PCA alignment.
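A minimal sketch of the test-time procedure described above, under stated assumptions: scikit-learn is used for PCA and the linear head, and `gnn_encoder`, `X`, `y`, `edge_index`, `train_idx`, and `test_idx` are hypothetical names for a frozen pretrained GNN and the new dataset; none of these identifiers come from the paper itself.

```python
# Sketch: PCA-align a new dataset's node features to the pretrained input
# dimension, then fit only a linear head (no GNN weights are updated).
# Assumptions (not from the paper): scikit-learn PCA / LogisticRegression,
# a frozen pretrained `gnn_encoder`, and arrays X, y, edge_index,
# train_idx, test_idx describing the new graph dataset.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

def pca_align(X: np.ndarray, k: int) -> np.ndarray:
    """Project raw node features onto their top principal components so that
    every dataset shares the same input dimensionality k."""
    n_comp = min(k, X.shape[1])
    Z = PCA(n_components=n_comp).fit_transform(X)
    if Z.shape[1] < k:  # zero-pad datasets with fewer raw feature dimensions
        Z = np.pad(Z, ((0, 0), (0, k - Z.shape[1])))
    return Z

# Zero-shot adaptation on a new dataset: align features, encode with the
# frozen GNN, then optimize only a shallow task-specific linear head.
X_aligned = pca_align(X, k=64)           # k = pretrained model's input dimension
H = gnn_encoder(X_aligned, edge_index)   # frozen pretrained node representations
head = LogisticRegression(max_iter=1000).fit(H[train_idx], y[train_idx])
preds = head.predict(H[test_idx])
```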
Primary Area: learning on graphs and other geometries & topologies
Submission Number: 19509