Disentangling Invariant Subgraph via Variance Contrastive Estimation under Distribution Shifts

Published: 01 May 2025, Last Modified: 18 Jun 2025. ICML 2025 poster. License: CC BY-NC-SA 4.0
TL;DR: We learn disentangled invariant subgraphs via self-supervised contrastive estimation of variant subgraphs to achieve satisfactory OOD generalization.
Abstract: Graph neural networks (GNNs) have achieved remarkable success, yet most are developed under the in-distribution assumption and fail to generalize to out-of-distribution (OOD) environments. To tackle this problem, some graph invariant learning methods aim to learn invariant subgraphs that remain stable under distribution shifts, but they rely heavily on predefined or automatically generated environment labels. Directly annotating or estimating such labels from biased graph data is typically impractical or inaccurate for real-world graphs. Consequently, GNNs may become biased toward variant patterns, resulting in poor OOD generalization. In this paper, we propose to learn disentangled invariant subgraphs via self-supervised contrastive estimation of variant subgraphs to achieve satisfactory OOD generalization. Specifically, we first propose a GNN-based invariant subgraph generator that disentangles the invariant and variant subgraphs. We then estimate the degree of spurious correlation by conducting self-supervised contrastive learning on the variant subgraphs. Thanks to the accurate identification and estimation of the variant subgraphs, we can capture invariant subgraphs effectively and further eliminate spurious correlations via inverse propensity score reweighting. We provide theoretical analyses showing that our model can disentangle the ground-truth invariant and variant subgraphs for OOD generalization. Extensive experiments demonstrate the superiority of our model over state-of-the-art baselines.
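The pipeline in the abstract (mask-based disentanglement, contrastive estimation on variant subgraphs, and inverse propensity score reweighting) can be illustrated with a short sketch. Below is a minimal, self-contained PyTorch illustration under our own simplifying assumptions: dense adjacency matrices, a single message-passing step, and a crude similarity-based propensity proxy. All names (SubgraphGenerator, info_nce, reweighted_ce) are hypothetical and do not reflect the authors' released implementation.

```python
import torch
import torch.nn.functional as F
from torch import nn

class SubgraphGenerator(nn.Module):
    """Scores each edge; high-score edges form the soft invariant subgraph,
    and the complement forms the variant subgraph (our simplification)."""
    def __init__(self, dim):
        super().__init__()
        self.scorer = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, x, adj):
        n = x.size(0)
        src = x.unsqueeze(1).expand(n, n, -1)          # [N, N, d] source features
        dst = x.unsqueeze(0).expand(n, n, -1)          # [N, N, d] target features
        logits = self.scorer(torch.cat([src, dst], dim=-1)).squeeze(-1)
        mask = torch.sigmoid(logits)                   # soft edge mask in [0, 1]
        return adj * mask, adj * (1.0 - mask)          # invariant / variant adjacency

def gnn_readout(x, adj, weight):
    """One message-passing step plus sum pooling: a stand-in for any GNN encoder."""
    return F.relu(adj @ x @ weight).sum(dim=0)

def info_nce(z1, z2, tau=0.5):
    """Standard InfoNCE between two views of the variant-subgraph embeddings."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    sim = z1 @ z2.t() / tau                            # [B, B] similarity matrix
    return F.cross_entropy(sim, torch.arange(z1.size(0), device=z1.device))

def reweighted_ce(logits, y, variant_z, tau=0.5):
    """Inverse propensity score reweighting: graphs whose variant subgraphs look
    'typical' in the batch are down-weighted, so the classifier cannot lean on
    spurious patterns. The propensity proxy here is a deliberately crude guess."""
    z = F.normalize(variant_z, dim=-1)
    propensity = torch.softmax((z @ z.t()).mean(dim=1) / tau, dim=0)
    w = 1.0 / (propensity * propensity.numel() + 1e-6)  # inverse propensity weights
    w = (w / w.mean()).detach()                         # normalize; treat as constants
    return (w * F.cross_entropy(logits, y, reduction="none")).mean()

# Toy usage on B random graphs with N nodes and feature dim d.
torch.manual_seed(0)
B, N, d, C = 8, 6, 16, 2
gen, clf = SubgraphGenerator(d), nn.Linear(d, C)
W = torch.randn(d, d) * 0.1
xs, adjs = torch.randn(B, N, d), (torch.rand(B, N, N) > 0.5).float()
inv_embs, var_embs = [], []
for x, adj in zip(xs, adjs):
    a_inv, a_var = gen(x, adj)
    inv_embs.append(gnn_readout(x, a_inv, W))          # feeds the label classifier
    var_embs.append(gnn_readout(x, a_var, W))          # feeds propensity estimation
z_inv, z_var = torch.stack(inv_embs), torch.stack(var_embs)
z_var2 = z_var + 0.1 * torch.randn_like(z_var)         # noise as a stand-in augmentation
y = torch.randint(0, C, (B,))
loss = reweighted_ce(clf(z_inv), y, z_var) + info_nce(z_var, z_var2)
loss.backward()
```

In a real training loop the second contrastive view would come from graph augmentations rather than feature noise, and the propensity score would be estimated from the learned contrastive representation as the paper describes; the snippet only fixes the overall shape of the objective.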
Lay Summary: Graphs are everywhere in the real world, from molecules and recommendation systems to social networks. However, AI models trained on graphs often struggle to make accurate predictions when faced with unfamiliar or biased data. This challenge is commonly referred to as out-of-distribution (OOD) generalization failure. To address this problem, we propose a new approach that helps models focus on the stable, meaningful parts of a graph that reliably influence the outcome, while reducing the impact of misleading patterns that are specific to certain environments. Our method combines subgraph disentanglement, self-supervised contrastive learning, and reweighting techniques to eliminate spurious correlations. As a result, our approach enables AI models to generalize more effectively across diverse types of graph data and remain reliable under distribution shifts. This has important applications in real-world scenarios such as molecular property prediction, recommendation systems, and social network analysis, where data conditions frequently change and stable performance is critical.
Primary Area: General Machine Learning
Keywords: Disentanglement, Graph Neural Network, Distribution Shift
Submission Number: 1902