Improved Invariant Learning for Node-level Out-of-distribution Generalization on Graphs

23 Sept 2023 (modified: 11 Feb 2024)
Submitted to ICLR 2024
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Machine Learning, Out-of-distribution Generalization, Graph Machine Learning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Enhancing OOD generalization on graph data is an active research topic. Within this area, node-level OOD generalization remains underexplored and challenging. The difficulty of node-level OOD tasks lies in the fact that node representations are coupled through edges, making it difficult to characterize distribution shifts and capture invariant features. Furthermore, environment labels for nodes are typically expensive to obtain in practice, rendering invariant learning strategies based on environment partitioning infeasible. By establishing a theoretical model, we show that even with ground-truth environment partitioning, classical invariant learning methods such as IRM and VREx, which are designed for independently distributed training data, still capture spurious features when the depth of the GNN exceeds the width of a node's causal pattern (i.e., the invariant and predictive neighboring subgraph). However, we find both theoretically and empirically that enforcing Cross-environment Intra-class Alignment (CIA) of node representations removes the reliance on these spurious features. To harness the advantages of CIA and adapt it to graphs, we further propose Localized Reweighting CIA (LoRe-CIA), which requires neither environment labels nor intricate environment partitioning. Leveraging the neighboring structural information of graphs, LoRe-CIA adaptively selects node pairs that exhibit large differences in spurious features but minimal differences in causal features for alignment, enabling more efficient elimination of spurious features. Experiments on the GOOD benchmark show that LoRe-CIA achieves the best OOD generalization performance on average.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: pdf
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7627
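
The abstract describes CIA only in words, so the following is a minimal sketch of one way such an alignment penalty could look, not the authors' formulation. It assumes that environment labels are available (the plain CIA setting, not LoRe-CIA), that "alignment" is measured by squared Euclidean distance, and that the penalty averages over all node pairs sharing a class label but coming from different environments. The function name `cia_penalty` and the toy values are hypothetical.

```python
from itertools import combinations


def cia_penalty(reps, labels, envs):
    """Mean squared distance between same-class, cross-environment node pairs.

    Assumption: this distance and averaging scheme are illustrative choices;
    the abstract does not specify the exact objective.
    """
    total, count = 0.0, 0
    for i, j in combinations(range(len(reps)), 2):
        # Keep only pairs with the same class label but different environments.
        if labels[i] == labels[j] and envs[i] != envs[j]:
            total += sum((a - b) ** 2 for a, b in zip(reps[i], reps[j]))
            count += 1
    # With no qualifying pair there is nothing to align.
    return total / count if count else 0.0


# Toy example: four nodes with 2-D representations (hypothetical values).
reps = [[1.0, 0.0], [1.0, 2.0], [0.0, 1.0], [0.0, 1.0]]
labels = [0, 0, 1, 1]
envs = [0, 1, 0, 1]
penalty = cia_penalty(reps, labels, envs)
```

In training, a term like this would typically be added to the classification loss with a weighting coefficient. LoRe-CIA, as described in the abstract, replaces the environment labels with a selection of node pairs based on neighboring structure, which this sketch does not attempt to reproduce.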