Keywords: Information theory, Partial Information Decomposition, Invariant graph representation, OOD generalization, Optimization
Abstract: Learning invariant graph representations for out-of-distribution (OOD) generalization remains challenging because the learned representations often retain spurious components. To address this challenge, this work brings in a tool from information theory, Partial Information Decomposition (PID), that goes beyond classical information-theoretic measures. PID disentangles the total information about the target variable $Y$ provided by the invariant subgraph $G_c$ and the spurious subgraph $G_s$ into four non-negative components: the unique information from $G_c$ and from $G_s$, respectively, the redundant information, and the synergistic information. We identify limitations of existing approaches to invariant representation learning that rely solely on classical information-theoretic measures, motivating the need to focus precisely on the redundant information about the target $Y$ that is shared between the spurious and invariant subgraphs, as isolated by PID. We then propose a new multi-level optimization framework, Redundancy-guided Invariant Graph learning (*RIG*), that maximizes this redundant information while separating spurious from causal subgraphs, enabling reliable OOD generalization under diverse distribution shifts. Our approach alternates between estimating a lower bound on the redundant information (which itself requires solving an inner optimization problem) and maximizing that bound together with other constraints. Experiments on both synthetic and real-world graph datasets demonstrate the effectiveness and generalization capabilities of RIG.
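For concreteness, the bivariate PID referenced above (in the sense of Williams and Beer) decomposes the joint mutual information as shown below; the symbols $U$, $R$, and $S$ are generic notation for the unique, redundant, and synergistic terms, not necessarily the paper's own:

$$
I(Y; G_c, G_s) \;=\; \underbrace{U(Y; G_c)}_{\text{unique to } G_c} \;+\; \underbrace{U(Y; G_s)}_{\text{unique to } G_s} \;+\; \underbrace{R(Y; G_c, G_s)}_{\text{redundant}} \;+\; \underbrace{S(Y; G_c, G_s)}_{\text{synergistic}},
$$

with the standard consistency constraints $I(Y; G_c) = U(Y; G_c) + R(Y; G_c, G_s)$ and $I(Y; G_s) = U(Y; G_s) + R(Y; G_c, G_s)$.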
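As a rough illustration of the multi-level structure described in the abstract, the sketch below alternates an inner ascent step that tightens a critic-based lower bound on the redundant information with an outer step that updates the subgraph extractor and predictor. Everything here (the `extractor`/`critic`/`predictor` modules, the weight `lam`, and the particular bound) is a hypothetical placeholder used only to show the alternating pattern, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def train_step(extractor, critic, predictor, graphs, labels,
               opt_model, opt_critic, inner_steps=5, lam=1.0):
    """One alternating update. `extractor` splits each graph into a
    candidate invariant subgraph G_c and spurious subgraph G_s;
    `critic` scores a variational lower bound on Red(Y; G_c, G_s).
    All three modules are illustrative stand-ins."""

    # Inner loop: estimating the redundancy lower bound is itself an
    # optimization, so ascend the critic while the extractor is frozen.
    for _ in range(inner_steps):
        with torch.no_grad():
            g_c, g_s = extractor(graphs)
        red_lb = critic(g_c, g_s, labels)   # scalar lower bound on redundancy
        opt_critic.zero_grad()
        (-red_lb).backward()                # gradient ascent tightens the bound
        opt_critic.step()

    # Outer step: update extractor + predictor, maximizing the tightened
    # redundancy bound alongside a standard prediction loss on G_c.
    g_c, g_s = extractor(graphs)
    pred_loss = F.cross_entropy(predictor(g_c), labels)
    red_lb = critic(g_c, g_s, labels)
    loss = pred_loss - lam * red_lb         # minus sign: maximize redundancy
    opt_model.zero_grad()
    loss.backward()
    opt_model.step()
    return loss.item()
```

In this reading, pushing the redundancy term up drives any label-relevant information in $G_s$ to be shared with $G_c$ rather than unique to $G_s$, which is one way the abstract's "maximize redundant information" objective can be operationalized.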
Submission Number: 102