Bayesian Invariance Modeling of Multi-Environment Data

Published: 06 Mar 2025, Last Modified: 01 May 2025, Venue: SCSL @ ICLR 2025, License: CC BY 4.0
Track: regular paper (up to 6 pages)
Keywords: invariance; multi-environment; Bayesian modeling; variational inference
TL;DR: We develop a scalable Bayesian model for invariance modeling of multi-environment data.
Abstract:

Peters et al. (2016) introduced the problem of invariant modeling. In this problem, we observe feature/outcome data from multiple environments, and our goal is to identify the invariant features: those whose predictive relationship with the outcome is stable across environments. Identifying such features is important for robust generalization to new environments and for uncovering causal mechanisms. While previous methods primarily tackle this problem through hypothesis testing or regularized optimization, we take a Bayesian approach. We develop a probabilistic model of multi-environment data in which the indices of the invariant features are encoded as a latent variable. Under the same data-generating assumptions as Peters et al. (2016), we show that posterior inference in our model targets the true invariant features. We prove that this posterior is consistent, and we provide theoretical results on the posterior contraction rate. In particular, we show that, under a certain metric, greater heterogeneity among environments leads to faster contraction of the posterior. For settings where the number of features is large, we design an efficient variational inference algorithm to approximate the posterior. On both simulated and real-world data, we show that Bayesian invariance is more accurate and more scalable than existing approaches.

Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 45