Learning Invariant Representations with Missing Data

Published: 09 Feb 2022, Last Modified: 22 Oct 2023, CLeaR 2022 Poster, Readers: Everyone
Keywords: invariant prediction, invariant representations, spurious correlations, shortcuts, missing data, missingness, inverse-weighted, IPCW, doubly-robust, doubly robust estimator, MMD
TL;DR: Objectives for invariant prediction enforce independencies between models and nuisance variables. These objectives are hard to estimate under missingness. We propose estimators for the MMD under nuisance missingness.
Abstract: Spurious correlations, or *shortcuts*, allow flexible models to predict well during training but poorly on related test populations. Recent work has shown that models satisfying particular independencies involving the correlation-inducing *nuisance* variable have guarantees on their test performance. Enforcing such independencies, however, requires nuisances to be observed during training, and nuisances such as demographics or image background labels are often missing. Enforcing independence on just the observed data does not imply independence on the entire population. In this work, we derive the missing-MMD estimator used for invariance objectives under missing nuisances. On simulations and clinical data, missing-MMDs enable improvements in test performance similar to those achieved by using fully-observed data.
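To make the idea of an inverse-weighted MMD concrete, the following is a minimal sketch and not the paper's estimator. It assumes a binary nuisance `z`, an RBF kernel, and a known, strictly positive probability of observing the nuisance for each sample (`propensity`). The names `rbf_kernel` and `weighted_mmd2` are hypothetical and chosen only for illustration. Each sample with an observed nuisance is reweighted by the inverse of its observation probability, so that the observed subset stands in for the full population.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """RBF kernel matrix between the rows of a (n, d) and b (m, d)."""
    sq = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth ** 2))


def weighted_mmd2(r, z, observed, propensity, bandwidth=1.0):
    """Inverse-weighted V-statistic estimate of MMD^2 between r | z=1 and r | z=0.

    r          : (n, d) representations, available for every sample
    z          : (n,) binary nuisance; entries where observed is False are ignored
    observed   : (n,) boolean, True where the nuisance was recorded
    propensity : (n,) assumed-known P(observed = True | covariates), all > 0
    """
    r = np.asarray(r, dtype=float)
    z = np.asarray(z, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    propensity = np.asarray(propensity, dtype=float)

    # Inverse-probability weights; samples with a missing nuisance get weight 0.
    w1 = np.where(observed & (z == 1), 1.0 / propensity, 0.0)
    w0 = np.where(observed & (z == 0), 1.0 / propensity, 0.0)
    if w1.sum() == 0 or w0.sum() == 0:
        raise ValueError("each nuisance group needs at least one observed sample")

    # Self-normalise the weights within each group.
    w1 = w1 / w1.sum()
    w0 = w0 / w0.sum()

    # Squared RKHS distance between the two weighted mean embeddings.
    k = rbf_kernel(r, r, bandwidth)
    return float(w1 @ k @ w1 + w0 @ k @ w0 - 2.0 * w1 @ k @ w0)
```

In practice the observation probabilities would have to be estimated, and the doubly robust variants listed in the keywords additionally use a model of the nuisance itself; neither is covered by this sketch.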
Supplementary Material: zip
Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 2 code implementations](https://www.catalyzex.com/paper/arxiv:2112.00881/code)