Keywords: Causality, Distribution Shift, Dataset Shift, Robustness
TL;DR: We give a method for evaluating predictive performance under parameterized changes in distribution.
Abstract: We give a method for proactively identifying small, plausible shifts in distribution that lead to large differences in model performance. To ensure that these shifts are plausible, we parameterize them in terms of interpretable changes in the causal mechanisms of observed variables. This defines a parametric robustness set of plausible distributions and a corresponding worst-case loss. We construct a local approximation to the loss under shift, and show that the problem of finding worst-case shifts can be efficiently solved.
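
To make the last step concrete, here is a minimal sketch of what a worst-case search over a local approximation might look like. The abstract does not specify the form of the approximation or of the robustness set, so everything below is an assumption made for illustration: the loss under a shift parameter `delta` is taken to be a second-order expansion `loss0 + g @ delta + 0.5 * delta @ H @ delta`, the robustness set is taken to be a Euclidean ball of radius `radius`, and the names `g`, `H`, `radius`, `quad_value`, and `worst_case_shift` are hypothetical. Under those assumptions the problem is a standard trust-region subproblem, which can be solved with an eigendecomposition and a one-dimensional bisection.

```python
import numpy as np


def quad_value(delta, g, H):
    """Change in the approximated loss at shift `delta` (assumed quadratic form)."""
    return g @ delta + 0.5 * delta @ H @ delta


def worst_case_shift(g, H, radius, n_iter=200):
    """Maximize g @ d + 0.5 * d @ H @ d subject to ||d|| <= radius.

    Illustrative sketch only: the degenerate "hard case" of the trust-region
    problem (g orthogonal to the top eigenvector of H) is not handled.
    """
    H = 0.5 * (H + H.T)                      # symmetrize for safety
    eigvals, eigvecs = np.linalg.eigh(H)     # eigenvalues in ascending order
    g_rot = eigvecs.T @ g                    # gradient in the eigenbasis
    lam_max = eigvals[-1]

    # If the quadratic is strictly concave, the unconstrained maximizer
    # may already lie inside the ball.
    if lam_max < 0:
        delta = -np.linalg.solve(H, g)
        if np.linalg.norm(delta) <= radius:
            return delta

    # Otherwise the maximizer is on the boundary: delta(mu) = (mu*I - H)^{-1} g
    # with mu > max(lam_max, 0) chosen so that ||delta(mu)|| = radius.
    def step_norm(mu):
        return np.linalg.norm(g_rot / (mu - eigvals))

    lo = max(lam_max, 0.0)                          # norm exceeds radius near lo
    hi = lo + np.linalg.norm(g) / radius + 1.0      # norm is below radius at hi
    for _ in range(n_iter):                         # bisection on mu
        mid = 0.5 * (lo + hi)
        if step_norm(mid) > radius:
            lo = mid
        else:
            hi = mid
    return eigvecs @ (g_rot / (hi - eigvals))
```

The design choice worth noting is that the maximization is non-convex whenever `H` has a positive eigenvalue, yet the ball constraint still allows a global solution through a scalar search over the multiplier `mu`. This is one standard reason a quadratic local approximation can make a worst-case search tractable; whether it matches the construction used in the paper is not stated in the abstract.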