Learning Weighted Representations for Generalization Across Designs


Nov 03, 2017 (modified: Nov 03, 2017) ICLR 2018 Conference Blind Submission readers: everyone
  • Abstract: Predictive models that generalize well under distributional shift are often desirable and sometimes crucial to machine learning applications. One example is the estimation of treatment effects from observational data, where a subtask is to predict the effect of a treatment on subjects that are systematically different from those who received the treatment in the data. A related kind of distributional shift appears in unsupervised domain adaptation, where we are tasked with generalizing to a distribution of inputs that is different from the one in which we observe labels. We pose both of these problems as prediction under a shift in design. Popular methods for overcoming distributional shift are often heuristic or rely on assumptions that are rarely true in practice, such as having a well-specified model or knowing the policy that gave rise to the observed data. Other methods are hindered by their need for a pre-specified metric for comparing observations, or by poor asymptotic properties. In this work, we devise a family of algorithms to address these issues, by jointly learning a representation and a re-weighting of observed data. We show that our algorithms minimize an upper bound on the generalization error under design shift, and verify the effectiveness of this approach in causal effect estimation.
  • TL;DR: A theory and algorithmic framework for prediction under distributional shift, including causal effect estimation and domain adaptation
  • Keywords: Distributional shift, causal effects, domain adaptation
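
To make the idea of re-weighting observed data concrete, here is a minimal sketch of a re-weighted empirical risk under a shift in design. It is not the algorithm proposed in the paper, which learns the weights jointly with a representation. The sketch assumes the density ratio between the target and observed designs is known in closed form (two Gaussians with a shared variance), which the abstract notes is rarely the case in practice. The function names `weighted_risk` and `gaussian_density_ratio` and the toy squared loss are hypothetical and chosen only for illustration.

```python
import numpy as np


def weighted_risk(losses, weights):
    """Self-normalized re-weighted empirical risk: sum_i w_i * l_i / sum_i w_i."""
    losses = np.asarray(losses, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if losses.shape != weights.shape:
        raise ValueError("losses and weights must have the same shape")
    if np.any(weights < 0) or weights.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    return float(np.sum(weights * losses) / np.sum(weights))


def gaussian_density_ratio(x, mu_source, mu_target, sigma=1.0):
    """Ratio p_target(x) / p_source(x) for two Gaussians sharing sigma.

    Assumption for illustration only: both designs are known Gaussians.
    """
    x = np.asarray(x, dtype=float)
    # log p_target(x) - log p_source(x); the normalizing constants cancel.
    log_ratio = ((x - mu_source) ** 2 - (x - mu_target) ** 2) / (2 * sigma ** 2)
    return np.exp(log_ratio)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Observed design: x ~ N(0, 1). Target design: x ~ N(1, 1).
    x = rng.normal(loc=0.0, scale=1.0, size=5000)
    losses = x ** 2  # toy per-sample loss
    w = gaussian_density_ratio(x, mu_source=0.0, mu_target=1.0)
    print("unweighted risk (observed design):", losses.mean())
    print("re-weighted risk (estimate for target design):", weighted_risk(losses, w))
```

With uniform weights the estimate reduces to the ordinary sample mean of the losses. With the density-ratio weights it estimates the expected loss under the target design, at the cost of higher variance when the two designs overlap poorly.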