Deceptive Risk Minimization: Out-of-Distribution Generalization by Deceiving Distribution Shift Detectors
Keywords: Out-of-distribution generalization, distribution shift detection, conformal martingales
TL;DR: We propose deception as a mechanism for OOD generalization: by learning data representations that make training data appear iid to an observer, we can identify stable features that eliminate spurious correlations and generalize to unseen domains.
Abstract: This paper proposes deception as a mechanism for out-of-distribution (OOD) generalization: by learning data representations that make training data appear independent and identically distributed (iid) to an observer, we can identify stable features that eliminate spurious correlations and generalize to unseen domains. We refer to this principle as deceptive risk minimization (DRM) and instantiate it with a practical differentiable objective that simultaneously learns features that eliminate distribution shifts from the perspective of a detector based on conformal martingales while minimizing a task-specific loss. In contrast to domain adaptation or prior invariant representation learning methods, DRM requires neither access to test data nor a partitioning of the training data into a finite number of data-generating domains. We demonstrate the efficacy of DRM on numerical experiments with concept shift and on a simulated imitation learning setting with covariate shift across the environments in which a robot is deployed.
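The abstract's objective (a task loss plus a differentiable penalty that is small when the training stream "looks iid" to a shift detector) can be illustrated with a minimal sketch. The paper's detector is based on conformal martingales; here, purely as an assumed stand-in, we use a crude differentiable proxy that penalizes mean-feature drift between the chronological halves of a batch. The function names `deception_penalty` and `drm_loss` and the weight `lam` are hypothetical, not from the paper.

```python
import numpy as np

def deception_penalty(feats):
    """Differentiable proxy for a distribution-shift detector:
    penalize mean-feature drift between the first and second half of a
    chronologically ordered batch. (Illustrative stand-in only; the
    paper's detector is built on conformal martingales.)"""
    n = len(feats) // 2
    early, late = feats[:n], feats[n:2 * n]
    return float(np.sum((early.mean(axis=0) - late.mean(axis=0)) ** 2))

def drm_loss(feats, task_loss, lam=1.0):
    # DRM-style objective: task loss plus a penalty that is small when
    # the training data appears iid to the detector proxy.
    return task_loss + lam * deception_penalty(feats)

rng = np.random.default_rng(0)
iid = rng.normal(size=(200, 8))  # stationary stream: no shift
shifted = np.concatenate([rng.normal(size=(100, 8)),
                          rng.normal(loc=2.0, size=(100, 8))])  # covariate shift midway
print(deception_penalty(iid) < deception_penalty(shifted))  # True
```

Minimizing such a penalty through a feature encoder would push the learned representation toward statistics that are stable over the training stream, which is the "deceive the detector" mechanism the abstract describes.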
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 7671