Robust Generalization despite Distribution Shift via Minimum Discriminating Information

Tobias Sutter; Andreas Krause; Daniel Kuhn

Robust Generalization despite Distribution Shift via Minimum Discriminating Information

Tobias Sutter, Andreas Krause, Daniel Kuhn

Published: 09 Nov 2021, Last Modified: 26 May 2025NeurIPS 2021 PosterReaders: Everyone

Keywords: Distributionally robust optimization, distribution shift, stochastic programming, large deviations

Abstract: Training models that perform well under distribution shifts is a central challenge in machine learning. In this paper, we introduce a modeling framework where, in addition to training data, we have partial structural knowledge of the shifted test distribution. We employ the principle of minimum discriminating information to embed the available prior knowledge, and use distributionally robust optimization to account for uncertainty due to the limited samples. By leveraging large deviation results, we obtain explicit generalization bounds with respect to the unknown shifted distribution. Lastly, we demonstrate the versatility of our framework by demonstrating it on two rather distinct applications: (1) training classifiers on systematically biased data and (2) off-policy evaluation in Markov Decision Processes.

Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.

Supplementary Material: pdf

TL;DR: Robust Generalization despite Distribution Shift via Minimum Discriminating Information

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/robust-generalization-despite-distribution/code)

13 Replies

Loading