Calibrated Ensembles: A Simple Way to Mitigate ID-OOD Accuracy Tradeoffs

Ananya Kumar; Aditi Raghunathan; Tengyu Ma; Percy Liang

Published: 02 Dec 2021, Last Modified: 05 May 2023

NeurIPS 2021 Workshop DistShift Poster

Readers: Everyone

Keywords: calibration, distribution shift, ensembles

TL;DR: Robustness interventions (such as removing spurious correlations) improve OOD accuracy at the cost of decreasing ID accuracy. We show that calibrated ensembles are a simple and effective solution to this problem.

Abstract: We often see undesirable tradeoffs in robust machine learning where out-of-distribution (OOD) accuracy is at odds with in-distribution (ID) accuracy. A ‘robust’ classifier obtained via specialized techniques like removing spurious features has better OOD but worse ID accuracy compared to a ‘standard’ classifier trained via vanilla ERM. On six distribution shift datasets, we find that simply ensembling the standard and robust models is a strong baseline: we match the ID accuracy of a standard model with only a small drop in OOD accuracy compared to the robust model. However, calibrating these models in-domain surprisingly improves the OOD accuracy of the ensemble and completely eliminates the tradeoff: we achieve the best of both ID and OOD accuracy over the original models.
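
The recipe described in the abstract (calibrate each model in-domain, then ensemble) might be sketched as follows. The abstract does not say which calibration method or combination rule is used, so this sketch assumes temperature scaling fitted by grid search on held-out ID data, followed by a plain average of the two calibrated probability vectors. All function names and the toy logits are hypothetical and chosen only for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax of one logit vector after dividing by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def nll(logits_batch, labels, temperature):
    """Mean negative log-likelihood of the labels at a given temperature."""
    losses = []
    for logits, y in zip(logits_batch, labels):
        p = softmax(logits, temperature)
        losses.append(-math.log(max(p[y], 1e-12)))
    return sum(losses) / len(losses)

def fit_temperature(logits_batch, labels, grid=None):
    """Assumed calibration step: pick the temperature that minimizes
    NLL on held-out in-distribution data, via a simple grid search."""
    if grid is None:
        grid = [0.05 * k for k in range(1, 201)]  # roughly 0.05 to 10
    return min(grid, key=lambda t: nll(logits_batch, labels, t))

def calibrated_ensemble(std_logits, rob_logits, t_std, t_rob):
    """Assumed combination rule: average the two
    temperature-scaled probability vectors."""
    p_std = softmax(std_logits, t_std)
    p_rob = softmax(rob_logits, t_rob)
    return [(a + b) / 2 for a, b in zip(p_std, p_rob)]

# Toy ID validation set (hypothetical): the "standard" model is
# overconfident and often wrong, the "robust" one is underconfident
# but always right.
val_labels = [0, 1, 0, 1]
std_val_logits = [[8.0, 0.0], [0.0, 8.0], [0.0, 8.0], [8.0, 0.0]]
rob_val_logits = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]

t_std = fit_temperature(std_val_logits, val_labels)
t_rob = fit_temperature(rob_val_logits, val_labels)

# A test point on which the two models disagree.
std_test, rob_test = [8.0, 0.0], [0.0, 1.0]
uncalibrated = calibrated_ensemble(std_test, rob_test, 1.0, 1.0)
calibrated = calibrated_ensemble(std_test, rob_test, t_std, t_rob)
print("temperatures:", t_std, t_rob)
print("uncalibrated ensemble:", uncalibrated)
print("calibrated ensemble:", calibrated)
```

In this toy setting the uncalibrated average is dominated by whichever model produces the larger logits, while the calibrated average weights each model by how reliable its confidence was on ID data. That is one plausible intuition for why calibration could help the ensemble; it is not a reproduction of the paper's experiments.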
