Calibrated ensembles - a simple way to mitigate ID-OOD accuracy tradeoffsDownload PDF

Published: 28 Jan 2022, Last Modified: 13 Feb 2023ICLR 2022 SubmittedReaders: Everyone
Keywords: distribution shift, calibration, ensembles
Abstract: We often see undesirable tradeoffs in robust machine learning where out-of-distribution (OOD) accuracy is at odds with in-distribution (ID) accuracy. A ‘robust’ classifier obtained via specialized techniques like removing spurious features has better OOD but worse ID accuracy compared to a ‘standard’ classifier trained via vanilla ERM. On six distribution shift datasets, we find that simply ensembling the standard and robust models is a strong baseline---we match the ID accuracy of a standard model with only a small drop in OOD accuracy compared to the robust model. However, calibrating these models in-domain surprisingly improves the OOD accuracy of the ensemble and completely eliminates the tradeoff and we achieve the best of both ID and OOD accuracy over the original models.
One-sentence Summary: Robustness interventions (such as removing spurious correlations) improve OOD accuracy at the cost of decreasing ID accuracy - we show that calibrated ensembles are a simple and effective solution to this problem
22 Replies

Loading