Ask Your Distribution Shift if Pre-Training is Right for You

TMLR Paper 2881 Authors

17 Jun 2024 (modified: 17 Sept 2024) · Under review for TMLR · CC BY 4.0
Abstract: Pre-training is a widely used approach to develop models that are robust to distribution shifts. However, in practice, its effectiveness varies: fine-tuning a pre-trained model improves robustness significantly in some cases but *not at all* in others (compared to training from scratch). In this work, we seek to characterize the failure modes that pre-training *can* and *cannot* address. In particular, we focus on two possible failure modes of models under distribution shift: poor extrapolation (e.g., they cannot generalize to a different domain) and biases in the training data (e.g., they rely on spurious features). Our study suggests that, as a rule of thumb, pre-training can help mitigate poor extrapolation but not dataset biases. After providing theoretical motivation and empirical evidence for this finding, we explore two of its implications for developing robust models: (1) pre-training and interventions designed to prevent models from exploiting biases have complementary robustness benefits, and (2) fine-tuning on a (very) small, non-diverse but *de-biased* dataset can produce significantly more robust models than fine-tuning on a large and diverse but biased dataset.
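As a rough illustration of the two training regimes the abstract compares, here is a minimal sketch (not the paper's code) of fine-tuning an ImageNet pre-trained ResNet-50 versus training the same architecture from scratch. The dataset, data loader, and hyperparameters are illustrative placeholders, not the paper's experimental setup.

```python
# Sketch: the two regimes whose robustness the abstract contrasts.
# Assumes torchvision >= 0.13 for the `weights` API; all training
# details below are placeholders.
import torch
import torch.nn as nn
from torchvision import models

def make_model(pretrained: bool, num_classes: int) -> nn.Module:
    # weights=... loads ImageNet pre-training; weights=None gives a
    # randomly initialized network, i.e., training "from scratch".
    weights = models.ResNet50_Weights.IMAGENET1K_V2 if pretrained else None
    model = models.resnet50(weights=weights)
    model.fc = nn.Linear(model.fc.in_features, num_classes)  # fresh head
    return model

def train(model: nn.Module, loader, epochs: int = 10) -> None:
    opt = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

# One would then evaluate both models on an in-distribution test set
# and on a shifted test set (e.g., a different domain). Per the
# abstract's rule of thumb, pre-training should narrow the robustness
# gap when the shift demands extrapolation, but not when the failure
# stems from biases (spurious features) in the fine-tuning data.
```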
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Jeffrey_Pennington1
Submission Number: 2881