Robust fine-tuning of zero-shot models

Mitchell Wortsman; Gabriel Ilharco; Jong Wook Kim; Mike Li; Simon Kornblith; Rebecca Roelofs; Raphael Gontijo-Lopes; Hanna Hajishirzi; Ali Farhadi; Hongseok Namkoong; Ludwig Schmidt

Published: 02 Dec 2021 | Last Modified: 04 May 2025 | NeurIPS 2021 Workshop DistShift Poster | Readers: Everyone

Keywords: robustness, zero-shot, fine-tuning, CLIP, distribution shift

TL;DR: We propose WiSE-FT, a method for fine-tuning zero-shot models that improves out-of-distribution accuracy.

Abstract: Large pre-trained models such as CLIP or ALIGN offer consistent accuracy across a range of data distributions when performing zero-shot inference (i.e., without fine-tuning on a specific dataset). Although existing fine-tuning approaches substantially improve accuracy in-distribution, they often reduce out-of-distribution robustness. We address this tension by introducing a simple and effective method for improving robustness: ensembling the weights of the zero-shot and fine-tuned models (WiSE-FT). Compared to standard fine-tuning, WiSE-FT provides large accuracy improvements out-of-distribution, while preserving high in-distribution accuracy. On ImageNet (in-distribution) and five derived distribution shifts, WiSE-FT improves out-of-distribution accuracy by 4 to 6 percentage points (pp) over prior work while increasing in-distribution accuracy by 1.6 pp. WiSE-FT achieves similarly large robustness improvements (2 to 23 pp) on a diverse set of six further distribution shifts, and in-distribution accuracy gains of 0.8 to 3.3 pp compared to standard fine-tuning on seven commonly used transfer learning datasets. These improvements come at no additional computational cost during fine-tuning or inference.
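The weight ensembling described in the abstract can be illustrated with a minimal sketch. It assumes that "ensembling the weights" means a per-parameter linear interpolation between the zero-shot and fine-tuned models, controlled by a mixing coefficient `alpha`. The function name `interpolate_weights`, the `Weights` type, and the toy parameter values are hypothetical and chosen only for illustration; a real model would store tensors rather than lists of floats.

```python
from typing import Dict, List

# Hypothetical stand-in for a model checkpoint:
# parameter name -> flattened list of parameter values.
Weights = Dict[str, List[float]]


def interpolate_weights(zero_shot: Weights, fine_tuned: Weights, alpha: float) -> Weights:
    """Return (1 - alpha) * zero_shot + alpha * fine_tuned, parameter by parameter.

    Assumption: alpha = 0 recovers the zero-shot model and alpha = 1 recovers
    the fine-tuned model.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if zero_shot.keys() != fine_tuned.keys():
        raise ValueError("both models must have the same parameter names")

    merged: Weights = {}
    for name, zs_values in zero_shot.items():
        ft_values = fine_tuned[name]
        if len(zs_values) != len(ft_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise convex combination of the two sets of weights.
        merged[name] = [
            (1.0 - alpha) * zs + alpha * ft
            for zs, ft in zip(zs_values, ft_values)
        ]
    return merged


# Toy example with made-up values.
zero_shot = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
fine_tuned = {"layer.weight": [1.0, 4.0], "layer.bias": [3.0]}
wise_ft = interpolate_weights(zero_shot, fine_tuned, alpha=0.5)
```

Because the result is a single set of weights with the same shape as either input, inference runs one model rather than two, which is consistent with the abstract's claim of no additional computational cost at inference time.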

Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/robust-fine-tuning-of-zero-shot-models/code)
