Newer is not always better: Rethinking transferability metrics, their peculiarities, stability and performance

Shibal Ibrahim; Natalia Ponomareva; Rahul Mazumder

Published: 02 Dec 2021, Last Modified: 04 May 2025, NeurIPS 2021 Workshop DistShift Poster, Readers: Everyone

Keywords: transferability metrics, fine-tuning, transfer learning, domain adaptation

TL;DR: Improved transferability estimation for fine-tuning, especially in small-sample regimes.

Abstract: Fine-tuning large pre-trained image and language models on small customized datasets has become increasingly popular for improved prediction and efficient use of limited resources. Fine-tuning requires identifying the best models to transfer-learn from, and quantifying transferability avoids expensive re-training on all candidate model/task pairs. In this paper, we show that statistical problems with covariance estimation drive the poor performance of H-score [1], a common baseline for newer metrics, and we propose a shrinkage-based estimator. This results in up to 80% absolute gain in H-score correlation performance, making it competitive with the state-of-the-art LogME measure [26]. Our shrinkage-based H-score is 3 to 55 times faster than LogME. Additionally, we look into the less common setting of target (as opposed to source) task selection. We highlight previously overlooked problems in such settings (different numbers of labels, class-imbalance ratios, etc.) for some recent metrics, e.g., NCE [24] and LEEP [18], that led to these metrics being misrepresented as leading measures. We propose a correction and recommend measuring correlation performance against relative accuracy in such settings. We support our findings with ~65,000 experiments (fine-tuning trials).
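As a rough illustration of the idea in the abstract, the sketch below computes the H-score, tr(cov(f)^-1 cov(E[f|y])), and optionally regularizes the feature covariance by linear shrinkage toward a scaled identity matrix. The abstract does not state the exact form of the proposed estimator, so the shrinkage target, the function name `h_score`, and the fixed intensity `alpha` are assumptions made for illustration only; this is not the authors' implementation.

```python
import numpy as np


def h_score(features, labels, alpha=0.0):
    """H-score tr(cov(f)^-1 cov(E[f|y])) with optional linear shrinkage.

    `alpha` is a hypothetical shrinkage intensity in [0, 1]: the sample
    covariance S is replaced by (1 - alpha) * S + alpha * mu * I, where mu
    is the mean eigenvalue of S (trace divided by dimension).
    """
    f = np.asarray(features, dtype=float)
    y = np.asarray(labels)
    f = f - f.mean(axis=0, keepdims=True)  # center the features
    n, d = f.shape

    cov_f = f.T @ f / n
    mu = np.trace(cov_f) / d
    cov_shrunk = (1.0 - alpha) * cov_f + alpha * mu * np.eye(d)

    # Replace each row by its class-conditional mean, g = E[f | y].
    g = np.zeros_like(f)
    for c in np.unique(y):
        mask = y == c
        g[mask] = f[mask].mean(axis=0)
    cov_g = g.T @ g / n

    # The pseudo-inverse also covers the rank-deficient case alpha == 0.
    return float(np.trace(np.linalg.pinv(cov_shrunk) @ cov_g))


# Synthetic small-sample example: 20 samples, 50 feature dimensions.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=20)
features = rng.normal(size=(20, 50)) + labels[:, None]
score_shrunk = h_score(features, labels, alpha=0.1)
```

With fewer samples than feature dimensions, as in this example, the unshrunk sample covariance is singular, which is the kind of estimation problem the abstract attributes the poor H-score performance to. A data-driven choice of `alpha` would replace the fixed value used here.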

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/newer-is-not-always-better-rethinking/code)
