Analysis of Task Transferability in Large Pre-trained Classifiers

Published: 07 Nov 2023, Last Modified: 13 Dec 2023. Venue: M3L 2023 Poster
Keywords: Transfer Learning, Distribution Shift
TL;DR: We propose a Task Transfer Analysis approach for analyzing how performance transfers to downstream classification tasks after linear fine-tuning.
Abstract: Transfer learning is a cornerstone of modern machine learning, enabling models to transfer the knowledge acquired from a source task to downstream target tasks with minimal fine-tuning. However, the relationship between the source task performance and the downstream target task performance (i.e., transferability) is poorly understood. In this work, we rigorously analyze the transferability of large pre-trained models on downstream classification tasks after linear fine-tuning. We use a novel Task Transfer Analysis approach that transforms the distribution (and classifier) of the source task to produce a new distribution (and classifier) similar to that of the target task. Using this, we propose an upper bound on transferability composed of the Wasserstein distance between the transformed source and the target distributions, the conditional entropy between the label distributions of the two tasks, and the weighted loss of the source classifier on the source task. We propose an optimization problem that minimizes the proposed bound to estimate transferability. Using state-of-the-art pre-trained models, we show that the proposed upper bound accurately estimates transferability on various datasets and demonstrates the importance of high relatedness between the source and target tasks for achieving high transferability.
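As a rough schematic of the bound described above (illustrative notation only, not the paper's exact statement): writing $\tilde{P}_S$ for the transformed source distribution, $P_T$ for the target distribution, $Y_S, Y_T$ for the source and target labels, and $h_S$ for the source classifier, the three terms might combine as
\[
\text{transferability gap} \;\lesssim\; W\big(\tilde{P}_S,\, P_T\big) \;+\; H\big(Y_T \mid Y_S\big) \;+\; \epsilon^{w}_S\big(h_S\big),
\]
where $W$ denotes the Wasserstein distance between the transformed source and target distributions, $H(Y_T \mid Y_S)$ the conditional entropy between the two tasks' label distributions, and $\epsilon^{w}_S(h_S)$ the weighted loss of the source classifier on the source task. Per the abstract, transferability is then estimated by solving an optimization problem that minimizes this bound.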
Submission Number: 25