Analysis of Task Transferability in Large Pre-trained Classifiers

18 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: Transfer Learning, Distribution Shift
TL;DR: We propose a Task Transfer Analysis approach for characterizing how performance transfers to downstream classification tasks after linear fine-tuning.
Abstract: Transfer learning is a cornerstone of modern machine learning, enabling models to transfer the knowledge acquired from a source task to downstream target tasks with minimal fine-tuning. However, the relationship between the source task performance and the downstream target task performance (i.e., transferability) is not well understood. In this work, we rigorously analyze the transferability of large pre-trained models on downstream classification tasks after linear fine-tuning. We use a novel Task Transfer Analysis approach that transforms the distribution (and classifier) of the source task to produce a new distribution (and classifier) similar to that of the target task. Using this, we propose an upper bound on transferability composed of the Wasserstein distance between the transformed source and the target distributions, the conditional entropy between the label distributions of the two tasks, and the weighted loss of the source classifier on the source task. We propose an optimization problem that minimizes the proposed bound to estimate transferability. Using state-of-the-art pre-trained models, we show that the proposed upper bound accurately estimates transferability on various datasets and demonstrates the importance of high relatedness between the source and target tasks for achieving high transferability.
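Note (illustrative): to make the structure of the bound described in the abstract concrete, a schematic form is sketched below. The notation is ours, not the paper's, and weighting factors and constants are omitted; the exact statement and the transformation producing the transformed source distribution are given in the paper. Here $\tilde{\mathcal{D}}_S$ denotes the transformed source distribution, $\mathcal{D}_T$ the target distribution, $Y_S$ and $Y_T$ the source and target labels, and $h_S$ the source classifier:

\[
\underbrace{\mathrm{err}_T}_{\text{target-task loss (transferability)}} \;\lesssim\; \underbrace{W\big(\tilde{\mathcal{D}}_S,\, \mathcal{D}_T\big)}_{\text{Wasserstein distance}} \;+\; \underbrace{H\big(Y_T \mid Y_S\big)}_{\text{conditional entropy of labels}} \;+\; \underbrace{\ell_w\big(h_S;\, \mathcal{D}_S\big)}_{\text{weighted source loss}}
\]

The estimation procedure then minimizes the right-hand side over the allowed transformations of the source distribution and classifier to obtain a transferability estimate.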
Supplementary Material: zip
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1465