T-Measure: A Measure for Model Transferability

21 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Transfer Estimation, Transfer Learning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: A popular paradigm in AI modeling, spanning computer vision, natural language processing, and graph modeling, is to apply a large pre-trained model that has been fine-tuned for a particular task to novel datasets. However, many such models are published in model repositories, each fine-tuned on a different type of source data. Consequently, practitioners face the problem of model selection -- choosing the best model for their task from a repository of models. Model performance in a target domain depends on factors including the task definition, model architecture, data distribution, and model transfer method. Previous model selection methods in transfer learning focus on the task definition when assessing transferability and often require a labeled dataset in the target domain. We formulate the transfer problem as label-agnostic model selection, where the goal is to choose the best-performing model on a target domain without access to labeled data. Specifically, we analyze the impact of source-domain training data on model transferability. To measure this transferability, we introduce a new type of quantitative measure, the T-Measure, which correlates with the test-time performance of a model on an unlabeled target domain. We propose a T-Measure estimation method that combines distributional measures of the source domain's training instances, the distribution of the target domain's instances, and the base performance of a task-specific model to produce a ranking of models. We then adapt previous task-centric transferability measures for data-centric selection and compare them against T-Measure. We thoroughly evaluate T-Measure on 4 tasks and 11 datasets and show its effectiveness in ranking models for model selection compared to baselines.
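As a concrete illustration of the data-centric selection setting described above, the sketch below scores each candidate model by combining its base source-domain performance with a label-agnostic measure of distribution shift between its source training features and the unlabeled target features. This is a minimal sketch, not the paper's actual estimator: the scoring formula, the per-dimension Wasserstein proxy for shift, and the names `t_measure`, `rank_models`, and the weight `alpha` are all illustrative assumptions.

```python
# Hypothetical sketch of a data-centric transferability score in the
# spirit of T-Measure. The abstract does not give the exact formula;
# here we assume score = base_performance - alpha * distribution_shift.
import numpy as np
from scipy.stats import wasserstein_distance


def t_measure(source_feats, target_feats, base_performance, alpha=1.0):
    """Score a candidate model; higher should correlate with target accuracy.

    source_feats : (n_s, d) array of features of the model's fine-tuning data
    target_feats : (n_t, d) array of features of unlabeled target instances
    base_performance : scalar, e.g. source-domain validation accuracy
    """
    # Mean per-dimension 1-D Wasserstein distance as a cheap, label-agnostic
    # proxy for distribution shift (an assumption, not the paper's method).
    shift = np.mean([
        wasserstein_distance(source_feats[:, j], target_feats[:, j])
        for j in range(source_feats.shape[1])
    ])
    return base_performance - alpha * shift


def rank_models(candidates, target_feats):
    """candidates: list of (name, source_feats, base_performance) tuples."""
    scored = [
        (name, t_measure(s_feats, target_feats, perf))
        for name, s_feats, perf in candidates
    ]
    return sorted(scored, key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.normal(0.0, 1.0, size=(500, 8))
    candidates = [
        # A weaker model trained on data close to the target distribution
        ("model_close", rng.normal(0.1, 1.0, size=(500, 8)), 0.82),
        # A stronger model trained on data far from the target distribution
        ("model_far", rng.normal(2.0, 1.5, size=(500, 8)), 0.90),
    ]
    for name, score in rank_models(candidates, target):
        print(f"{name}: {score:.3f}")
```

Under these assumptions, the ranking can prefer a model with lower source accuracy whose training data better matches the target, which is the intuition behind data-centric selection without target labels.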
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4045