The Role of Pre-training Data in Transfer Learning

Published: 06 Mar 2023, Last Modified: 21 Apr 2024 · MRL 2023
Keywords: pre-training, transfer learning, data curation, CLIP, supervised learning, self-supervised learning, LAION
TL;DR: We study how the distribution, size, source, and curation method of pre-training data affect transfer performance.
Abstract: We explore which pre-training dataset should be used to achieve the best transfer learning performance. We investigate the impact of pre-training on few-shot and full fine-tuning performance using 7 pre-training datasets and 9 downstream datasets. Through extensive controlled experiments, we find that the choice of pre-training dataset is essential for few-shot transfer, but that its role decreases as more data is made available for fine-tuning. Additionally, we explore the role of data curation and examine the trade-offs between label noise and the size of the pre-training dataset. We find that using 2000× more pre-training data from LAION can match the performance of supervised ImageNet pre-training.
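One common way to measure how useful pre-trained features are for few-shot transfer, as studied in the abstract above, is to fit a lightweight classifier on a handful of labeled examples per class and evaluate on held-out data. The sketch below is illustrative only (it is not the paper's code): it uses a nearest-centroid probe on synthetic "pre-trained" features, and all names, dimensions, and data are hypothetical.

```python
import numpy as np

def few_shot_accuracy(train_x, train_y, test_x, test_y):
    """Classify test points by the nearest class centroid of the k-shot train set."""
    classes = np.unique(train_y)
    centroids = np.stack([train_x[train_y == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from each test point to each class centroid.
    dists = ((test_x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    preds = classes[dists.argmin(axis=1)]
    return (preds == test_y).mean()

# Synthetic stand-in for features from a pre-trained encoder: well-separated
# class clusters emulate a pre-training dataset that transfers well.
rng = np.random.default_rng(0)
n_classes, k_shot, dim = 10, 5, 64  # illustrative sizes, not from the paper
means = rng.normal(size=(n_classes, dim)) * 3.0
train_y = np.repeat(np.arange(n_classes), k_shot)
train_x = means[train_y] + rng.normal(size=(train_y.size, dim))
test_y = np.repeat(np.arange(n_classes), 20)
test_x = means[test_y] + rng.normal(size=(test_y.size, dim))

acc = few_shot_accuracy(train_x, train_y, test_x, test_y)
print(f"few-shot ({k_shot}-shot) accuracy: {acc:.2f}")
```

In a real transfer study, `train_x`/`test_x` would be features extracted from the downstream dataset by each pre-trained model, so the same probe can compare pre-training choices under a fixed few-shot budget.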
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:2302.13602/code)