The Trade-off between Label Efficiency and Universality of Representations from Contrastive Learning

Zhenmei Shi; Jiefeng Chen; Kunyang Li; Jayaram Raghuram; Xi Wu; Yingyu Liang; Somesh Jha

26 May 2022 (modified: 05 May 2023), ICML 2022 Pre-training Workshop, Readers: Everyone

Keywords: Contrastive Learning, Self-Supervised Learning, Foundation Model, Complexity

TL;DR: We focus on contrastive learning and systematically study a trade-off between label efficiency and universality both empirically and theoretically.

Abstract: The pre-trained representation learning paradigm has recently become a popular approach to addressing distribution shift and limited training data. This approach first pre-trains a representation function on large unlabeled datasets from multiple tasks using self-supervised (e.g., contrastive) learning, and then learns a simple classifier on top of the representation using small labeled datasets from the downstream target tasks. The representation should have two key properties: label efficiency (i.e., the ability to learn an accurate classifier from a small amount of labeled data) and universality (i.e., usefulness across a wide range of downstream tasks). In this paper, we focus on contrastive learning and systematically study the trade-off between label efficiency and universality, both theoretically and empirically. We empirically show that this trade-off exists across different models and datasets. Theoretically, we propose a data model with a hidden representation and provide an analysis in a simplified linear setting. Our analysis shows that, compared to pre-training on the target task, pre-training on diverse tasks leads to a larger sample complexity for learning the optimal classifier, and thus to worse prediction performance.
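
For readers unfamiliar with the pre-training step, the following is a minimal sketch of one common contrastive objective, an InfoNCE-style loss over paired embeddings. The abstract does not state which loss the authors use, so the function name `info_nce_loss`, the `temperature` default, and the use of NumPy are illustrative assumptions, not the paper's implementation.

```python
import numpy as np


def info_nce_loss(z_a, z_b, temperature=0.5):
    """InfoNCE-style contrastive loss (illustrative, not the paper's code).

    z_a[i] and z_b[i] are embeddings of two views of example i (a positive
    pair); z_b[j] for j != i act as negatives for z_a[i].
    """
    # L2-normalize so that dot products are cosine similarities.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)

    # Pairwise similarity matrix of shape (n, n), scaled by the temperature.
    logits = (z_a @ z_b.T) / temperature

    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Positive pairs sit on the diagonal; average their negative log-probability.
    return float(-np.mean(np.diag(log_prob)))


# Hypothetical data: correctly matched views should score a lower loss
# than mismatched ones.
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
loss_aligned = info_nce_loss(z, z)
loss_shuffled = info_nce_loss(z, np.roll(z, 1, axis=0))
```

In the pipeline the abstract describes, a loss of this kind would be minimized over the parameters of the representation function on unlabeled data, after which a simple (e.g., linear) classifier is fit on the frozen representation using the small labeled dataset.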
