The Trade-off between Universality and Label Efficiency of Representations from Contrastive Learning

Zhenmei Shi; Jiefeng Chen; Kunyang Li; Jayaram Raghuram; Xi Wu; Yingyu Liang; Somesh Jha

Published: 01 Feb 2023, Last Modified: 26 May 2025, ICLR 2023 notable top 25%, Readers: Everyone

Keywords: Contrastive Learning, Self-Supervised Learning, Foundation Model, Complexity

TL;DR: We focus on contrastive learning and systematically study a trade-off between label efficiency and universality both empirically and theoretically.

Abstract: Pre-training representations (a.k.a. foundation models) has recently become a prevalent learning paradigm, where one first pre-trains a representation using large-scale unlabeled data, and then learns simple predictors on top of the representation using a small amount of labeled data from the downstream tasks. There are two key desiderata for the representation: label efficiency (the ability to learn an accurate classifier on top of the representation with a small amount of labeled data) and universality (usefulness across a wide range of downstream tasks). In this paper, we focus on one of the most popular instantiations of this paradigm: contrastive learning with linear probing, i.e., learning a linear predictor on the representation pre-trained by contrastive learning. We show that there exists a trade-off between the two desiderata so that one may not be able to achieve both simultaneously. Specifically, we provide analysis using a theoretical data model and show that, while more diverse pre-training data result in more diverse features for different tasks (improving universality), they put less emphasis on task-specific features, giving rise to larger sample complexity for downstream supervised tasks, and thus worse prediction performance. Guided by this analysis, we propose a contrastive regularization method to improve the trade-off. We validate our analysis and method empirically with systematic experiments using real-world datasets and foundation models.
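For readers unfamiliar with the setup named in the abstract, the following is a minimal sketch of "contrastive learning with linear probing". It rests on assumptions not taken from the paper: a generic InfoNCE-style loss stands in for the pre-training objective, a ridge-regression fit to one-hot targets stands in for the linear probe, and the names `info_nce_loss`, `fit_linear_probe`, and `predict` are hypothetical. It does not implement the contrastive regularization method the abstract proposes.

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.5):
    """InfoNCE-style loss (an assumed stand-in, not the paper's objective).

    z1[i] and z2[i] are embeddings of two views of example i;
    z2[j] for j != i act as negatives for z1[i].
    """
    # L2-normalize so dot products are cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature                      # (n, n) similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs sit on the diagonal.
    return -np.mean(np.diag(log_prob))

def fit_linear_probe(features, labels, num_classes, ridge=1e-3):
    """Fit a linear predictor on frozen features (ridge regression to one-hot targets)."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])  # append bias column
    Y = np.eye(num_classes)[labels]
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)

def predict(W, features):
    """Predict class indices from frozen features and probe weights W."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return np.argmax(X @ W, axis=1)
```

In this sketch, label efficiency corresponds to how few rows `fit_linear_probe` needs to reach a given accuracy on one task, and universality to how well the same frozen features serve probes for many different tasks.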

Supplementary Material: zip

Please Choose The Closest Area That Your Submission Falls Into: Unsupervised and Self-supervised learning

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/the-trade-off-between-universality-and-label/code)
