ESCo: Towards Provably Effective and Scalable Contrastive Representation Learning

29 Sept 2021 (modified: 13 Feb 2023) | ICLR 2022 Conference Withdrawn Submission
Keywords: Contrastive Learning, Unsupervised Representation Learning
Abstract: InfoNCE-based contrastive learning models (e.g., MoCo, SimCLR) have shown impressive power in unsupervised representation learning by maximizing a tight lower bound on the mutual information between two views' representations. However, the quadratic complexity of the InfoNCE objective makes it hard to scale to larger batch sizes, and recent research suggests that it may exploit superfluous information that is useless for downstream prediction tasks. In this paper, we propose ESCo (Effective and Scalable Contrastive), a new contrastive framework that is essentially an instantiation of the Information Bottleneck principle in the self-supervised learning setting. Specifically, ESCo optimizes a new objective that maximizes the similarity between the representations of positive pairs and minimizes the pairwise kernel potential of negative pairs, with a provable guarantee that the learned representations preserve task-relevant information and discard task-irrelevant information. Furthermore, to avoid the quadratic time and memory costs, we leverage random features to obtain an accurate approximation with linear scalability. We show that the vanilla InfoNCE objective is a degenerate case of ESCo, which implies that ESCo can potentially boost existing InfoNCE-based models. To verify our method, we conduct extensive experiments on both synthetic and real-world datasets, showing superior performance over InfoNCE-based baselines on (unsupervised) representation learning tasks for images and graphs.
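
The abstract does not spell out the approximation, but the linear-scalability claim matches the standard random Fourier features construction (Rahimi & Recht, 2007): for a shift-invariant kernel such as the Gaussian, the sum of pairwise kernel potentials over N representations can be estimated in O(N) feature-map evaluations instead of O(N^2) pairwise comparisons. Below is a minimal NumPy sketch of that idea; the Gaussian kernel, function names, and hyperparameters are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def rff_features(X, num_feats, gamma, rng):
    """Random Fourier features for the Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2).

    Rahimi & Recht (2007): phi(x) = sqrt(2/D) * cos(W^T x + b), with
    W ~ N(0, 2*gamma*I) and b ~ Uniform[0, 2*pi], so that
    E[phi(x)^T phi(y)] = k(x, y).
    """
    n, d = X.shape
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, num_feats))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_feats)
    return np.sqrt(2.0 / num_feats) * np.cos(X @ W + b)

def pairwise_potential_exact(X, gamma):
    """O(N^2) reference: sum of Gaussian-kernel potentials over all pairs."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return float(np.exp(-gamma * d2).sum())

def pairwise_potential_rff(X, gamma, num_feats=2048, seed=0):
    """O(N*D) approximation: sum_{i,j} k(x_i, x_j) ~= ||sum_i phi(x_i)||^2,
    so the quadratic pairwise sum collapses to one squared norm."""
    rng = np.random.default_rng(seed)
    Phi = rff_features(X, num_feats, gamma, rng)
    s = Phi.sum(axis=0)
    return float(s @ s)

# Toy usage: representations on the unit sphere, as is typical in contrastive learning.
X = np.random.default_rng(1).normal(size=(512, 64))
X /= np.linalg.norm(X, axis=1, keepdims=True)
print(pairwise_potential_exact(X, gamma=1.0))
print(pairwise_potential_rff(X, gamma=1.0))
```

Because the approximate potential is a function of a single sum of feature maps, its gradient can also be computed without materializing the N x N similarity matrix, which is what makes larger effective batch sizes feasible.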