EqCo:   Equivalent Rules for Self-supervised Contrastive Learning

Benjin Zhu; Junqiang Huang; Zeming Li; Xiangyu Zhang; Jian Sun

EqCo: Equivalent Rules for Self-supervised Contrastive Learning

Benjin Zhu, Junqiang Huang, Zeming Li, Xiangyu Zhang, Jian Sun

28 Sept 2020 (modified: 12 Oct 2025)ICLR 2021 Conference Blind SubmissionReaders: Everyone

Abstract: In this paper, we propose a method, named EqCo (Equivalent Rules for Contrastive Learning), to make self-supervised learning irrelevant to the number of negative samples in the contrastive learning framework. Inspired by the InfoMax principle, we point that the margin term in contrastive loss needs to be adaptively scaled according to the number of negative pairs in order to keep steady mutual information bound and gradient magnitude. EqCo bridges the performance gap among a wide range of negative sample sizes, so that for the first time, we can use only a few negative pairs (e.g. 16 per query) to perform self-supervised contrastive training on large-scale vision datasets like ImageNet, while with almost no accuracy drop. This is quite a contrast to the widely used large batch training or memory bank mechanism in current practices. Equipped with EqCo, our simplified MoCo (SiMo) achieves comparable accuracy with MoCo v2 on ImageNet (linear evaluation protocol) while only involves 16 negative pairs per query instead of 65536, suggesting that large quantities of negative samples might not be a critical factor in contrastive learning frameworks.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/eqco-equivalent-rules-for-self-supervised/code)

Reviewed Version (pdf): https://openreview.net/references/pdf?id=gQj2libhJO

10 Replies

Loading