Less can be more in contrastive learning

Published: 09 Dec 2020, Last Modified: 05 May 2023 | ICBINB 2020 Poster
Keywords: Self-supervised Learning, Contrastive Methods
TL;DR: We empirically show that using fewer negatives can boost the performance of contrastive methods, and we provide some potential explanations for this behaviour.
Abstract: Unsupervised representation learning provides an attractive alternative to its supervised counterpart because of the abundance of unlabelled data. Contrastive learning has recently emerged as one of the most successful approaches to unsupervised representation learning. Given a datapoint, contrastive learning involves discriminating between a matching, or positive, datapoint and a number of non-matching, or negative, ones. Usually the other datapoints in the batch serve as the negatives for the given datapoint. It has been shown empirically that large batch sizes are needed to achieve good performance, which led to the belief that a large number of negatives is preferable. To understand this phenomenon better, in this work we investigate the role of negatives in contrastive learning by decoupling the number of negatives from the batch size. Surprisingly, we discover that for a fixed batch size, performance actually degrades as the number of negatives is increased. We also show that using fewer negatives can lead to a better signal-to-noise ratio for the model gradients, which could explain the improved performance.
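
As a rough illustration of the setup described in the abstract, the sketch below computes an InfoNCE-style contrastive loss in which the number of negatives is decoupled from the batch size by sampling only k in-batch negatives per anchor, rather than using all other batch elements. This is not the authors' implementation; the function name info_nce_k_negatives, the two-view embeddings z1/z2, and the temperature value are illustrative assumptions.

# Hypothetical sketch: InfoNCE-style loss with k negatives sampled per anchor,
# independent of the batch size. Assumes z1, z2 are (batch, dim) embeddings of
# two augmented views of the same images, with z1[i] and z2[i] forming a positive pair.
import torch
import torch.nn.functional as F

def info_nce_k_negatives(z1, z2, k, temperature=0.1):
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    batch_size = z1.size(0)

    # Positive similarity: each anchor z1[i] with its matching view z2[i].
    pos = (z1 * z2).sum(dim=1, keepdim=True)                      # (B, 1)

    # For each anchor, sample k negatives from the other batch elements.
    neg_sims = []
    for i in range(batch_size):
        candidates = torch.cat([torch.arange(0, i), torch.arange(i + 1, batch_size)])
        idx = candidates[torch.randperm(batch_size - 1)[:k]]
        neg_sims.append(z1[i] @ z2[idx].T)                        # (k,)
    neg = torch.stack(neg_sims)                                   # (B, k)

    # Standard InfoNCE cross-entropy: the positive logit sits in column 0.
    logits = torch.cat([pos, neg], dim=1) / temperature
    labels = torch.zeros(batch_size, dtype=torch.long, device=z1.device)
    return F.cross_entropy(logits, labels)

# Example usage with random embeddings: batch size 256, but only 16 negatives per anchor.
z1, z2 = torch.randn(256, 128), torch.randn(256, 128)
loss = info_nce_k_negatives(z1, z2, k=16)

Varying k while keeping the batch size fixed, as in this sketch, is the kind of decoupling the abstract refers to when comparing performance across different numbers of negatives.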