Contrast to Divide: self-supervised pre-training for learning with noisy labels

28 Sept 2020 (modified: 22 Oct 2023) · ICLR 2021 Conference Withdrawn Submission · Readers: Everyone
Keywords: noisy labels, self-supervised learning, semi-supervised learning, label noise
Abstract: Advances in semi-supervised methods for image classification have significantly boosted performance on the learning with noisy labels (LNL) task. Specifically, by discarding the erroneous labels (and keeping the samples), the LNL task becomes a semi-supervised one for which powerful tools exist. Identifying the noisy samples, however, heavily relies on the success of a warm-up stage in which standard supervised training is performed on the full (noisy) training set. This stage is sensitive not only to the noise level but also to the choice of hyperparameters. In this paper, we propose to solve this problem by utilizing self-supervised pre-training. Our approach, which we name Contrast to Divide, offers several important advantages. First, by removing the labels altogether, our pre-trained features become agnostic to the amount of label noise, allowing accurate separation of noisy samples even under high noise levels. Second, as recently shown, semi-supervised methods significantly benefit from self-supervised pre-training. Moreover, compared with standard pre-training approaches (e.g., supervised training on ImageNet), self-supervised pre-training does not suffer from a domain gap. We demonstrate the effectiveness of the proposed method in various settings with both synthetic and real noise. Our results indicate that Contrast to Divide sets a new state-of-the-art by a significant margin on both CIFAR-10 and CIFAR-100. For example, in the high-noise regime of 90%, we obtain a boost of more than 27% for CIFAR-100 and more than 17% for CIFAR-10 over the previous state-of-the-art. Moreover, we achieve comparable performance on Clothing-1M without using ImageNet pre-training. Code for reproducing our experiments is available at https://github.com/ContrastToDivide/C2D
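As a rough illustration of the clean/noisy separation step the abstract refers to, the sketch below fits a two-component Gaussian mixture to per-sample losses and treats the posterior of the low-loss component as a "clean" probability, which is the common practice in this line of work (e.g., DivideMix-style pipelines). This is a minimal, hedged example, not the authors' exact implementation; the `losses` array and the `split_clean_noisy` helper are assumed for illustration.

```python
# Minimal sketch of the clean/noisy split described in the abstract.
# Assumption: `losses` holds per-sample cross-entropy losses computed after
# the warm-up stage; this is illustrative, not the paper's exact procedure.
import numpy as np
from sklearn.mixture import GaussianMixture

def split_clean_noisy(losses: np.ndarray, threshold: float = 0.5):
    """Fit a 2-component GMM to per-sample losses and return clean/noisy masks."""
    losses = losses.reshape(-1, 1)
    # Normalize losses to [0, 1] so the fit is insensitive to the loss scale.
    losses = (losses - losses.min()) / (losses.max() - losses.min() + 1e-8)
    gmm = GaussianMixture(n_components=2, max_iter=100, reg_covar=5e-4)
    gmm.fit(losses)
    # The component with the lower mean loss is treated as the "clean" one.
    clean_component = int(np.argmin(gmm.means_.ravel()))
    p_clean = gmm.predict_proba(losses)[:, clean_component]
    clean_mask = p_clean > threshold
    return clean_mask, ~clean_mask, p_clean

# Example usage with synthetic losses: a low-loss (clean) and a high-loss (noisy) group.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    losses = np.concatenate([rng.normal(0.2, 0.05, 800), rng.normal(1.5, 0.3, 200)])
    clean, noisy, p = split_clean_noisy(losses)
    print(f"clean: {clean.sum()}, noisy: {noisy.sum()}")
```

In this kind of pipeline, the samples flagged as clean would keep their labels, while the flagged-noisy samples would have their labels discarded and be fed to a semi-supervised learner as unlabeled data.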
One-sentence Summary: Self-supervised learning improves learning with noisy labels, especially at high noise rates
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Community Implementations: 1 code implementation (CatalyzeX): https://www.catalyzex.com/paper/arxiv:2103.13646/code
Reviewed Version (pdf): https://openreview.net/references/pdf?id=1bdGDqF5QN
