Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Contrastive Learning, Hard negative sampling, Supervised Contrastive Learning, Image Classification
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We improve Supervised Contrastive Learning with explicit hard negative sampling.
Abstract: State-of-the-art pre-trained image models predominantly follow a two-stage strategy: pre-training on large datasets and fine-tuning on a task-specific labeled dataset with a cross-entropy objective. However, many studies have shown that cross-entropy can result in sub-optimal generalization and stability. While supervised contrastive learning addresses some limitations of the cross-entropy objective by emphasizing intra-class similarities and inter-class differences, it neglects the importance of hard negative mining. We hypothesize that weighting negative samples by their dissimilarity to positives enhances the efficacy of contrastive learning. This paper introduces a new supervised contrastive learning objective, named SCHaNe, that incorporates hard negative sampling during the fine-tuning phase. Without requiring specialized architectures, additional data, or extra computational resources, SCHaNe outperforms the strong BEiT-3 baseline in Top-1 accuracy across twelve benchmarks, with significant gains of up to 3.32% in few-shot learning settings and 3.41% in full-dataset fine-tuning. Importantly, our proposed objective sets a new state-of-the-art for base-sized models (88 million parameters) on ImageNet-1k, achieving an accuracy of 86.14%. Furthermore, we demonstrate that the proposed objective yields better embeddings, which explains the improved effectiveness observed in our experiments. Our code is available at https://anonymous.4open.science/r/SCHaNe-61C6/.
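The abstract does not spell out the SCHaNe formulation, so the following is only a minimal PyTorch sketch of the general idea it describes: a supervised contrastive loss whose negatives are importance-weighted by their similarity to the anchor. The function name `schane_style_loss` and the hyperparameters `temperature` and `beta` are our illustrative choices, not the paper's API; the weighting scheme shown is one common hard-negative strategy, not necessarily the authors' exact one.

```python
import torch
import torch.nn.functional as F

def schane_style_loss(features: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 0.1,
                      beta: float = 1.0) -> torch.Tensor:
    """Supervised contrastive loss with hard-negative weighting (sketch).

    features: (N, D) batch of embeddings; L2-normalized below.
    labels:   (N,) integer class labels.
    beta:     strength of hard-negative up-weighting
              (beta = 0 recovers plain SupCon).
    """
    z = F.normalize(features, dim=1)
    n = z.size(0)
    sim = z @ z.t() / temperature                          # pairwise similarities
    not_self = ~torch.eye(n, dtype=torch.bool, device=z.device)
    pos = (labels[:, None] == labels[None, :]) & not_self  # same-class pairs
    neg = labels[:, None] != labels[None, :]               # different-class pairs

    exp_sim = torch.exp(sim)
    # Importance weights: negatives more similar to the anchor (harder
    # negatives) receive larger weight; detach so weights are not trained.
    w = torch.where(neg, torch.exp(beta * sim.detach()), torch.zeros_like(sim))
    # Rescale so the total negative mass per anchor stays |N(i)|.
    w = w * neg.sum(1, keepdim=True) / w.sum(1, keepdim=True).clamp_min(1e-12)

    # Denominator: positives plus importance-weighted negatives.
    denom = (exp_sim * pos).sum(1) + (w * exp_sim).sum(1)
    log_prob = sim - torch.log(denom.clamp_min(1e-12))[:, None]
    # Average log-probability over each anchor's positives, then over anchors.
    loss = -(log_prob * pos).sum(1) / pos.sum(1).clamp_min(1)
    return loss.mean()
```

In this sketch, increasing `beta` concentrates the denominator's negative mass on the negatives closest to the anchor, which is the "weighting negative samples by their dissimilarity to positives" intuition stated in the abstract.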
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1484