Keywords: contrastive learning, variable similarity
TL;DR: This paper enhances contrastive learning by introducing the variable similarity of augmented views.
Abstract: Contrastive learning has achieved remarkable success in self-supervised learning by pretraining a generalizable feature representation based on augmentation invariance. Most existing approaches assume that different augmented views of the same instance (i.e., the *positive pairs*) remain semantically invariant. However, augmentations of *varying extent* may introduce semantic discrepancies or even content distortion, so the conventional (pseudo) supervision derived from augmentation invariance can lead to misguided learning objectives. In this paper, we propose a novel method called Contrastive Learning with Variable Similarity (CLVS) to accurately characterize the intrinsic similarity relationships between different augmented views. Our method dynamically adjusts the similarity based on the augmentation extent, ensuring that strongly augmented views are always assigned lower similarity scores than weakly augmented ones. We provide a theoretical analysis that guarantees the effectiveness of the variable similarity in improving model generalizability. Extensive experiments demonstrate the superiority of our approach, which achieves gains of 2.1\% on ImageNet-100 and 1.4\% on ImageNet-1k over state-of-the-art methods.
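The core idea described above can be illustrated with a minimal sketch. This is not the paper's actual CLVS objective; it is a hypothetical loss in which the target similarity of each positive pair decreases monotonically with a scalar augmentation strength, so that strongly augmented views are assigned lower similarity than weakly augmented ones. The function name, the linear mapping from strength to target, and the MSE regression form are all illustrative assumptions.

```python
import numpy as np

def variable_similarity_loss(z1, z2, aug_strength):
    """Illustrative variable-similarity objective (not the paper's exact loss).

    z1, z2: (batch, dim) embeddings of two augmented views of the same images.
    aug_strength: (batch,) array in [0, 1]; 0 = weak, 1 = strong augmentation.
    """
    # L2-normalize so the inner product is the cosine similarity.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)

    # Hypothetical monotone mapping: stronger augmentation -> lower target,
    # replacing the fixed target of 1 used by invariance-based objectives.
    target_sim = 1.0 - 0.5 * aug_strength

    cos_sim = np.sum(z1 * z2, axis=1)
    # Regress each positive pair's similarity toward its strength-dependent target.
    return np.mean((cos_sim - target_sim) ** 2)
```

Under this sketch, an undistorted (weakly augmented) pair is still pulled toward full agreement, while a strongly augmented pair is only required to reach a lower similarity, avoiding the misguided supervision the abstract describes.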
Primary Area: General machine learning (supervised, unsupervised, online, active, etc.)
Submission Number: 11483