Keywords: View Selection, Contrastive Learning, Self-supervised Learning
Abstract: Contrastive self-supervised learning depends critically on stochastic augmentations to generate positive pairs, yet these augmentations provide no mechanism for guaranteeing pair quality.
The resulting low-quality positives, including false positives and trivial positives, hinder models from learning effective representations.
To address these issues, we propose View Selection via 2-Fold Indicators (VS-2FI).
It uses two indicators, one for each type of low-quality pair, to identify such pairs and then eliminates them.
To identify false positives, we introduce Semantic Consistency and approximate it by the likelihood that two views co-occur beyond chance.
To identify trivial positives, we design Alignment Level and estimate it by the minimum network depth required to align two views.
VS-2FI discards view pairs that are low in either Semantic Consistency (potential false positives) or Alignment Level (potential trivial positives) to improve the overall quality of positive pairs. Extensive experiments examine the individual and combined effects of the two indicators and demonstrate consistent gains from VS-2FI across different contrastive learning frameworks.
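The discard rule described above can be illustrated with a minimal sketch. It assumes that both indicator scores have already been computed for each view pair and that fixed thresholds are used; the function name `select_view_pairs`, the parameters `sc_min` and `al_min`, and the example values are hypothetical and are not taken from the paper, which may estimate the indicators and set the cutoffs differently.

```python
from typing import List, Sequence


def select_view_pairs(
    sc_scores: Sequence[float],
    al_scores: Sequence[float],
    sc_min: float,
    al_min: float,
) -> List[int]:
    """Return indices of view pairs kept after filtering.

    A pair is discarded if its Semantic Consistency score is below
    sc_min (potential false positive) or its Alignment Level score is
    below al_min (potential trivial positive). Both thresholds are
    illustrative assumptions, not values from the paper.
    """
    if len(sc_scores) != len(al_scores):
        raise ValueError("sc_scores and al_scores must have the same length")
    return [
        i
        for i, (sc, al) in enumerate(zip(sc_scores, al_scores))
        if sc >= sc_min and al >= al_min
    ]


# Hypothetical scores for four view pairs.
kept = select_view_pairs(
    sc_scores=[0.9, 0.1, 0.8, 0.7],
    al_scores=[3, 4, 1, 2],
    sc_min=0.5,
    al_min=2,
)
print(kept)  # [0, 3]
```

In this example, pair 1 is dropped for low Semantic Consistency and pair 2 for low Alignment Level, so only pairs 0 and 3 would be passed on to the contrastive loss.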
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 7046