Keywords: self-supervised learning, dataset imbalance, representation learning, long-tailed recognition
TL;DR: We discover that self-supervised representations are more robust to class imbalance than supervised representations and explore the underlying cause of this phenomenon.
Abstract: Self-supervised learning (SSL) learns general visual representations without the need for labels. However, large-scale unlabeled datasets in the wild often have long-tailed label distributions, a setting in which the behavior of SSL is poorly understood. We investigate SSL under dataset imbalance. First, we find that existing self-supervised representations are more robust to class imbalance than supervised representations: the performance gap between balanced and imbalanced pre-training is much smaller with SSL than with supervised learning. Second, to understand this robustness, we hypothesize that SSL learns richer features from frequent data: it may learn label-irrelevant-but-transferable features that help classify the rare classes. In contrast, supervised learning has no incentive to learn features irrelevant to the labels of frequent examples. We validate the hypothesis with semi-synthetic experiments and a theoretical analysis of a simplified setting.