Combating Representation Learning Disparity with Geometric Harmonization

Published: 21 Sept 2023, Last Modified: 02 Nov 2023, NeurIPS 2023 spotlight
Keywords: Long-tailed learning, self-supervised learning
TL;DR: We propose Geometric Harmonization, a novel method for self-supervised long-tailed learning that overcomes an intrinsic limitation of contrastive learning, namely sample-level uniformity, and progressively approaches category-level uniformity.
Abstract: Self-supervised learning (SSL), as an effective paradigm of representation learning, has achieved tremendous success on various curated datasets in diverse scenarios. Nevertheless, when facing the long-tailed distributions of real-world applications, existing methods still struggle to capture transferable and robust representations. This can be attributed to the fact that vanilla SSL methods, which pursue sample-level uniformity, easily lead to representation learning disparity: head classes with many samples dominate the feature regime, while tail classes with few samples passively collapse. To address this problem, we propose a novel Geometric Harmonization (GH) method that encourages category-level uniformity in representation learning, which benefits the minority classes while barely hurting the majority classes under long-tailed distributions. Specifically, GH measures the population statistics of the embedding space on top of self-supervised learning, and then infers a fine-grained instance-wise calibration to constrain the space expansion of head classes and avoid the passive collapse of tail classes. Our proposal does not alter the setting of SSL and can be easily integrated into existing methods in a low-cost manner. Extensive results on a range of benchmark datasets show the effectiveness of GH with high tolerance to distribution skewness.
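The abstract describes the general recipe of estimating population statistics in the embedding space and converting them into instance-wise calibration weights. The sketch below is purely illustrative and is not the paper's actual GH procedure: the prototype-based soft assignment, the helper names (estimate_populations, calibration_weights), and the inverse-population weighting are assumptions chosen to make the idea concrete.

```python
# Illustrative sketch (assumptions, not the authors' method): estimate how
# crowded each region of the embedding space is via soft assignment to
# prototypes, then reweight per-instance SSL losses so that crowded (head-like)
# regions are down-weighted and sparse (tail-like) regions are up-weighted.
import torch
import torch.nn.functional as F

def estimate_populations(z, prototypes, temperature=0.1):
    """Soft-assign L2-normalized embeddings z [N, D] to prototypes [K, D];
    return the soft assignments [N, K] and per-prototype populations [K]."""
    logits = z @ prototypes.t() / temperature   # cosine similarities as logits
    assign = logits.softmax(dim=1)              # soft cluster assignments
    return assign, assign.sum(dim=0)            # estimated population per prototype

def calibration_weights(assign, populations, eps=1e-6):
    """Instance-wise weights: expected inverse population of the clusters an
    instance belongs to, normalized to have mean 1 over the batch."""
    inv_pop = 1.0 / (populations + eps)         # [K]
    w = assign @ inv_pop                        # [N] expected inverse population
    return w * (w.numel() / w.sum())

# Toy usage with random embeddings and prototypes (hypothetical shapes).
N, D, K = 256, 128, 16
z = F.normalize(torch.randn(N, D), dim=1)
prototypes = F.normalize(torch.randn(K, D), dim=1)

assign, populations = estimate_populations(z, prototypes)
w = calibration_weights(assign, populations)

# The weights could then scale a per-instance SSL loss (stand-in values here).
per_instance_loss = torch.rand(N)
reweighted_loss = (w * per_instance_loss).mean()
print(reweighted_loss.item())
```

Under these assumptions, the reweighting approximates category-level uniformity without requiring class labels, since the population estimates come entirely from the embedding geometry.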
Supplementary Material: pdf
Submission Number: 11308