Self-Supervised Logit AdjustmentDownload PDF

Published: 01 Feb 2023, Last Modified: 13 Feb 2023Submitted to ICLR 2023Readers: Everyone
Keywords: Long-tailed learning, self-supervised learning, logit adjustment, optimal transport
Abstract: Self-supervised learning (SSL) has achieved tremendous success on various well curated datasets in computer vision and natural language processing. Nevertheless, it is hard for existing works to capture transferable and robust features, when facing the long-tailed distribution in the real-world scenarios. The attribution is that plain SSL methods to pursue sample-level uniformity easily leads to the distorted embedding space, where head classes with the huge sample number dominate the feature regime and tail classes passively collapse. To tackle this problem, we propose a novel Self-Supervised Logit Adjustment ($S^2LA$) method to achieve the category-level uniformity from a geometric perspective. Specially, we measure the geometric statistics of the embedding space to construct the calibration, and jointly learn a surrogate label allocation to constrain the space expansion of head classes and avoid the passive collapse of tail classes. Our proposal does not alter the setting of SSL and can be easily integrated into existing works in an end-to-end and low-cost manner. Extensive results on a range of benchmark datasets show the effectiveness of $S^2LA$ with high tolerance to the distribution skewness.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Unsupervised and Self-supervised learning
TL;DR: We propose a novel algorithm for self-supervised long-tailed learning, which overcomes the intrinsic limitation of the conventional contrastive learning loss, i.e., sample-level uniformity, and progressively approaches the category-level uniformity.
46 Replies

Loading