Keywords: long-tail learning, class imbalance
Abstract: Real-world classification problems typically exhibit an imbalanced or long-tailed label distribution, wherein many labels have only a few associated samples. This poses a challenge for generalisation on such labels, and also makes naive learning biased towards dominant labels. In this paper, we present a statistical framework that unifies and generalises several recent proposals to cope with these challenges. Our framework revisits the classic idea of logit adjustment based on the label frequencies, which encourages a large relative margin between logits of rare positive versus dominant negative labels. This yields two techniques for long-tail learning, where such adjustment is either applied post-hoc to a trained model, or enforced in the loss during training. These techniques are statistically grounded, and practically effective on four real-world datasets with long-tailed label distributions.
One-sentence Summary: Adjusting classifier logits based on class priors, either post-hoc or during training, can improve performance on rare classes.
Supplementary Material: zip
Code: [google-research/google-research](https://github.com/google-research/google-research/tree/master/logit_adjustment) + [2 community implementations](https://paperswithcode.com/paper/?openreview=37nvvqkCo5)
Data: [CIFAR100-LT](https://paperswithcode.com/dataset/cifar100-lt), [ImageNet](https://paperswithcode.com/dataset/imagenet), [ImageNet-LT](https://paperswithcode.com/dataset/imagenet-lt)
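
Illustrative sketch: The abstract describes two ways of adjusting logits by label frequency, one applied post-hoc to a trained model and one enforced in the loss during training. The NumPy sketch below shows one way these could look. It assumes the adjustment is an additive offset of `tau * log(prior)` per class, with a scaling parameter `tau`. The function names, the parameter `tau`, and the toy data are hypothetical choices for this illustration and are not taken from the linked repository.

```python
import numpy as np


def class_priors(labels, num_classes):
    """Empirical label frequencies; assumes every class appears at least once."""
    counts = np.bincount(labels, minlength=num_classes).astype(np.float64)
    return counts / counts.sum()


def posthoc_adjust(logits, priors, tau=1.0):
    """Post-hoc variant: subtract tau * log(prior) from the logits of a trained model."""
    return logits - tau * np.log(priors)


def logit_adjusted_loss(logits, labels, priors, tau=1.0):
    """Training-time variant: softmax cross-entropy on logits offset by tau * log(prior)."""
    adjusted = logits + tau * np.log(priors)
    # Shift by the row maximum for numerical stability before the log-sum-exp.
    adjusted = adjusted - adjusted.max(axis=1, keepdims=True)
    log_probs = adjusted - np.log(np.exp(adjusted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


# Toy long-tailed label set: 90 / 9 / 1 samples for classes 0 / 1 / 2.
train_labels = np.array([0] * 90 + [1] * 9 + [2] * 1)
priors = class_priors(train_labels, num_classes=3)

# One example whose raw logits favour the dominant class.
logits = np.array([[2.0, 1.5, -2.0]])
print("raw prediction:", logits.argmax(axis=1))
print("adjusted prediction:", posthoc_adjust(logits, priors).argmax(axis=1))
print("adjusted loss for label 1:", logit_adjusted_loss(logits, np.array([1]), priors))
```

With `tau=0` both functions reduce to the unadjusted argmax and the standard softmax cross-entropy, so `tau` acts as a dial for how strongly rare classes are favoured.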