Keywords: long-tailed recognition, network calibration, label-aware smoothing, mixup, dataset bias
Abstract: Deep neural networks often perform poorly when the training dataset is heavily class-imbalanced. Recently, two-stage methods have greatly improved performance by decoupling representation learning from classifier learning. In this paper, we discover that networks trained on long-tailed datasets are more prone to miscalibration and over-confidence, and that two-stage models suffer from the same issue. We design two novel methods to improve calibration and performance in such scenarios. Motivated by the observation that the predicted probability distributions of classes are highly related to the numbers of class instances, we propose label-aware smoothing to deal with the different degrees of over-confidence across classes and to improve classifier learning. Noting that a dataset bias arises between the two stages because they use different samplers, we further propose shifted batch normalization to address this bias in the decoupling framework. Through extensive experiments, we also observe that mixup can remedy over-confidence and improve representation learning, but has a negative or negligible effect on classifier learning. Our proposed methods set new records on multiple popular long-tailed recognition benchmarks, including CIFAR-10-LT, CIFAR-100-LT, ImageNet-LT, Places-LT, and iNaturalist 2018.
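To make the core idea concrete, below is a minimal PyTorch sketch of label-aware smoothing, assuming the per-class smoothing factor is interpolated linearly from the class instance counts so that frequent (head) classes, which the abstract identifies as most over-confident, receive stronger smoothing. The function name, the `eps_head`/`eps_tail` parameters, and the linear schedule are illustrative assumptions, not the authors' exact formulation.

```python
# Hedged sketch of a label-aware smoothing loss (hypothetical names and schedule).
import torch
import torch.nn.functional as F

def label_aware_smoothing_loss(logits, targets, class_counts,
                               eps_head=0.3, eps_tail=0.0):
    """logits: (B, K); targets: (B,) int64; class_counts: (K,) instances per class."""
    counts = class_counts.float()
    # Map counts to [0, 1]; the most frequent class maps to 1.
    rel = (counts - counts.min()) / (counts.max() - counts.min() + 1e-12)
    # Per-class smoothing factor: head classes get eps_head, tail classes eps_tail.
    eps = eps_tail + (eps_head - eps_tail) * rel             # shape (K,)
    num_classes = logits.size(1)
    eps_y = eps[targets].unsqueeze(1)                        # shape (B, 1)
    # Smoothed targets: 1 - eps_y on the true class, eps_y spread over the rest.
    smooth = torch.ones_like(logits) * (eps_y / (num_classes - 1))
    smooth.scatter_(1, targets.unsqueeze(1), 1.0 - eps_y)
    return -(smooth * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
```

In the decoupling framework described in the abstract, such a loss would replace standard cross-entropy in the second (classifier-learning) stage, while the first stage keeps mixup for representation learning.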
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Community Implementations: 1 code implementation (https://www.catalyzex.com/paper/arxiv:2104.00466/code)
Reviewed Version (pdf): https://openreview.net/references/pdf?id=-DDF9ymwnI