Open Peer Review. Open Publishing. Open Access. Open Discussion. Open Directory. Open Recommendations. Open API. Open Source.
Bayesian Embeddings for Long-Tailed Datasets
Nov 07, 2017 (modified: Nov 07, 2017)ICLR 2018 Conference Blind Submissionreaders: everyoneShow Bibtex
Abstract:The statistics of the real visual world presents a long-tailed distribution: a few classes have significantly more training instances than the remaining classes in a dataset. This is because the real visual world has a few classes that are common while others are rare. Unfortunately, the performance of a convolutional neural network is typically unsatisfactory when trained using a long-tailed dataset. To alleviate this issue, we propose a method that discriminatively learns an embedding in which a simple Bayesian classifier can balance the class-priors to generalize well for rare classes. To this end, the proposed approach uses a Gaussian mixture model to factor out class-likelihoods and class-priors in a long-tailed dataset. The proposed method is simple and easy-to-implement in existing deep learning frameworks. Experiments on publicly available datasets show that the proposed approach improves the performance on classes with few training instances, while maintaining a comparable performance to the state-of-the-art on classes with abundant training examples.
TL;DR:Approach to improve classification accuracy on classes in the tail.
Keywords:Long-tail datasets, Imbalanced datasets
Enter your feedback below and we'll get back to you as soon as possible.