Keywords: long-tail learning, curriculum learning, LLMs in vision
TL;DR: Improving performance on long-tail classes by leveraging LLMs to build curricula
Abstract: Real-world datasets often exhibit class imbalance and follow a long-tail distribution, in contrast to curated datasets such as CIFAR-10/100 and MNIST. Learning from long-tail distributed datasets is challenging because the tail classes have few representative samples, making it difficult for the model to learn robust representations. We posit that curriculum learning offers a viable route to iteratively learn models that better capture predictive signals for rare classes. We propose a simple method that leverages label hierarchies to craft training curricula.
Since label hierarchy trees are typically unavailable for real-world datasets, and manually creating a hierarchy is tedious and expensive, we show that LLMs can compose semantic information about the labels and generate label hierarchies to serve as curricula. We perform a thorough empirical evaluation across different model architectures (ResNet, ViT, and ConvNeXt) and multiple datasets (ImageNet, Places365-LT, iNaturalist, etc.), showing that LLMs generate meaningful hierarchies. Our method improves performance on the long-tail classes and achieves state-of-the-art results on multiple large-scale datasets.
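A minimal sketch of how an LLM-generated label hierarchy could drive such a curriculum (the prompt wording, toy hierarchy, and helper names below are illustrative assumptions, not the paper's actual implementation):

```python
# Illustrative sketch: turning an LLM-generated label hierarchy into a
# two-stage curriculum. Prompt format and names are assumptions for this
# example only.

def hierarchy_prompt(labels):
    """Build a prompt asking an LLM to group fine labels under superclasses."""
    return (
        "Group the following class labels into semantically related "
        "superclasses. Answer with one 'superclass: label, label, ...' "
        "line per group.\n" + ", ".join(labels)
    )

# Toy example of the structure the LLM's answer would be parsed into.
hierarchy = {
    "dog": ["beagle", "borzoi", "whippet"],
    "bird": ["robin", "finch", "kestrel"],
}

def curriculum_stages(hierarchy):
    """Return per-stage target maps: coarse superclasses first, then fine labels.

    Training first on coarse targets pools the scarce tail-class samples with
    their semantic neighbors before the model specializes to fine labels.
    """
    coarse = {leaf: sup for sup, leaves in hierarchy.items() for leaf in leaves}
    fine = {leaf: leaf for leaves in hierarchy.values() for leaf in leaves}
    return [coarse, fine]

if __name__ == "__main__":
    labels = [leaf for leaves in hierarchy.values() for leaf in leaves]
    print(hierarchy_prompt(labels))
    for i, stage in enumerate(curriculum_stages(hierarchy), start=1):
        print(f"Stage {i} targets: {stage}")
```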
Submission Number: 17