Unlocking the Power of Diffusion Models to Rescue Data Insufficiency for Long-Tailed Recognition

19 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Desk Rejected Submission
Keywords: Diffusion model, Long-tailed learning
TL;DR: Unlocking the Power of Diffusion Models to Rescue Data Insufficiency for Long-Tailed Recognition
Abstract: Long-tailed learning addresses the challenge that head classes dominate training under the severe class imbalance common in real-world scenarios. Generating diverse samples for tail categories is considered a key factor in addressing long-tailed recognition. While diffusion models have demonstrated their efficacy in generating high-quality images, adapting them to long-tailed domains for data augmentation remains an open challenge, for two reasons: generated samples become homogenized because tail-class generative distributions lack diversity during diffusion training, and they carry limited semantic information because classifier boundaries are biased during diffusion sampling. To overcome these challenges, we present DiffRC, a diffusion model-based data augmentation framework designed to generate diverse synthetic samples for tail categories and thereby enhance overall classification performance. Building on the properties of the long-tailed problem, we extract rich generative distribution knowledge from head categories and match the pairwise sample diversity between head and tail categories, enabling the target diffusion model to learn diverse generation that preserves inter-sample variation during diffusion training. Moreover, we incorporate modified feature prototypes, which encode essential semantic information, to guide the sampling procedure and circumvent biased classifier predictions during diffusion sampling. Our approach surpasses previous data augmentation techniques for long-tailed learning by a considerable margin and achieves state-of-the-art performance.
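For illustration only, below is a minimal PyTorch-style sketch of the two ideas the abstract describes: a diversity-matching regularizer that pulls the pairwise diversity of generated tail-class samples toward that of a head class, and a prototype-guided denoising step that steers sampling with a class feature prototype rather than a biased classifier. This is an assumed reconstruction, not the authors' released code; the names `pairwise_diversity`, `feature_extractor`, `prototype`, and `guidance_scale` are hypothetical placeholders.

```python
# Hypothetical sketch of the two components described in the abstract.
# Not the authors' implementation; all module names are placeholders.

import torch
import torch.nn.functional as F


def pairwise_diversity(feats: torch.Tensor) -> torch.Tensor:
    """Mean pairwise cosine distance within a batch of feature vectors (B, D)."""
    feats = F.normalize(feats, dim=-1)
    sim = feats @ feats.t()                                  # (B, B) cosine similarities
    off_diag = sim - torch.eye(len(feats), device=feats.device)
    n = len(feats)
    return 1.0 - off_diag.sum() / (n * (n - 1))              # 1 - mean off-diagonal similarity


def diversity_matching_loss(tail_feats: torch.Tensor,
                            head_feats: torch.Tensor) -> torch.Tensor:
    """Push the diversity of generated tail-class features toward a head class's diversity."""
    return (pairwise_diversity(tail_feats)
            - pairwise_diversity(head_feats).detach()).pow(2)


@torch.no_grad()
def prototype_guided_step(x_t: torch.Tensor,
                          eps_pred: torch.Tensor,
                          feature_extractor,
                          prototype: torch.Tensor,
                          guidance_scale: float = 1.0) -> torch.Tensor:
    """Adjust the predicted noise so the sample's features move toward a class
    feature prototype (classifier-guidance style, but without a biased classifier)."""
    with torch.enable_grad():
        x = x_t.detach().requires_grad_(True)
        feats = F.normalize(feature_extractor(x), dim=-1)
        score = (feats * F.normalize(prototype, dim=-1)).sum()
        grad = torch.autograd.grad(score, x)[0]
    return eps_pred - guidance_scale * grad
```

The guided step follows the usual classifier-guidance convention (subtracting the gradient of a log-likelihood-like score from the predicted noise); how DiffRC actually modifies the prototypes and schedules the guidance is not specified in the abstract.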
Primary Area: visualization or interpretation of learned representations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1884