Learning Via Imagination: Controlled Diffusion Image Augmentation

Published: 10 Oct 2024, Last Modified: 25 Dec 2024 · NeurIPS'24 Compositional Learning Workshop Poster · CC BY 4.0
Keywords: Diffusion Models, Image Classification
TL;DR: We use controlled diffusion to "imagine" variations of images for more robust few-shot performance
Abstract: While synthetic data generated through diffusion models has been shown to improve task performance, current approaches face two key challenges: the high cost of fine-tuning diffusion models for specific datasets, and the domain gap between real and synthetic data, which limits utility in fine-grained classification. To address these issues, we propose CDaug, a novel compositional approach to data augmentation using controlled diffusion. Instead of generating entirely new images, CDaug conditions generated images on existing data in a self-supervised manner, akin to how humans use imagination to compose new scenarios from existing concepts. This leverages the compositionality of learned representations to infuse meaningful variations. Our pipeline utilizes ControlNet, conditioned on the original data and on captions generated by the multi-modal LLM LLaVA2, to guide the generative process. By recombining the underlying structure and semantic priors of the data, CDaug achieves high-quality augmentations without fine-tuning. Using open-source models, our modular approach demonstrates improved performance across seven fine-grained datasets in both few-shot and full-dataset settings, showing promise for compositional generalization in fine-grained environments.
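To make the pipeline described in the abstract concrete, below is a minimal, hypothetical sketch of one augmentation step using the Hugging Face diffusers library. It is not the authors' code: it assumes a Canny-edge ControlNet as the structural condition and takes the caption as a plain string (in the paper the caption would come from LLaVA); model names, the `augment` helper, and its parameters are illustrative choices.

```python
# Hypothetical sketch of a ControlNet-based augmentation step (not the authors' code).
# Assumes a Canny-edge ControlNet and Stable Diffusion v1.5 from the `diffusers` library;
# the caption is assumed to be produced beforehand by a multi-modal LLM such as LLaVA.
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline


def augment(image_path: str, caption: str, n_variants: int = 4):
    # Extract a structural condition (edge map) from the original training image.
    image = np.array(Image.open(image_path).convert("RGB"))
    edges = cv2.Canny(image, 100, 200)
    condition = Image.fromarray(np.stack([edges] * 3, axis=-1))

    # Load pretrained models only; no dataset-specific fine-tuning is performed.
    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        controlnet=controlnet,
        torch_dtype=torch.float16,
    ).to("cuda")

    # Generation is conditioned on both structure (edge map) and semantics (caption),
    # yielding variations that preserve the original layout of the image.
    return pipe([caption] * n_variants, image=condition).images


# Example usage (file name and caption are placeholders):
# variants = augment("bird_001.jpg", "a photo of a painted bunting perched on a branch")
```

The key design point reflected here is that the original image only supplies a structural prior, while the caption supplies the semantic prior, so recombining them produces label-preserving variations without retraining the diffusion model.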
Submission Number: 41