Keywords: Diffusion Models, Image Classification, Augmentation
TL;DR: We use controlled diffusion models to generate new high-quality input-output pairs without additional supervision.
Abstract: While synthetic data generated by diffusion models has been shown to improve performance across various tasks, existing approaches face two challenges: fine-tuning a diffusion model for a specific dataset is often expensive, and the domain gap between real and synthetic data limits the usefulness of synthetic data, especially in fine-grained classification settings. To mitigate these shortcomings, we develop CDaug, a novel approach to data augmentation based on controlled diffusion. Instead of using diffusion models to generate wholly new images, we take a self-supervised approach and condition the generated images on existing data, allowing us to create high-quality synthetic augmentations that capture the semantic priors and underlying structure of the data while introducing meaningful, novel variations with no human intervention. Our pipeline uses ControlNet, conditioned on the original images, together with captions generated by the multimodal LLM LLaVA2 to guide the generative process. Our approach relies only on open-source models, requires no fine-tuning, and is modular. We demonstrate improved performance across 7 fine-grained datasets, in both few-shot and full-dataset settings, and across multiple architectures.
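The abstract describes the pipeline only at a high level. The sketch below shows one plausible realization using the Hugging Face diffusers ControlNet API; the specific checkpoints, the Canny-edge conditioning signal, and the `caption_image` stub are illustrative assumptions standing in for the paper's ControlNet conditioning and LLaVA2 captioner, not the authors' released implementation.

```python
# Minimal sketch of a ControlNet-based augmentation step (assumed components, not the paper's code).
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline


def caption_image(image: Image.Image) -> str:
    # Placeholder for the multimodal-LLM captioner (LLaVA2 in the paper).
    # Any image-to-text model returning a short description could slot in here.
    return "a photo of a small bird perched on a branch"


def make_augmentation(image: Image.Image, seed: int = 0) -> Image.Image:
    # Derive a structural condition from the original image (here: Canny edges),
    # so the generated sample preserves the layout of the real example.
    edges = cv2.Canny(np.array(image), 100, 200)
    condition = Image.fromarray(np.stack([edges] * 3, axis=-1))

    # Off-the-shelf, open-source models; no fine-tuning is performed.
    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        controlnet=controlnet,
        torch_dtype=torch.float16,
    ).to("cuda")

    # The caption guides the semantics; the edge map constrains the structure.
    prompt = caption_image(image)
    generator = torch.Generator("cuda").manual_seed(seed)
    out = pipe(prompt, image=condition, num_inference_steps=30, generator=generator)
    return out.images[0]
```

Because the synthetic image inherits the structure of a labeled original, it can reuse that original's label, yielding a new input-output pair without extra supervision.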
Submission Number: 51