Abstract: Data augmentation is crucial in training deep models, preventing them from overfitting to limited data. Recent advances in generative AI, e.g., diffusion models, have enabled more sophisticated augmentation techniques that produce data resembling natural images. We introduce $\texttt{GeNIe}$, a novel augmentation method that leverages a latent diffusion model conditioned on a text prompt to combine two contrasting data points (an image from the source category and a text prompt from the target category) into challenging augmentations. To achieve this, we adjust the noise level (equivalently, the number of diffusion iterations) so that the generated image retains low-level and background features from the source image while representing the target category, resulting in a hard negative sample for the source category. We further automate and enhance $\texttt{GeNIe}$ by adaptively selecting the noise level on a per-image basis (coined $\texttt{GeNIe-Ada}$), leading to further performance improvements. Our extensive experiments, in both few-shot and long-tail distribution settings, demonstrate the effectiveness of our novel augmentation method and its superior performance over the prior art. Our code is available at https://github.com/UCDvision/GeNIe.
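The core mechanism described in the abstract, running an image-to-image diffusion pass at an intermediate noise level with a target-category prompt, can be sketched with the Hugging Face diffusers img2img pipeline. This is a minimal illustration, not the authors' implementation: the pipeline class and its `strength` argument are real diffusers APIs, but the checkpoint name, file paths, and the strength value 0.6 are illustrative assumptions.

```python
# Minimal sketch of the GeNIe idea with diffusers' img2img pipeline.
# Assumptions (not from the paper): checkpoint, paths, strength=0.6.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed checkpoint
    torch_dtype=torch.float16,
).to("cuda")

source_image = Image.open("source_category_image.jpg").convert("RGB")

# The prompt names the *target* category; the source image supplies
# low-level and background structure. `strength` in [0, 1] controls how
# much noise is added (equivalently, how many diffusion steps are run):
# low strength preserves the source appearance, high strength follows
# the prompt. An intermediate value yields an image that keeps source
# features yet depicts the target category -- a hard negative for the
# source category.
augmented = pipe(
    prompt="a photo of a <target category>",
    image=source_image,
    strength=0.6,  # intermediate noise level; illustrative value
    guidance_scale=7.5,
).images[0]

augmented.save("hard_negative.png")
```

GeNIe-Ada additionally selects this noise level adaptively per image; the selection criterion is not specified in the abstract, so it is omitted here.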
Submission Length: Long submission (more than 12 pages of main content)
Code: https://github.com/UCDvision/GeNIe
Supplementary Material: zip
Assigned Action Editor: ~Arash_Mehrjou1
Submission Number: 4117