Keywords: score augmentation, diffusion models
Abstract: Diffusion models have recently achieved remarkable advances in generative modeling, yet we show that they are still prone to overfitting, especially when trained with limited data. To address this issue, we introduce Score Augmentation (ScoreAug), a data augmentation framework tailored for training diffusion models. Unlike conventional methods that augment clean data, ScoreAug operates directly on noisy data, naturally aligning with the denoising process of diffusion models. Moreover, the denoiser is required to predict the transformed target of the original signal, establishing an equivariant learning objective. This equivariance enables learning of scores across diverse denoising spaces -- a principle we call score augmentation. We provide theoretical analysis of score consistency under general transformations, and empirically validate ScoreAug across CIFAR-10, FFHQ, AFHQv2, and ImageNet, with U-Net and DiT backbones. Results show consistent performance improvements over baselines, effective mitigation of overfitting under varying data scales and model capacities, and stable convergence. Beyond improved generalization, ScoreAug avoids potential data leakage in certain scenarios and can be seamlessly combined with standard augmentation strategies for further gains.
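Illustration (not from the submission): the training objective described in the abstract can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The additive noise model `x0 + sigma * eps`, the horizontal flip used as the transformation, the mean-squared-error loss, and the names `hflip` and `scoreaug_loss` are all hypothetical choices made for illustration.

```python
import numpy as np


def hflip(x):
    # Hypothetical transformation T: horizontal flip along the last (width) axis.
    return x[..., ::-1]


def scoreaug_loss(denoiser, x0, sigma, rng, transform=hflip):
    """Toy loss: augment the noisy sample, regress onto the transformed clean signal."""
    eps = rng.standard_normal(x0.shape)
    x_noisy = x0 + sigma * eps      # assumed noise model: additive Gaussian noise
    x_aug = transform(x_noisy)      # augmentation is applied to the noisy data
    target = transform(x0)          # target is the transformed original signal
    pred = denoiser(x_aug, sigma)   # denoiser sees only the augmented noisy input
    return float(np.mean((pred - target) ** 2))


# Usage with a placeholder denoiser that simply returns its input.
rng = np.random.default_rng(0)
x0 = rng.standard_normal((2, 3, 4, 4))  # batch of toy "images" (N, C, H, W)
loss = scoreaug_loss(lambda x, s: x, x0, sigma=1.0, rng=rng)
```

In contrast to conventional augmentation, which would transform `x0` before adding noise, the sketch transforms the already-noised sample and asks the denoiser to recover the transformed clean signal.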
Supplementary Material: zip
Primary Area: generative models
Submission Number: 7551