Don't Drop Your Samples! Coherence-Aware Training Benefits Conditional Diffusion

Published: 2024, Last Modified: 07 May 2026 | CVPR 2024 | CC BY-SA 4.0
Abstract: Conditional diffusion models are powerful generative models that can leverage various types of conditional information, such as class labels, segmentation masks, or text captions. However, in many real-world scenarios, conditional information may be noisy or unreliable due to human annotation errors or weak alignment. In this paper, we propose the Coherence-Aware Diffusion (CAD), a novel method that integrates coherence in conditional information into diffusion models, allowing them to learn from noisy annotations without discarding data. We assume that each data point has an associated coherence score that reflects the quality of the conditional information. We then condition the diffusion model on both the conditional information and the coherence score. In this way, the model learns to ignore or discount the conditioning when the coherence is low. We show that CAD is theoretically sound and empirically effective on various conditional generation tasks. Moreover, we show that leveraging coherence generates realistic and diverse samples that respect conditional information better than models trained on cleaned datasets where samples with low coherence have been discarded. Code and weights here.
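
To make the conditioning idea concrete, the following is a minimal sketch of a noise-prediction training loss in which every sample is kept and its coherence score is passed to the denoiser alongside the label. It is not the authors' implementation: the linear denoiser, the concatenation of the score to a label embedding, the cosine-style noise schedule, and all names (`build_condition`, `add_noise`, `denoiser`, `training_loss`) are assumptions made for illustration, since the abstract does not specify how the score is injected.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_CLASSES, DATA_DIM, EMB_DIM = 10, 8, 4

# Hypothetical toy parameters (assumption: a linear denoiser, not the paper's network).
label_table = rng.normal(size=(NUM_CLASSES, EMB_DIM))
W = rng.normal(scale=0.1, size=(DATA_DIM + EMB_DIM + 2, DATA_DIM))


def build_condition(labels, coherence):
    """Concatenate a label embedding with a coherence score clipped to [0, 1]."""
    score = np.clip(coherence, 0.0, 1.0)[:, None]
    return np.concatenate([label_table[labels], score], axis=1)


def add_noise(x0, t, eps):
    """Forward diffusion with an assumed cosine-style schedule, t in [0, 1]."""
    alpha = np.cos(0.5 * np.pi * t)[:, None]
    sigma = np.sin(0.5 * np.pi * t)[:, None]
    return alpha * x0 + sigma * eps


def denoiser(x_t, t, cond):
    """Predict the noise from the noisy sample, the timestep, and the condition."""
    inputs = np.concatenate([x_t, t[:, None], cond], axis=1)
    return inputs @ W


def training_loss(x0, labels, coherence):
    """Mean squared noise-prediction error over the whole batch.

    Low-coherence samples are not filtered out; they contribute to the loss
    with their score visible to the denoiser.
    """
    t = rng.uniform(size=x0.shape[0])
    eps = rng.normal(size=x0.shape)
    x_t = add_noise(x0, t, eps)
    eps_hat = denoiser(x_t, t, build_condition(labels, coherence))
    return float(np.mean((eps_hat - eps) ** 2))


# Usage: a batch mixing reliable and unreliable annotations.
x0 = rng.normal(size=(4, DATA_DIM))
labels = np.array([1, 3, 3, 7])
coherence = np.array([0.95, 0.10, 0.60, 0.00])
loss = training_loss(x0, labels, coherence)
```

In this sketch the only difference from an ordinary class-conditional loss is the extra score column in the condition vector, which is what would let a trained model discount the label when the score is low.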