Track: long paper (up to 8 pages)
Keywords: Diffusion Models, Image Quality, Likelihood Estimation, Mixture of Experts, Likelihood-Quality Trade-off
TL;DR: We propose a Mixture-of-Experts approach to mitigate the trade-off between image quality and likelihood in diffusion models by merging two pretrained models. Our method outperforms both experts and effectively breaks the trade-off on two benchmarks.
Abstract: Diffusion models have recently emerged as powerful generative models capable of producing highly realistic images. Despite their success, a persistent challenge remains: models that generate high-quality samples often assign poor likelihoods to data, and vice versa. This trade-off arises because perceptual quality depends more on modeling high-noise regions, while likelihood is dominated by sensitivity to low-level image statistics. In this work, we propose a simple yet effective method to overcome this trade-off by merging two pretrained diffusion experts, one focused on perceptual quality and the other on likelihood, within a Mixture-of-Experts framework. Our approach applies the image-quality expert at high noise levels and the likelihood expert at low noise levels. Empirically, our merged model consistently improves over both experts: on CIFAR-10, it achieves better likelihood and sample quality than either baseline. On ImageNet32, it matches the likelihood of the likelihood expert while surpassing the image-quality expert in FID, effectively breaking the likelihood–quality trade-off in diffusion models.
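The routing rule described in the abstract — image-quality expert at high noise, likelihood expert at low noise — can be sketched as a simple noise-level switch. This is a minimal illustrative sketch, not the authors' implementation; the expert callables and the threshold `t_switch` are assumptions.

```python
def merged_score(x, t, quality_expert, likelihood_expert, t_switch=0.5):
    """Route the denoising call to one of two pretrained experts by noise level t.

    High-noise steps (t >= t_switch) use the perceptual-quality expert;
    low-noise steps use the likelihood expert. The threshold value is an
    illustrative assumption, not taken from the paper.
    """
    if t >= t_switch:
        return quality_expert(x, t)   # high noise: shapes perceptual quality
    return likelihood_expert(x, t)    # low noise: dominates likelihood
```

For example, during sampling one would call `merged_score(x, t, ...)` at every denoising step in place of a single model's score function.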
Submission Number: 123