Seeing the Unseen: How EMoE Unveils Bias in Text-to-Image Diffusion Models

15 Sept 2025 (modified: 17 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Diffusion Models, Uncertainty Quantification, Text-to-Image
Abstract: Estimating uncertainty in text-to-image diffusion models is challenging due to their massive parameter counts (often exceeding 100M) and their operation in complex, high-dimensional spaces with virtually unbounded input domains. We introduce Epistemic Mixture of Experts (EMoE), a framework for efficiently estimating epistemic uncertainty in diffusion models. EMoE leverages pre-trained networks without requiring additional training, enabling direct uncertainty estimation from a prompt. By probing a latent space within the diffusion process, EMoE captures epistemic uncertainty more effectively than existing approaches. Experiments on the COCO dataset demonstrate EMoE's superior performance over these approaches. Beyond benchmark gains, EMoE highlights under-sampled languages and geographic regions associated with elevated uncertainty, uncovering hidden biases in training data. Since the training data for online diffusion models is rarely made public, this bias-detection capability is especially valuable. Together, these contributions position EMoE as a practical tool for addressing data imbalance and improving inclusivity in AI-generated content.
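The abstract does not spell out how expert disagreement is turned into an uncertainty score, but a common ensemble-style estimator treats epistemic uncertainty as the variance across expert predictions on a shared latent. The sketch below is a hypothetical simplification under that assumption (the function name `epistemic_uncertainty` and the flat latent representation are illustrative, not EMoE's actual interface):

```python
import numpy as np

def epistemic_uncertainty(expert_outputs):
    """Scalar epistemic-uncertainty score from expert disagreement.

    expert_outputs: array-like of shape (n_experts, latent_dim),
    e.g. each expert's prediction for the same diffusion latent.
    Returns the mean over latent dimensions of the across-expert
    variance (a hypothetical stand-in for EMoE's estimator).
    """
    expert_outputs = np.asarray(expert_outputs, dtype=float)
    # Variance across experts at each latent coordinate measures disagreement
    per_dim_var = expert_outputs.var(axis=0)
    return float(per_dim_var.mean())

# Agreeing experts -> zero epistemic uncertainty
print(epistemic_uncertainty([[1.0, 2.0], [1.0, 2.0]]))  # 0.0
# Disagreeing experts -> positive uncertainty
print(epistemic_uncertainty([[0.0, 0.0], [2.0, 2.0]]))  # 1.0
```

Under this reading, a high score for prompts in an under-sampled language would signal that the experts have not converged on that region of the input domain, which is the bias-detection signal the abstract describes.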
Supplementary Material: zip
Primary Area: generative models
Submission Number: 6336