Hallucination as an Upper Bound: A New Perspective on Text-to-Image Evaluation

Published: 24 Sept 2025, Last Modified: 07 Nov 2025
Venue: NeurIPS 2025 Workshop GenProC
License: CC BY 4.0
Track: Creative demo
Keywords: Hallucination, Text-to-image, Evaluation metrics, Prompt alignment, Object hallucination, Attribute hallucination, Relation hallucination, Model bias, Generative evaluation
TL;DR: We define hallucination in text-to-image diffusion models as a complementary upper-bound evaluation dimension—capturing unintended objects, attributes, or relations introduced beyond the prompt.
Abstract: In language and vision–language models, hallucination is broadly understood as content generated from a model’s prior knowledge or biases rather than from the given input. While this phenomenon has been studied in those domains, it has not been clearly framed for text-to-image (T2I) generative models. Existing evaluations focus mainly on alignment, checking whether prompt-specified elements appear, but overlook what the model generates beyond the prompt. We argue for defining hallucination in T2I as bias-driven deviations from the prompt and propose a taxonomy with three categories: attribute, relation, and object hallucinations. Whereas alignment bounds evaluation from below (is everything the prompt asks for present?), hallucination bounds it from above (is anything introduced that the prompt does not ask for?). This framing surfaces hidden biases and provides a foundation for richer assessment of T2I models.
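As a rough illustration of the upper-bound framing described above, the sketch below scores hallucination as the fraction of detected elements (objects, attributes, relations) that are not licensed by the prompt, complementing an alignment score that checks prompt coverage. This is not the paper's method; all names (`Element`, `hallucination_score`, `alignment_score`) and the set-based formulation are hypothetical, and it assumes prompt and image elements have already been extracted (e.g. by a parser and an open-vocabulary detector).

```python
# Minimal sketch of complementary lower/upper-bound scores, assuming we already
# have (a) the set of elements the prompt specifies and (b) the set of elements
# detected in the generated image. Names and structure are illustrative only.
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    kind: str   # "object", "attribute", or "relation"
    value: str  # e.g. "dog", "red ball", "dog on grass"


def alignment_score(prompt_elements: set[Element],
                    detected_elements: set[Element]) -> float:
    """Lower bound: fraction of prompt-specified elements that appear (1.0 = all)."""
    if not prompt_elements:
        return 1.0
    return len(prompt_elements & detected_elements) / len(prompt_elements)


def hallucination_score(prompt_elements: set[Element],
                        detected_elements: set[Element]) -> float:
    """Upper bound: fraction of detected elements not licensed by the prompt (0.0 = none)."""
    if not detected_elements:
        return 0.0
    unintended = detected_elements - prompt_elements
    return len(unintended) / len(detected_elements)


if __name__ == "__main__":
    prompt = {Element("object", "dog"), Element("attribute", "red ball")}
    detected = {Element("object", "dog"), Element("attribute", "red ball"),
                Element("object", "frisbee")}  # "frisbee" is an unintended (hallucinated) object
    print(alignment_score(prompt, detected))      # 1.0  -> prompt fully covered
    print(hallucination_score(prompt, detected))  # 0.33 -> one of three elements is extra
```

Under this toy formulation, a generation can be perfectly aligned yet still hallucinate, which is exactly why the two scores are complementary rather than redundant.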
Submission Number: 41