AIGCs Confuse AI Too: Investigating and Explaining Synthetic Image-induced Hallucinations in Large Vision-Language Models
Abstract: The evolution of Artificial Intelligence Generated Contents (AIGCs) is advancing toward higher quality. The growing interaction with AIGCs presents a new challenge to the data-driven AI community: while AI-generated content has played a crucial role in a wide range of AI models, the hidden risks it introduces have not been thoroughly examined. Beyond human-oriented forgery detection, AI-generated content poses potential issues for AI models originally designed to process natural data. In this study, we underscore the exacerbated hallucination phenomena in Large Vision-Language Models (LVLMs) caused by AI-synthesized images. Remarkably, our findings shed light on a consistent AIGC hallucination bias: the object hallucinations induced by synthetic images are characterized by greater quantity and a more uniform positional distribution, even when these synthetic images do not manifest unrealistic or additional relevant visual features compared to natural images. Moreover, our investigations of the Q-Former and the linear projector reveal that synthetic images may exhibit token deviations after visual projection, thereby amplifying the hallucination bias.
Primary Subject Area: [Content] Vision and Language
Relevance To Conference: Despite the prosperity of generative models, the risks and challenges posed by AIGC cannot be overlooked. This paper pioneers an exploration of the impact of synthetic images on hallucination problems during the reasoning process of Large Vision-Language Models. Extensive experimental results confirm a significant deviation between synthetic-image-induced and natural-image-induced hallucinations, referred to as the hallucination bias.
Supplementary Material: zip
Submission Number: 4131