How Do Humans Process AI-Generated Hallucination Content: A Neuroimaging Study

ICLR 2026 Conference Submission 19028 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: hallucinations, brain signals, neuroscience, multimodal large language model
Abstract: Hallucinations produced by multimodal large language models (MLLMs) pose considerable risks, as it remains unclear to what extent humans can accurately recognize them. To address this issue, this paper explores humans' neural responses to such hallucinated content across varying time scales. We record EEG from 27 participants while they view content generated by a multimodal large language model that either does or does not include hallucination words, and judge whether each description matches an image. The collected EEG data are analyzed using averaged event-related potentials (ERPs) for hallucination vs. non-hallucination words. Results suggest that multiple cognitive processes, e.g., semantic integration, inferential processing, memory retrieval, and cognitive load, are engaged when humans recognize hallucinated content. However, when hallucinations are not recognized by participants, the brain treats them no differently from non-hallucinated content, indicating that at a subconscious level humans process such unrecognized hallucinations as if they were non-hallucinated content. Furthermore, we conduct a prediction experiment that uses the collected EEG to detect hallucinated content, showing that whether a user has been deceived by MLLM-generated hallucinations can be detected from their brain activity.
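As a rough illustration of the two analyses described in the abstract (condition-averaged ERPs and EEG-based detection of hallucinated content), the sketch below assumes preprocessed, word-locked EEG epochs and binary condition labels. The file names, array shapes, and the logistic-regression classifier are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical preprocessed data: epochs time-locked to each critical word,
# shaped (n_trials, n_channels, n_times), with labels
# 1 = hallucination word, 0 = non-hallucination word.
X = np.load("epochs.npy")    # assumed file name
y = np.load("labels.npy")    # assumed file name

# --- ERP comparison: average epochs within each condition ---
erp_halluc = X[y == 1].mean(axis=0)     # (n_channels, n_times)
erp_control = X[y == 0].mean(axis=0)
difference_wave = erp_halluc - erp_control  # condition difference over time

# --- Single-trial prediction: can EEG distinguish the two conditions? ---
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(clf, X.reshape(len(X), -1), y, cv=5)
print(f"Cross-validated accuracy: {scores.mean():.3f}")
```

A simple linear classifier on flattened epochs is only one plausible choice; any standard EEG decoding model could stand in for the prediction experiment the abstract describes.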
Supplementary Material: pdf
Primary Area: applications to neuroscience & cognitive science
Submission Number: 19028