Keywords: Multimodal Large Reasoning Models, Hallucination Mitigation, Reasoning
TL;DR: We propose a lightweight, plug-and-play method for mitigating hallucinations in multimodal large reasoning models (MLRMs) by decomposing perception and reasoning stages and regulating functional attention heads.
Abstract: Multimodal large reasoning models (MLRMs) are rapidly advancing vision-language reasoning and are emerging as a foundation for cross-modal intelligence.
However, hallucination remains a persistent failure mode, manifesting as erroneous reasoning chains and misinterpretations of visual content.
In this study, we observe that attention heads exhibit a staged division of labor: **shallow** heads predominantly serve perception, while **deeper** heads shift toward symbolic reasoning. This division exposes two major causes of hallucination: perceptual bias and reasoning drift.
To address these issues, we propose a lightweight and interpretable two-step plugin, Functional Head Identification and Class-conditioned Rescaling, which locates perception- and reasoning-oriented heads and regulates their contributions without retraining.
Evaluations on three real-world MLRMs (`Kimi-VL`, `Ocean-R1`, `R1-Onevision`), six benchmarks across three domains, and four baselines show that our plugin achieves an average improvement of **5\%** and up to **15\%**, with **less than 1\% additional computation** and only **9\%** of baseline latency. Our approach is fully model-agnostic and significantly enhances both the reliability and interpretability of off-the-shelf MLRMs, thereby enabling their safe deployment in high-stakes applications. Our code is available at https://anonymous.4open.science/r/Functional-Attention-Control.
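To make the two-step mechanism in the abstract concrete, the sketch below shows one plausible way functional head identification and class-conditioned rescaling could be wired into a transformer-style MLRM. It is not the paper's implementation: the per-head statistic `head_scores`, the threshold `tau`, and the scaling factors `alpha_percep` / `alpha_reason` are hypothetical placeholders used purely for illustration.

```python
# Minimal illustrative sketch (assumptions, not the paper's actual procedure):
# heads are split by a calibration-time statistic, then their outputs are
# rescaled class-conditionally at inference time, with no retraining.
import torch


def identify_functional_heads(head_scores: torch.Tensor, tau: float = 0.5):
    """Split heads into perception- vs reasoning-oriented sets.

    head_scores: (num_layers, num_heads) tensor of a per-head statistic,
    e.g. average attention mass on visual tokens collected on a small
    calibration set (hypothetical choice of statistic).
    """
    perception_mask = head_scores > tau   # heads attending mostly to image tokens
    reasoning_mask = ~perception_mask     # heads attending mostly to text / CoT tokens
    return perception_mask, reasoning_mask


def class_conditioned_rescale(head_outputs: torch.Tensor,
                              perception_mask: torch.Tensor,
                              alpha_percep: float = 1.1,
                              alpha_reason: float = 0.9) -> torch.Tensor:
    """Rescale one layer's per-head outputs before the output projection.

    head_outputs:     (batch, num_heads, seq_len, head_dim)
    perception_mask:  (num_heads,) boolean mask for this layer.
    """
    scale = torch.where(
        perception_mask,
        torch.tensor(alpha_percep, device=head_outputs.device),
        torch.tensor(alpha_reason, device=head_outputs.device),
    )
    return head_outputs * scale.view(1, -1, 1, 1).to(head_outputs.dtype)


if __name__ == "__main__":
    # Toy demo with random tensors (no real model involved).
    torch.manual_seed(0)
    scores = torch.rand(32, 16)                # 32 layers, 16 heads
    percep, reason = identify_functional_heads(scores)
    outputs = torch.randn(1, 16, 8, 64)        # one layer's head outputs
    rescaled = class_conditioned_rescale(outputs, percep[0])
    print(rescaled.shape)                      # torch.Size([1, 16, 8, 64])
```

In an actual deployment, such rescaling would most naturally be applied through forward hooks on each attention layer's per-head outputs before the output projection, which is what would keep the plugin lightweight and retraining-free.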
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 7208