Detecting Vision-Language Model Hallucinations before Generation

ACL ARR 2025 July Submission 1205 Authors

29 Jul 2025 (modified: 20 Aug 2025) · ACL ARR 2025 July Submission · CC BY 4.0
Abstract: Object hallucination is a significant challenge that undermines the reliability of Vision-Language Models (VLMs). Current methods for evaluating hallucination often require computationally expensive full-sequence generation, making rapid assessment and large-scale analysis difficult. We introduce HALP (HALlucination Prediction via Probing), a novel framework that efficiently estimates a VLM's propensity to hallucinate objects without requiring full caption generation. HALP trains a lightweight probe on internal VLM representations extracted after image processing but before autoregressive decoding. HALP offers a new paradigm for efficient VLM evaluation, a better understanding of how VLMs internally represent information related to grounding and hallucination, and the potential for real-time assessment of hallucination risk.
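The abstract describes the probing setup only at a high level; as a rough sketch, and not the authors' implementation, the PyTorch snippet below illustrates what a lightweight probe over pre-decoding representations could look like. The class name `HallucinationProbe`, the mean-pooling choice, the hidden size of 4096, and the binary labels are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HallucinationProbe(nn.Module):
    """Lightweight linear probe: pooled hidden state -> hallucination risk in [0, 1]."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.classifier = nn.Linear(hidden_dim, 1)

    def forward(self, pooled_state: torch.Tensor) -> torch.Tensor:
        # pooled_state: (batch, hidden_dim), e.g. hidden states mean-pooled
        # over the prompt positions, taken after the image is encoded but
        # before any caption token is decoded.
        return torch.sigmoid(self.classifier(pooled_state)).squeeze(-1)

# Hypothetical usage with dummy tensors standing in for extracted representations.
probe = HallucinationProbe(hidden_dim=4096)   # 4096 is an assumed hidden width
pooled = torch.randn(8, 4096)                 # batch of pooled pre-decoding states
labels = torch.randint(0, 2, (8,)).float()    # 1 = full generation would hallucinate
loss = F.binary_cross_entropy(probe(pooled), labels)
loss.backward()
```

Because the probe is a single linear layer over a fixed-size pooled vector, scoring an image-prompt pair costs one forward pass of the VLM with no decoding, which is what makes the pre-generation assessment cheap.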
Paper Type: Short
Research Area: Multimodality and Language Grounding to Vision, Robotics and Beyond
Research Area Keywords: Interpretability, Explainability, Hallucination in VLMs, Vision Language Models, Multimodal AI
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: English
Previous URL: https://openreview.net/forum?id=uEujseoHgQ
Explanation Of Revisions PDF: pdf
Reassignment Request Area Chair: Yes, I want a different area chair for our submission
Reassignment Request Reviewers: Yes, I want a different set of reviewers
Justification For Not Keeping Action Editor Or Reviewers: A new set of reviewers would bring a fresh perspective, without bias or prior knowledge of our previous submission, allowing them to evaluate this paper on its own merits.
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: Yes
A2 Elaboration: Ethical Considerations
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: Yes, all creators of the artifacts used in this work are acknowledged in the References section, where full citations are provided.
B2 Discuss The License For Artifacts: No
B2 Elaboration: Creative Commons License
B3 Artifact Use Consistent With Intended Use: Yes
B3 Elaboration: Creative Commons License
B4 Data Contains Personally Identifying Info Or Offensive Content: No
B4 Elaboration: Creative Commons License
B5 Documentation Of Artifacts: No
B5 Elaboration: Creative Commons License
B6 Statistics For Data: N/A
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: The Appendix reports model parameter counts but not the total computational budget.
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: In the Appendix
C3 Descriptive Statistics: Yes
C3 Elaboration: In the Appendix
C4 Parameters For Packages: No
C4 Elaboration: Default settings were used.
D Human Subjects Including Annotators: No
D1 Instructions Given To Participants: N/A
D2 Recruitment And Payment: N/A
D3 Data Consent: N/A
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: N/A
E Ai Assistants In Research Or Writing: No
E1 Information About Use Of Ai Assistants: N/A
Author Submission Checklist: Yes
Submission Number: 1205