Detecting Vision-Language Model Hallucinations before Generation

ACL ARR 2025 July Submission 1205 Authors

29 Jul 2025 (modified: 20 Aug 2025) · ACL ARR 2025 July Submission · CC BY 4.0
Abstract: Object hallucination is a significant challenge that undermines the reliability of Vision-Language Models (VLMs). Current methods for evaluating hallucination often require computationally expensive full-sequence generation, making rapid assessment and large-scale analysis difficult. We introduce HALP (HALlucination Prediction via Probing), a novel framework that efficiently estimates a VLM's propensity to hallucinate objects without requiring full caption generation. HALP trains a lightweight probe on internal VLM representations extracted after image processing but before autoregressive decoding. HALP offers a new paradigm for efficient VLM evaluation, a better understanding of how VLMs internally represent information related to grounding and hallucination, and the potential for real-time assessment of hallucination risk.
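The abstract describes the probing setup only at a high level; as a rough sketch, and not the authors' implementation, the PyTorch snippet below illustrates what a lightweight probe over pre-decoding representations could look like. The class name `HallucinationProbe`, the mean-pooling choice, the hidden size of 4096, and the binary labels are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HallucinationProbe(nn.Module):
    """Lightweight linear probe: pooled hidden state -> hallucination risk in [0, 1]."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.classifier = nn.Linear(hidden_dim, 1)

    def forward(self, pooled_state: torch.Tensor) -> torch.Tensor:
        # pooled_state: (batch, hidden_dim), e.g. hidden states mean-pooled
        # over the prompt positions, taken after the image is encoded but
        # before any caption token is decoded.
        return torch.sigmoid(self.classifier(pooled_state)).squeeze(-1)

# Hypothetical usage with dummy tensors standing in for extracted representations.
probe = HallucinationProbe(hidden_dim=4096)   # 4096 is an assumed hidden width
pooled = torch.randn(8, 4096)                 # batch of pooled pre-decoding states
labels = torch.randint(0, 2, (8,)).float()    # 1 = full generation would hallucinate
loss = F.binary_cross_entropy(probe(pooled), labels)
loss.backward()
```

Because the probe is a single linear layer over a fixed-size pooled vector, scoring an image-prompt pair costs one forward pass of the VLM with no decoding, which is what makes the pre-generation assessment cheap.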
Paper Type: Short
Research Area: Multimodality and Language Grounding to Vision, Robotics and Beyond
Research Area Keywords: Interpretability, Explainability, Hallucination in VLMs, Vision Language Models, Multimodal AI
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: English
Previous URL: https://openreview.net/forum?id=uEujseoHgQ
Explanation Of Revisions PDF: pdf
Reassignment Request Area Chair: Yes, I want a different area chair for our submission
Reassignment Request Reviewers: Yes, I want a different set of reviewers
Justification For Not Keeping Action Editor Or Reviewers: A new set of reviewers would bring a fresh perspective, without bias or prior knowledge of our previous submission, allowing them to evaluate this paper on its own merits.
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: Yes
A2 Elaboration: Ethical Considerations
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: Yes, all creators of the artifacts used in this work are acknowledged in the References section, where full citations are provided.
B2 Discuss The License For Artifacts: No
B2 Elaboration: Creative Commons License
B3 Artifact Use Consistent With Intended Use: Yes
B3 Elaboration: Creative Commons License
B4 Data Contains Personally Identifying Info Or Offensive Content: No
B4 Elaboration: Creative Commons License
B5 Documentation Of Artifacts: No
B5 Elaboration: Creative Commons License
B6 Statistics For Data: N/A
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: The Appendix reports model parameter counts but not the total computational budget.
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: In the Appendix
C3 Descriptive Statistics: Yes
C3 Elaboration: In the Appendix
C4 Parameters For Packages: No
C4 Elaboration: Default settings were used.
D Human Subjects Including Annotators: No
D1 Instructions Given To Participants: N/A
D2 Recruitment And Payment: N/A
D3 Data Consent: N/A
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: N/A
E Ai Assistants In Research Or Writing: No
E1 Information About Use Of Ai Assistants: N/A
Author Submission Checklist: Yes
Submission Number: 1205