Med-HVL: Automatic Medical Domain Hallucination Evaluation for Large Vision-Language Models

Published: 29 Feb 2024, Last Modified: 01 Mar 2024 · AAAI 2024 SSS on Clinical FMs · CC BY 4.0
Track: Non-traditional track
Keywords: Large Vision-Language Models, Hallucination, Medical
Abstract: Large Vision-Language Models (LVLMs) have made significant progress in integrating visual and textual data. However, their deployment in the medical domain is impeded by critical hallucination issues, calling for reliable evaluation metrics and methods. We define two novel metrics, Object Hallucination and Domain Knowledge Hallucination, to quantify the hallucination of LVLMs in the medical domain. We propose a scalable, automated evaluation framework, Med-HVL, to assess and mitigate hallucinations at both the object and domain-knowledge levels. We reveal a significant presence of hallucinations in LVLMs, emphasizing the need for domain-specific adaptation and finetuning to enhance their reliability for medical applications.
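The abstract does not define how the two metrics are computed. As an illustration only, an object-level hallucination score of this kind is often computed CHAIR-style: the fraction of objects mentioned in the model's output that do not appear in the ground-truth annotations. The function below is a hypothetical sketch under that assumption, not the paper's actual formula.

```python
def object_hallucination_rate(generated_objects, ground_truth_objects):
    """Illustrative CHAIR-style metric (assumed, not from the paper):
    fraction of objects the model mentions that are absent from the
    ground-truth object set for the image."""
    mentioned = {o.lower() for o in generated_objects}
    truth = {o.lower() for o in ground_truth_objects}
    if not mentioned:
        return 0.0  # nothing mentioned, nothing hallucinated
    hallucinated = mentioned - truth
    return len(hallucinated) / len(mentioned)
```

For example, if a model describes a chest X-ray as showing a lung, a rib, and a tumor, but the annotations list only lung and rib, the rate is 1/3. A domain-knowledge-level metric would presumably score factual claims against medical knowledge rather than visible objects, but the page gives no detail.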
Presentation And Attendance Policy: I have read and agree with the symposium's policy on behalf of myself and my co-authors.
Ethics Board Approval: No, our research does not involve datasets that need IRB approval or its equivalent.
Data And Code Availability: Yes, we will make data and code available upon acceptance.
Primary Area: Challenges limiting the adoption of modern ML in healthcare
Student First Author: Yes, the primary author of the manuscript is a student.
Submission Number: 41