MedLayBench-V: A Large-Scale Benchmark for Expert-Lay Semantic Alignment in Medical Vision-Language Models
Keywords: Medical Vision-Language Models, Expert-Lay Alignment, Text Simplification, Multimodal Benchmark, Healthcare NLP, Patient-Centered Care
Abstract: Medical Vision-Language Models (Med-VLMs) have achieved expert-level proficiency in interpreting diagnostic imaging.
However, current models are predominantly trained on professional literature, limiting their ability to communicate findings in the lay register required for patient-centered care.
While text-centric research has actively developed resources for simplifying medical jargon, there is a critical absence of large-scale multimodal benchmarks designed to facilitate lay-accessible medical image understanding.
To bridge this resource gap, we introduce MedLayBench-V, the first large-scale multimodal benchmark dedicated to expert-lay semantic alignment.
Unlike naive simplification approaches that risk hallucination, our dataset is constructed via a Structured Concept-Grounded Refinement (SCGR) pipeline.
This method enforces strict semantic equivalence by integrating Unified Medical Language System (UMLS) Concept Unique Identifiers (CUIs) with micro-level entity constraints.
MedLayBench-V provides a verified foundation for training and evaluating next-generation Med-VLMs capable of bridging the communication divide between clinical experts and patients.
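The micro-level entity constraint sketched in the abstract can be illustrated as a CUI-set equality check between an expert sentence and its lay rewrite. This is a minimal sketch under assumptions: the toy term-to-CUI lexicon and the helper names (`extract_cuis`, `is_semantically_equivalent`) are illustrative, not the paper's actual SCGR pipeline, which would rely on a full UMLS entity linker.

```python
# Hypothetical illustration of a CUI-based semantic-equivalence check.
# The lexicon below is a toy stand-in for a real UMLS entity linker.
TERM_TO_CUI = {
    "myocardial infarction": "C0027051",
    "heart attack": "C0027051",      # lay synonym maps to the same CUI
    "hypertension": "C0020538",
    "high blood pressure": "C0020538",
}

def extract_cuis(text: str) -> set[str]:
    """Collect the CUIs of every lexicon term found in the text."""
    lowered = text.lower()
    return {cui for term, cui in TERM_TO_CUI.items() if term in lowered}

def is_semantically_equivalent(expert: str, lay: str) -> bool:
    """Micro-level entity constraint: the lay rewrite must preserve
    exactly the expert text's concept set -- no dropped or
    hallucinated entities."""
    return extract_cuis(expert) == extract_cuis(lay)

expert = "The scan shows a myocardial infarction with hypertension."
good_lay = "The scan shows a heart attack, and you have high blood pressure."
bad_lay = "The scan shows a heart attack."  # drops the hypertension concept

print(is_semantically_equivalent(expert, good_lay))  # True
print(is_semantically_equivalent(expert, bad_lay))   # False
```

Because lay synonyms resolve to the same CUI as their expert counterparts, this check tolerates register changes while rejecting rewrites that omit or invent clinical concepts.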
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: Multimodal, Healthcare, Simplification, Dataset, Evaluation
Contribution Types: NLP engineering experiment, Data resources
Languages Studied: English
Submission Number: 10426