MedAttr: Cost-Effective, Diverse Attribution of LLM Responses in Medical Conversation Systems to Reliable Sources
Keywords: Information retrieval, Attribution, Diversity
Abstract: Conversational systems that emulate medical professionals have considerable potential. They can serve as a first point of contact, directing patients to appropriate care units, or assist in interpreting final medical reports. Advances in machine learning, particularly large language models, have played a pivotal role in building such systems capable of reasoning with medical knowledge. However, these models often produce hallucinated content and can present incorrect information with unwarranted confidence. This poses serious risks in critical domains like healthcare and undermines user trust. In this work, we emphasize the importance of attribution, that is, supporting generated responses with reliable source material, and introduce efficient techniques for attributing outputs.
We introduce MEDATTR, a two-stage attribution method that combines embedding-based retrieval with targeted submodular optimization to efficiently select relevant and diverse supporting passages, providing improved attribution quality under a fixed passage budget. This hybrid strategy sets MEDATTR apart from traditional top-K retrieval or re-ranking methods, particularly in medical settings where hallucination risks are high.
Because retrieved passages may lack diversity, we apply submodular function optimization to select a varied set of passages that enhances attribution quality. Evaluations using AutoAIS and disease label matching show marked improvement over baseline methods. Our techniques are also efficient and scalable.
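To make the two-stage idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not name the submodular objective, so the sketch assumes a common choice: a relevance term plus a facility-location coverage term, maximized greedily under a passage budget. The function names (`retrieve_top_k`, `select_diverse`), the trade-off weight `lam`, and the toy vectors in the usage note are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve_top_k(query_vec, passage_vecs, k):
    """Stage 1 (assumed form): indices of the k passages most similar to the query."""
    ranked = sorted(
        range(len(passage_vecs)),
        key=lambda i: cosine(query_vec, passage_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


def select_diverse(query_vec, passage_vecs, candidates, budget, lam=0.2):
    """Stage 2 (assumed form): greedy maximization of
    lam * relevance + (1 - lam) * facility-location coverage.

    Both terms are monotone submodular here because similarities are
    clamped to be non-negative.
    """
    rel = {i: max(0.0, cosine(query_vec, passage_vecs[i])) for i in candidates}
    sim = {
        (i, j): max(0.0, cosine(passage_vecs[i], passage_vecs[j]))
        for i in candidates
        for j in candidates
    }
    selected = []
    # best_cover[i] = similarity of candidate i to its closest selected passage
    best_cover = {i: 0.0 for i in candidates}
    while len(selected) < min(budget, len(candidates)):
        best, best_gain = None, float("-inf")
        for j in candidates:
            if j in selected:
                continue
            # Marginal coverage gain from adding passage j
            cover_gain = sum(
                max(sim[(i, j)] - best_cover[i], 0.0) for i in candidates
            ) / len(candidates)
            gain = lam * rel[j] + (1.0 - lam) * cover_gain
            if gain > best_gain:
                best, best_gain = j, gain
        selected.append(best)
        for i in candidates:
            best_cover[i] = max(best_cover[i], sim[(i, best)])
    return selected
```

As a usage illustration with made-up 2-D vectors: for a query `[1.0, 1.0]` and passages `[[1.0, 0.9], [1.0, 0.8], [0.1, 1.0], [-1.0, 0.2]]`, plain top-2 retrieval returns the two near-duplicate passages, while retrieving three candidates and then calling `select_diverse` with a budget of 2 keeps the most relevant passage and swaps the near-duplicate for the more distinct one. A larger `lam` shifts the balance back toward pure relevance.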
Primary Subject Area: Integration of Imaging and Clinical Data
Secondary Subject Area: Integration of Imaging and Clinical Data
Registration Requirement: Yes
Read CFP & Author Instructions: Yes
Originality Policy: Yes
Single-blind & Not Under Review Elsewhere: Yes
LLM Policy: Yes
Submission Number: 153