Dx-LLM: Two-layer Retrieval-Augmented Multilingual Diagnosis System

ACL ARR 2024 June Submission1646 Authors

14 Jun 2024 (modified: 21 Jul 2024), ACL ARR 2024 June Submission, CC BY 4.0
Abstract: Automatic diagnosis (AD) is a pivotal task in healthcare in which patient symptoms are analyzed to diagnose disease. Traditional approaches extract features of symptoms and diseases from collected patient cases. However, collecting real-life patient data is difficult, and the resulting clinical datasets are often incomplete, which can lead to misdiagnosis, especially for new diseases or unrecorded symptoms. Recently, retrieval-augmented large language models (RA-LLMs) have shown significant promise on knowledge-intensive Natural Language Processing (NLP) tasks. To reduce reliance on previously seen data, we propose Dx-LLM, a two-layer AD system built on RA-LLMs. In the first layer, Dx-LLM constructs a disease-symptom knowledge graph from an external dataset of disease symptom descriptions and filters it to identify candidate diseases that match the patient's symptoms. In the second layer, it uses the language understanding and generation capabilities of LLMs to re-rank these candidates and produce refined diagnoses. Narrowing the candidate set in the first layer reduces the computational load on the second-layer LLM. Dx-LLM achieves hit@10 scores of 71.41% and 70.38% across 1058 diseases on English and Chinese datasets, consistently outperforming state-of-the-art baselines.
Paper Type: Long
Research Area: Dialogue and Interactive Systems
Research Area Keywords: Dialogue and Interactive Systems, NLP Applications
Contribution Types: NLP engineering experiment
Languages Studied: English, Chinese
Submission Number: 1646
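
The two-layer pipeline and the hit@k metric described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy knowledge graph `KG`, the function names, the overlap-based filter, and the Jaccard scorer (which stands in for the second-layer LLM) are all assumptions made purely for illustration.

```python
from typing import Callable, Dict, List, Set

# Toy disease -> symptom map standing in for the disease-symptom knowledge graph.
# Every entry is illustrative and not taken from the paper's datasets.
KG: Dict[str, Set[str]] = {
    "influenza": {"fever", "cough", "fatigue"},
    "common cold": {"cough", "sneezing"},
    "migraine": {"headache", "nausea"},
}


def filter_candidates(kg: Dict[str, Set[str]], symptoms: Set[str], top_n: int) -> List[str]:
    """Layer 1 (assumed form): keep diseases sharing a symptom, ordered by overlap."""
    overlap = {disease: len(known & symptoms) for disease, known in kg.items()}
    matched = [disease for disease, n in overlap.items() if n > 0]
    # Sort by descending overlap, breaking ties alphabetically for determinism.
    return sorted(matched, key=lambda d: (-overlap[d], d))[:top_n]


def rerank(
    candidates: List[str],
    symptoms: Set[str],
    score: Callable[[str, Set[str]], float],
) -> List[str]:
    """Layer 2: re-order candidates by a scoring callable (a placeholder for an LLM)."""
    return sorted(candidates, key=lambda d: score(d, symptoms), reverse=True)


def jaccard_score(disease: str, symptoms: Set[str]) -> float:
    """Hypothetical stand-in scorer: Jaccard similarity of symptom sets."""
    known = KG[disease]
    return len(known & symptoms) / len(known | symptoms)


def hit_at_k(ranked: List[str], gold: str, k: int) -> int:
    """Return 1 if the gold disease appears in the top-k ranked list, else 0."""
    return int(gold in ranked[:k])


patient = {"fever", "cough"}
candidates = filter_candidates(KG, patient, top_n=2)
ranked = rerank(candidates, patient, jaccard_score)
```

Averaging `hit_at_k` over all test cases gives the dataset-level hit@k figure. Because the second layer only scores the `top_n` candidates that survive the first, the expensive scorer never sees the full disease list, which is the cost saving the abstract points to.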