Abstract: In real-world pathology, diagnosis often involves a two-stage reasoning process: an initial differential diagnosis based on preliminary evidence, followed by a definitive diagnosis after further examinations. Existing research rarely reflects this workflow, instead treating diagnosis as a single-turn task. This work explicitly models the diagnostic process in pathology as a continuous two-turn dialogue with large language models (LLMs). To bridge the evidence gap between the two stages, we propose a Retrieval-Augmented Generation-based Examination Simulation (RAGES) method that simulates the results of follow-up examinations requested in the first turn, drawing on existing records and external knowledge. We curate a high-quality training dataset of initial and follow-up consultations and evaluate LLMs on the two-turn consultation task using a separate multilingual dataset. Our experiments show that (1) LLMs significantly improve diagnostic accuracy when given additional evidence, (2) our model outperforms or matches larger and reasoning-enhanced baselines, and (3) RAGES generates more plausible examination results than pure LLM generation.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: healthcare applications
Contribution Types: Model analysis & interpretability
Languages Studied: Chinese, English
Submission Number: 1533