Can We Trust LLMs for Medical Diagnosis? Evaluating the Robustness of Clinical Reasoning under Perturbation

ACL ARR 2026 January Submission 10575 Authors

06 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: Medical Diagnosis, Clinical Reasoning, Robustness
Abstract: Medical large language models (LLMs) have been widely proposed for medical diagnosis and can achieve relatively high accuracy. However, existing evaluations often prioritize top-1 accuracy while overlooking the fragility of the reasoning process on realistic clinical notes, which are plagued by noise and inconsistent formats. To examine the robustness of this reasoning process, this study proposes an adversarial perturbation framework with two strategies: semantic pruning of clinical notes, which probes the attention limitations of LLMs, and noise injection, which tests the anti-interference capability of the reasoning process. Experiments on a realistic clinical dataset confirm both the attention limitations and the inadequate anti-interference capability of current LLMs. These findings shed light on how the models actually reason and suggest a feasible path toward more trustworthy medical LLMs.
Paper Type: Short
Research Area: NLP Applications
Research Area Keywords: healthcare applications, clinical NLP
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 10575
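The abstract does not specify how the two perturbation strategies are implemented, so the sketch below is purely illustrative: the sentence-splitting regex, the length-based stand-in for semantic salience, and the distractor sentences are all assumptions, not the authors' method.

```python
# Hypothetical sketch of the two perturbations named in the abstract:
# semantic pruning and noise injection. Scoring heuristics, parameter
# names, and example text are illustrative assumptions only.
import random
import re


def _split_sentences(note: str) -> list[str]:
    """Naive sentence splitter on terminal punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", note) if s.strip()]


def semantic_prune(note: str, keep_ratio: float = 0.5) -> str:
    """Drop a fraction of sentences from a clinical note.

    "Salience" is approximated here by sentence length, purely as a
    stand-in for whatever relevance scoring the paper actually uses.
    """
    sentences = _split_sentences(note)
    k = max(1, int(len(sentences) * keep_ratio))
    # Keep the k longest sentences, preserving their original order.
    kept = set(
        sorted(range(len(sentences)), key=lambda i: len(sentences[i]), reverse=True)[:k]
    )
    return " ".join(s for i, s in enumerate(sentences) if i in kept)


def inject_noise(note: str, distractors: list[str], n: int = 2, seed: int = 0) -> str:
    """Insert clinically irrelevant distractor sentences at random positions."""
    rng = random.Random(seed)
    sentences = _split_sentences(note)
    for d in rng.sample(distractors, min(n, len(distractors))):
        sentences.insert(rng.randrange(len(sentences) + 1), d)
    return " ".join(sentences)


if __name__ == "__main__":
    note = (
        "65-year-old male with crushing chest pain radiating to the left arm. "
        "Troponin elevated. ECG shows ST elevation in leads II, III, aVF. "
        "History of hypertension and smoking."
    )
    distractors = [
        "Patient reports a mild seasonal pollen allergy.",
        "Family recently adopted a cat.",
    ]
    print(semantic_prune(note, keep_ratio=0.5))
    print(inject_noise(note, distractors, n=2))
```

In this kind of setup, robustness would be measured by diagnosing both the original and the perturbed note and comparing how often the model's answer (or its reasoning chain) changes.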