Can We Trust LLMs for Medical Diagnosis? Evaluating the Robustness of Clinical Reasoning under Perturbation

ACL ARR 2026 January Submission 10575 Authors

06 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: Medical Diagnosis, Clinical Reasoning, Robustness
Abstract: Medical large language models (LLMs) have been widely proposed for medical diagnosis and can achieve relatively high accuracy. However, existing evaluations often prioritize top-1 accuracy while overlooking the fragility of the reasoning process on realistic clinical notes, which are plagued by noise and inconsistent formats. To examine the robustness of this reasoning process, this study proposes an adversarial perturbation framework with two strategies: semantic pruning of clinical notes, which probes the attention limitations of LLMs, and noise injection, which tests the anti-interference capability of the reasoning process. Experiments on a realistic clinical dataset confirm both the attention limitations and the inadequate anti-interference capability of current LLMs. These findings shed light on how the models actually reason and suggest a feasible path toward more trustworthy medical LLMs.
Paper Type: Short
Research Area: NLP Applications
Research Area Keywords: healthcare applications, clinical NLP
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 10575
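The abstract does not specify how the two perturbation strategies are implemented, so the sketch below is purely illustrative: the sentence-splitting regex, the length-based stand-in for semantic salience, and the distractor sentences are all assumptions, not the authors' method.

```python
# Hypothetical sketch of the two perturbations named in the abstract:
# semantic pruning and noise injection. Scoring heuristics, parameter
# names, and example text are illustrative assumptions only.
import random
import re


def _split_sentences(note: str) -> list[str]:
    """Naive sentence splitter on terminal punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", note) if s.strip()]


def semantic_prune(note: str, keep_ratio: float = 0.5) -> str:
    """Drop a fraction of sentences from a clinical note.

    "Salience" is approximated here by sentence length, purely as a
    stand-in for whatever relevance scoring the paper actually uses.
    """
    sentences = _split_sentences(note)
    k = max(1, int(len(sentences) * keep_ratio))
    # Keep the k longest sentences, preserving their original order.
    kept = set(
        sorted(range(len(sentences)), key=lambda i: len(sentences[i]), reverse=True)[:k]
    )
    return " ".join(s for i, s in enumerate(sentences) if i in kept)


def inject_noise(note: str, distractors: list[str], n: int = 2, seed: int = 0) -> str:
    """Insert clinically irrelevant distractor sentences at random positions."""
    rng = random.Random(seed)
    sentences = _split_sentences(note)
    for d in rng.sample(distractors, min(n, len(distractors))):
        sentences.insert(rng.randrange(len(sentences) + 1), d)
    return " ".join(sentences)


if __name__ == "__main__":
    note = (
        "65-year-old male with crushing chest pain radiating to the left arm. "
        "Troponin elevated. ECG shows ST elevation in leads II, III, aVF. "
        "History of hypertension and smoking."
    )
    distractors = [
        "Patient reports a mild seasonal pollen allergy.",
        "Family recently adopted a cat.",
    ]
    print(semantic_prune(note, keep_ratio=0.5))
    print(inject_noise(note, distractors, n=2))
```

In this kind of setup, robustness would be measured by diagnosing both the original and the perturbed note and comparing how often the model's answer (or its reasoning chain) changes.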