Surface Fairness, Deep Bias: Quantifying Epistemic Injustice in the Clinical Reasoning of Large Language Models
Keywords: bias detection, medical application, LLM
Abstract: Large language models (LLMs) have recently achieved remarkable performance on medical benchmarks, leading to their increasing deployment in clinical decision support and patient consultation systems. However, LLMs trained on real-world corpora inevitably inherit latent societal biases, particularly the gender biases prevalent in clinical practice, which can perpetuate inequities and threaten patient safety. Existing bias evaluations of LLMs in the medical domain focus primarily on surface-level disparities in final outputs, overlooking subtler biases embedded in the models' reasoning processes. To bridge this gap, we propose Clinical Audit for Reasoning Equity ($\textbf{CARE}$), a multi-dimensional evaluation framework designed to detect latent epistemic injustice in LLMs. CARE moves beyond accuracy metrics to audit reasoning trajectories through three complementary lenses: outcome metrics, counterfactual semantic drift, and a double-stage Chain-of-Thought (CoT) audit. To support this evaluation, we introduce the $\textbf{MedFair-CF}$ dataset, a strictly controlled counterfactual benchmark comprising 23,096 samples across five clinical specialties, derived from over 500,000 medical records. Our experiments on state-of-the-art (SOTA) LLMs reveal that even when surface-level predictions appear consistent, models exhibit significant semantic biases along multiple dimensions, including diagnostic confidence, symptom attribution, and logical transitions. Crucially, we find that these implicit biases are driven not by reduced reasoning effort but by the activation of specific stereotype heuristics. These findings provide new insights for guiding the development of more equitable and safer language models.
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: model bias/fairness evaluation
Contribution Types: NLP engineering experiment, Data resources
Languages Studied: English, Chinese
Submission Number: 4494