The Paradox of Robustness: Decoupling Rule-Based Logic from Affective Noise in High-Stakes Decision-Making

TMLR Paper7569 Authors

18 Feb 2026 (modified: 21 Feb 2026) · Under review for TMLR · CC BY 4.0
Abstract: While Large Language Models (LLMs) are widely documented to be sensitive to minor prompt perturbations and prone to sycophantic alignment, their robustness in consequential, rule-bound decision-making remains under-explored. We uncover a striking "Paradox of Robustness": despite their known lexical brittleness, instruction-tuned LLMs exhibit near-total invariance to emotional framing effects. Using a controlled perturbation framework across three high-stakes domains (healthcare, finance, and education), we find a negligible effect size (Cohen's h = 0.003) compared to the substantial biases observed in analogous human contexts (h in [0.3, 0.8]), approximately two orders of magnitude smaller. This invariance persists across eight models with diverse training paradigms, suggesting that the mechanisms driving sycophancy and prompt sensitivity do not translate to failures in logical constraint satisfaction. While LLMs may be "brittle" to how a query is formatted, they are notably "stable" against why a decision should be biased. We release a benchmark (9 base scenarios × 18 condition variants = 162 unique prompts), code, and data to facilitate reproducible evaluation.
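The abstract reports Cohen's h, the standard effect-size measure for a difference between two proportions, computed via the arcsine transform. As a minimal sketch of how such a value arises (the proportions below are hypothetical, not the paper's measurements), a sub-0.01 h corresponds to decision rates that differ by roughly a tenth of a percentage point:

```python
import math

def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h effect size between two proportions p1, p2 in [0, 1].

    Uses the arcsine (variance-stabilizing) transform:
        h = 2*arcsin(sqrt(p1)) - 2*arcsin(sqrt(p2))
    """
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))

# Hypothetical approval rates under neutral vs. emotionally framed prompts;
# a 0.1-point gap in raw rates yields h on the order of 3e-3.
h_llm = abs(cohens_h(0.900, 0.899))

# Hypothetical human-scale framing bias, for contrast (h well above 0.3).
h_human = abs(cohens_h(0.75, 0.55))
```

By convention (Cohen, 1988), h ≈ 0.2 is a "small" effect, so values near 0.003 are effectively indistinguishable from zero.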
Submission Type: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Yang_Zhang15
Submission Number: 7569