Knowing When Not to Answer: Mitigating Social Bias in LLMs via Epistemic Abstention

TMLR Paper 7137 Authors

24 Jan 2026 (modified: 08 Feb 2026) · Under review for TMLR · CC BY 4.0
Abstract: The growing application of Large Language Models (LLMs) to social contexts has led to an increase in unjustifiable social-group attributions driven by stereotype-based responses, especially for questions with little supporting evidence or ambiguous context. Lacking sufficient evidence, models often hallucinate socially grounded inferences, undermining fairness and trust. In this work, we mitigate social bias under ambiguity via epistemic uncertainty. We introduce BHARATBBQ-R, a rationale-augmented extension of BHARATBBQ that explicitly annotates evidential sufficiency or absence, and we propose \textbf{EPIK} (\textbf{E}pistemic \textbf{P}runing under \textbf{I}mplicit \textbf{K}nowledge), an epistemic calibration framework that detects contextual insufficiency and enforces principled abstention when evidence is inadequate, while maintaining performance on unambiguous cases. Whereas prior bias mitigation techniques focus on suppressing stereotypes or debiasing representations, our framework reframes biased behavior as a failure of epistemic humility. Experiments across five open-source LLMs show that EPIK substantially reduces the bias score in ambiguous contexts (from 1.41–1.52 to 0.86–0.98) while maintaining accuracy on unambiguous instances. These results establish that epistemic calibration enables selective suppression of stereotype-driven inference without indiscriminately refusing valid social reasoning.
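The abstract's core idea of abstaining when context is evidentially insufficient can be illustrated with a minimal uncertainty-thresholded abstention rule. This is a hypothetical sketch, not the paper's EPIK implementation: the function name, the normalized-entropy criterion, and the threshold value are all assumptions for illustration.

```python
import math

def should_abstain(option_probs, entropy_threshold=0.9):
    """Abstain when the predictive distribution over answer options is
    too uncertain, measured by normalized Shannon entropy.
    Hypothetical illustration of uncertainty-based abstention; the
    threshold of 0.9 is an arbitrary choice, not from the paper."""
    n = len(option_probs)
    entropy = -sum(p * math.log(p) for p in option_probs if p > 0)
    normalized = entropy / math.log(n)  # in [0, 1]; 1 = uniform
    return normalized >= entropy_threshold

# Near-uniform distribution (ambiguous context): abstain
print(should_abstain([0.34, 0.33, 0.33]))  # True
# Confident distribution (unambiguous context): answer
print(should_abstain([0.9, 0.05, 0.05]))   # False
```

Under such a rule, a near-uniform distribution over answer options (as expected under an ambiguous context) triggers abstention, while a confidently peaked distribution does not, matching the selective behavior the abstract describes.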
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Yan_Liu1
Submission Number: 7137