Large Language Models Are Still Misled by Simple Bias Ensembles

ACL ARR 2026 January Submission8574 Authors

06 Jan 2026 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, CC BY 4.0
Keywords: benchmarking, multi-bias ensemble
Abstract: As large language models (LLMs) have evolved, they have become more robust to individual simple biases. Yet we observe that an ensemble of multiple simple biases still significantly degrades LLM performance. Because real-world data samples are typically confounded by a wide range of biases, LLMs tend to perform unstably when deployed in high-stakes real-world scenarios such as clinical diagnosis and legal document analysis. Previous benchmarks, however, are limited to datasets in which each sample is manually injected with only one type of bias. To bridge this gap, we propose a multi-bias benchmark in which each sample contains multiple types of bias. Experimental results reveal that existing LLMs and debiasing methods perform poorly on this benchmark, highlighting the challenge of eliminating such compounded biases.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: benchmarking
Contribution Types: NLP engineering experiment, Data resources
Languages Studied: English
Submission Number: 8574