Abstract: Retrieval-Augmented Generation (RAG) models have emerged as powerful tools for information-seeking tasks across domains. However, their reliance on external retrieval mechanisms introduces new pathways for bias that remain underexplored. In this work, we present [benchmark name], a new benchmark designed to systematically evaluate confirmation bias in RAG pipelines. Unlike previous efforts that focused solely on model outputs, our approach decomposes the RAG pipeline to examine three critical stages: (1) the degree to which the retriever introduces biased evidence, (2) how the reranker may further amplify such bias, and (3) the extent to which the final generation is steered by the retrieved evidence. We construct 270 adversarial prompt pairs using a red-teaming-inspired approach in the scientific domain, a setting where subtle biases can lead to significant misinformation. By analyzing model responses and their stance alignment with input prompts, we find that multiple state-of-the-art RAG systems exhibit confirmation bias across all three stages, with the reranker often reinforcing biases introduced during retrieval. Our benchmark enables fine-grained diagnosis of confirmation bias in RAG pipelines and offers a foundation for developing more robust and fair information-seeking systems.
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: Ethics, Bias, and Fairness; Human-Centered NLP; Resources and Evaluation
Contribution Types: NLP engineering experiment, Data resources
Languages Studied: English
Submission Number: 3305