Abstract: Retrieval-Augmented Generation (RAG) models have emerged as powerful tools for information-seeking tasks across domains. However, their reliance on external retrieval mechanisms introduces new pathways for bias that remain underexplored. In this work, we present [benchmark name], a new benchmark designed to systematically evaluate confirmation bias in RAG pipelines. Unlike previous efforts that focused solely on model outputs, our approach decomposes the RAG pipeline to examine three critical stages: (1) the degree to which the retriever introduces biased evidence, (2) how the reranker may further amplify such bias, and (3) the extent to which the final generation is steered by the retrieved evidence. We construct 270 adversarial prompt pairs using a red-teaming-inspired approach in the scientific domain, a setting where subtle biases can lead to significant misinformation. By analyzing model responses and their stance alignment with input prompts, we find that multiple state-of-the-art RAG systems exhibit confirmation bias across all three stages, with the reranker often reinforcing biases introduced during retrieval. Our benchmark enables fine-grained diagnosis of confirmation bias in RAG pipelines and offers a foundation for developing more robust and fair information-seeking systems.
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: Ethics, Bias, and Fairness; Human-Centered NLP; Resources and Evaluation
Contribution Types: NLP engineering experiment, Data resources
Languages Studied: English
Submission Number: 3305