Conflict-Suppressed RAG: A Simple Decoding-Time Framework for Faithful Retrieval-Augmented Generation
Keywords: Retrieval-Augmented Generation, Knowledge Conflict, Large Language Models, Faithfulness
Abstract: Retrieval-Augmented Generation (RAG) improves the factual accuracy of large language models (LLMs) by grounding responses in external evidence. However, when retrieved context conflicts with models’ internal parametric knowledge, LLMs may still generate answers that contradict the provided evidence, posing a key challenge to contextual faithfulness and the reliability of RAG systems. Motivated by recent mechanistic findings on the distinct propagation and progressive accumulation of parametric and contextual signals in LLMs, we propose Conflict-Suppressed RAG (CSRAG), a simple, training-free, decoding-time framework for resolving knowledge conflicts. CSRAG biases generation toward retrieved evidence by suppressing tokens associated with parametric knowledge while boosting tokens from the context via two complementary logits processors. Experiments on six challenging faithfulness benchmarks demonstrate that CSRAG consistently achieves state-of-the-art or near state-of-the-art performance across multiple backbone LLMs, while remaining fully training-free and lightweight.
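The abstract does not spell out the form of the two logits processors; as a loose illustration only, a decoding-time scheme of this flavor can be sketched as additive logit adjustments over token sets drawn from the retrieved context versus the model's parametric answer (the function name, the alpha/beta weights, and the way the token sets are built are all hypothetical, not taken from the paper):

```python
def conflict_suppressed_logits(logits, context_ids, parametric_ids,
                               alpha=2.0, beta=2.0):
    """Hypothetical sketch: boost tokens appearing in the retrieved context,
    suppress tokens tied only to parametric knowledge."""
    adjusted = list(logits)
    for t in context_ids:                      # boost contextual tokens
        adjusted[t] += alpha
    for t in set(parametric_ids) - set(context_ids):
        adjusted[t] -= beta                    # suppress parametric-only tokens
    return adjusted

# Toy vocabulary of 5 tokens: id 1 occurs in the retrieved context,
# id 3 only in the model's parametric answer. Before adjustment, token 3 wins.
logits = [0.0, 1.0, 0.5, 2.0, 0.1]
adjusted = conflict_suppressed_logits(logits, context_ids={1}, parametric_ids={3})
print(max(range(len(adjusted)), key=adjusted.__getitem__))  # prints 1 (context token)
```

In a real generation loop such a function would be applied at every decoding step (e.g. as a `LogitsProcessor` in Hugging Face `transformers`), which keeps the approach training-free and model-agnostic as the abstract claims.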
Paper Type: Long
Research Area: Retrieval-Augmented Language Models
Research Area Keywords: retrieval-augmented generation, factuality, inference methods, knowledge-augmented methods, prompting, robustness
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Approaches low compute settings-efficiency, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 390