Keywords: LLM/AI agents, retrieval-augmented generation, robustness
TL;DR: We introduce SR-DCR, a lightweight framework that integrates token-level self-confidence with an asymmetric multi-agent debate to resolve knowledge conflicts between an LLM's parametric knowledge and its contextual input.
Abstract: Large language models frequently encounter conflicts between their parametric knowledge and contextual input, often resulting in factual inconsistencies or hallucinations. We propose Self-Reflective Debate for Contextual Reliability (SR-DCR), a lightweight framework that integrates token-level self-confidence with an asymmetric multi-agent debate to adjudicate such conflicts. A critic, deprived of context, challenges a defender who argues from the given passage; a judge model evaluates the debate and determines the context's reliability. The final answer is selected by combining the verdict with model confidence. Experiments on the ClashEval benchmark show that SR-DCR consistently improves robustness to misleading context while recovering accuracy on trustworthy inputs, outperforming both classical debate and confidence-only baselines with minimal computational overhead.
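To make the adjudication step concrete, below is a minimal Python sketch of how the final-answer selection described in the abstract might be wired together. The `run_debate` callable, the `confidence_threshold` hyperparameter, and the low-confidence tie-breaking default are illustrative assumptions, not the paper's specification; any real implementation would wrap actual LLM calls behind these stubs.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class DebateVerdict:
    context_reliable: bool   # judge's ruling on the passage
    transcript: str          # full critic/defender exchange

def sr_dcr_answer(
    question: str,
    context: str,
    parametric_answer: str,          # model's answer without the passage
    parametric_confidence: float,    # token-level self-confidence in [0, 1]
    contextual_answer: str,          # model's answer given the passage
    run_debate: Callable[[str, str], DebateVerdict],
    confidence_threshold: float = 0.8,  # assumed hyperparameter
) -> str:
    """Select a final answer by combining the debate verdict with
    model confidence. The exact combination rule is an assumption."""
    if parametric_answer == contextual_answer:
        # No conflict: parametric knowledge and context agree.
        return contextual_answer

    # Asymmetric debate: the critic never sees the passage, the
    # defender argues from it, and a judge rules on reliability.
    verdict = run_debate(question, context)

    if verdict.context_reliable:
        return contextual_answer
    if parametric_confidence >= confidence_threshold:
        # Judge distrusts the context and the model is self-confident:
        # fall back to parametric knowledge.
        return parametric_answer
    # Low confidence on an unreliable-looking context: default to the
    # passage (one plausible tie-break, not specified in the abstract).
    return contextual_answer
```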
Archival Status: Non-archival (not included in proceedings)
Submission Number: 42