RCD: Retrieval-augmented Contextual Decoding for Truthful Generation

ICLR 2026 Conference Submission 14641 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Large Language Models, Hallucination Mitigation, Decoding Strategy
TL;DR: We introduce a context-aware decoding method that retrieves logits from a small reference set to improve LLM truthfulness efficiently, achieving consistent gains across multiple QA benchmarks without model retraining.
Abstract: Ensuring truthfulness in large language models (LLMs) remains a critical challenge for reliable text generation. While supervised fine-tuning and reinforcement learning with human feedback have shown promise, they require substantial amounts of annotated data and computational resources, limiting scalability. In contrast, decoding-time interventions offer lightweight alternatives without model retraining. However, existing decoding strategies often face issues such as prompt sensitivity, limited generalization, or dependence on internal model states. We propose a context-aware adaptive decoding method that leverages a compact reference grounding space, built from as few as 10 annotated examples and comprising pairs of context embeddings and next-token logits from truthful responses, to enable retrieval-based logit shaping during inference. At each decoding step, our method retrieves the top-N semantically similar contexts and aggregates their associated next-token logits to modify the LLM's logits. Across three open-ended question-answering benchmarks, our approach achieves a 2.4% average improvement on TruthfulQA and further outperforms existing baselines on both Biographies and WikiQA. Experimental results also demonstrate cross-task generalization, with TruthfulQA-derived grounding enhancing biography generation. Our scalable and efficient method requires only a single generation pass, highlighting the potential of context-aware decoding for factual reliability in LLMs.
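For intuition, below is a minimal sketch of the retrieval-based logit shaping step described in the abstract. The function and parameter names (shape_logits, alpha, top_n), the cosine similarity measure, the softmax weighting over retrieved items, and the convex mixing of retrieved and base logits are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def cosine_sim(query, refs):
    # Cosine similarity between one query embedding and a matrix of reference embeddings.
    query = query / (np.linalg.norm(query) + 1e-8)
    refs = refs / (np.linalg.norm(refs, axis=1, keepdims=True) + 1e-8)
    return refs @ query

def shape_logits(base_logits, ctx_emb, ref_embs, ref_logits, top_n=5, alpha=0.5):
    """Mix the LLM's next-token logits with logits retrieved from a small
    reference set of (context embedding, next-token logits) pairs.
    ref_embs: (num_refs, dim) context embeddings from truthful responses.
    ref_logits: (num_refs, vocab_size) stored next-token logits for those contexts."""
    sims = cosine_sim(ctx_emb, ref_embs)                     # similarity to each stored context
    idx = np.argsort(sims)[-top_n:]                          # indices of the top-N most similar contexts
    weights = np.exp(sims[idx]) / np.exp(sims[idx]).sum()    # softmax weights over retrieved items
    retrieved = weights @ ref_logits[idx]                    # similarity-weighted aggregate of stored logits
    return (1 - alpha) * base_logits + alpha * retrieved     # convex combination with the model's own logits
```

In this sketch, alpha controls how strongly the reference grounding space influences each decoding step; the actual aggregation rule and retrieval details may differ in the paper.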
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Submission Number: 14641