DENSE-RAG: Measuring and Improving Context Understanding for Consistent Retrieval-Augmented Generation

18 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Retrieval-Augmented Generation; Large Language Model; Open-book QA
TL;DR: We propose DENSE, an unsupervised method to evaluate an LLM's understanding of context in RAG, and DENSE-RAG, which uses the DENSE evaluation to improve RAG performance on open-book QA tasks.
Abstract: Retrieval-Augmented Generation (RAG) has significantly advanced LLM performance on knowledge-intensive tasks. However, when LLMs misinterpret retrieved content, they often revert to pre-trained parametric knowledge or generate hallucinated responses, undermining RAG's effectiveness. In this work, we explore this problem by proposing DEgree-based uNcertainty with Semantically Equivalent contexts (DENSE), a training-free and model-agnostic method for evaluating LLM understanding of retrieved documents. DENSE constructs semantically equivalent contexts and introduces a degree-based entropy to quantify the semantic uncertainty of responses. Building on DENSE, we further introduce DENSE-RAG, which comprises two training-free DENSE-guided modules: adaptive semantic chunking and iterative context refinement. Experiments on open-book QA datasets show that higher DENSE uncertainty correlates with lower QA performance, validating DENSE as a reliable indicator of LLM context understanding. DENSE-RAG also achieves performance competitive with state-of-the-art baselines without introducing additional models or fine-tuning.
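
The abstract does not give DENSE's exact formulation, so the following is only a minimal sketch of one plausible instantiation: responses generated from semantically equivalent rewrites of the retrieved context are clustered by pairwise semantic similarity, and Shannon entropy over the cluster-size distribution serves as the uncertainty score. The function name `dense_style_entropy`, the `threshold` parameter, and the similarity inputs below are illustrative assumptions, not the paper's method.

```python
import math

def dense_style_entropy(similarity, threshold=0.8):
    """Entropy over semantic clusters of responses.

    `similarity[i][j]` is a pairwise semantic-similarity score in [0, 1]
    (how it is computed, e.g. by an NLI model, is an assumption).
    Responses are linked when their similarity clears `threshold`, and
    clusters are the connected components of the resulting graph.
    """
    n = len(similarity)
    # Union-find over responses: merge any pair judged semantically equivalent.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if similarity[i][j] >= threshold:
                parent[find(i)] = find(j)

    # Cluster-size distribution over the n responses.
    counts = {}
    for i in range(n):
        root = find(i)
        counts[root] = counts.get(root, 0) + 1
    probs = [c / n for c in counts.values()]

    # Shannon entropy: 0 when every response lands in one cluster
    # (a stable reading of the context), log(n) when all responses differ.
    return -sum(p * math.log(p) for p in probs)

# Responses sampled after paraphrasing the retrieved context; the first two
# agree with each other, the third does not -> two clusters, nonzero entropy.
sims = [
    [1.0, 0.9, 0.2],
    [0.9, 1.0, 0.1],
    [0.2, 0.1, 1.0],
]
print(dense_style_entropy(sims))  # -(2/3)log(2/3) - (1/3)log(1/3) ~= 0.64
```

Under this reading, a high score would flag a context the model cannot interpret stably, which is the signal DENSE-RAG's chunking and refinement modules would act on.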
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Submission Number: 12964