Rethinking LLM Parametric Knowledge as Confidence for Effective and Efficient RAG

18 Sept 2025 (modified: 02 Dec 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Knowledge Boundary, Evaluation, Large Language Models, Retrieval-Augmented Generation, Reranker, Generator
Abstract: Large Language Models (LLMs) tend to generate high-confidence hallucinations when faced with questions beyond their parametric knowledge. Retrieval-Augmented Generation (RAG) alleviates this by incorporating external knowledge, but two challenges remain when answering domain-specific questions: whether the retrieved context is actually useful (effective RAG) and whether retrieval is needed at all (efficient RAG). Both hinge on knowledge boundary awareness, which current methods fail to address adequately: they rely on discrete labels or limited signals and overlook the rich information in LLMs' continuous internal hidden states. To this end, we propose a novel knowledge probing approach for effective and efficient RAG. First, we construct a confidence detection model based on the LLM's internal hidden states to quantify how much a retrieved context raises the model's confidence. We then use this detection model to build a preference dataset and fine-tune a reranker, enabling it to prioritize contexts preferred by the downstream LLM. Additionally, we introduce CBDR, which adaptively triggers retrieval based on the LLM's initial confidence in the original question, reducing knowledge conflicts and improving efficiency. Experimental results show significant improvements in both context-screening accuracy and end-to-end RAG performance: when dynamic retrieval is activated, RAG accuracy increases by 5.6 percentage points (pp) while retrieval cost is reduced by 7.1 pp, substantially enhancing the system's practical utility at competitive accuracy.
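To make the confidence-gated retrieval idea concrete, below is a minimal Python sketch of the pipeline the abstract describes: a small probe over the LLM's hidden states scores confidence on the bare question, and retrieval fires only when that score falls below a threshold. This is an illustration under stated assumptions, not the paper's implementation; the probe architecture (a single linear head), the last-token pooling, the threshold `tau`, and the names `ConfidenceProbe`, `question_confidence`, `answer_with_cbdr`, and `retriever` are all hypothetical.

```python
# Sketch of confidence-gated retrieval in the spirit of CBDR.
# Assumption: a linear probe over the final-layer hidden state has been
# trained separately to predict answer correctness; all names are illustrative.
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, AutoTokenizer

class ConfidenceProbe(nn.Module):
    """Maps a hidden-state vector to a scalar confidence in [0, 1]."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.head(h)).squeeze(-1)

@torch.no_grad()
def question_confidence(model, tokenizer, probe, question: str) -> float:
    """Score the LLM's parametric confidence on the bare question."""
    inputs = tokenizer(question, return_tensors="pt")
    out = model(**inputs, output_hidden_states=True)
    # Use the last token's final-layer hidden state as the probe input
    # (an assumption; the paper may pool hidden states differently).
    h_last = out.hidden_states[-1][0, -1]
    return probe(h_last).item()

def answer_with_cbdr(model, tokenizer, probe, retriever, question: str,
                     tau: float = 0.5) -> str:
    """Retrieve only when parametric confidence falls below threshold tau."""
    conf = question_confidence(model, tokenizer, probe, question)
    if conf >= tau:
        prompt = question                    # answer from parametric knowledge
    else:
        contexts = retriever(question)       # hypothetical retriever callable
        prompt = "\n".join(contexts) + "\n" + question
    ids = tokenizer(prompt, return_tensors="pt")
    gen = model.generate(**ids, max_new_tokens=64)
    return tokenizer.decode(gen[0][ids["input_ids"].shape[1]:],
                            skip_special_tokens=True)

# Example wiring (model choice is illustrative):
# model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")
# tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")
# probe = ConfidenceProbe(model.config.hidden_size)  # trained separately
```

Here `retriever` stands in for any callable returning a list of passages; the same probe score could also supply the preference labels used to fine-tune the reranker, by comparing confidence with and without a candidate context.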
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Submission Number: 11558