Abstract: Retrieval-augmented generation (RAG) has emerged as a powerful approach for improving the factual accuracy of large language models (LLMs), particularly by mitigating hallucinations, incorporating up-to-date information, and enhancing generalization across domains. However, current RAG methods often suffer from limitations stemming from their reliance on extended input prompts and their dependence on supervised retrievers for external knowledge access. In this work, we introduce Keys-to-Knowledge (K2K), a novel retrieval framework that shifts the paradigm from external document retrieval to internal, key-based knowledge retrieval within the LLM itself. K2K employs lightweight knowledge infusion to encode essential information directly into the model's parameter space, enabling the use of its internal key-value memory for retrieval. To improve the quality of query representations, we propose an activation-guided probe construction method. Furthermore, we introduce a cross-attention reranking mechanism to extract diverse and relevant information from the model's enriched internal knowledge. Experimental results on health outcome prediction tasks demonstrate that K2K significantly improves both the efficiency and effectiveness of knowledge-intensive tasks, offering a promising alternative to traditional RAG approaches by eliminating the need for external retrieval pipelines.
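To make the abstract's notion of internal key-value memory retrieval concrete, below is a minimal Python sketch of scoring a query probe against a transformer FFN's weight rows (treating them as "keys") and aggregating the matching "values". All names (W_in, W_out, probe) and the scoring scheme are illustrative assumptions; the abstract does not specify K2K's actual probe construction or retrieval mechanics.

import torch

torch.manual_seed(0)

d_model, d_ff, top_k = 64, 256, 4

# FFN weights viewed as a key-value memory: rows of W_in act as
# "keys", columns of W_out as the corresponding "values".
W_in = torch.randn(d_ff, d_model)   # keys:   one per memory slot
W_out = torch.randn(d_model, d_ff)  # values: one per memory slot

# Stand-in for an activation-guided probe: here, just a random hidden
# state; the paper's probe construction method is not detailed here.
probe = torch.randn(d_model)

# Score every internal key against the probe; keep the top-k slots.
scores = W_in @ probe                      # (d_ff,)
top_scores, top_idx = scores.topk(top_k)   # best-matching memory slots

# Retrieved "knowledge" is a softmax-weighted sum of the top values.
weights = torch.softmax(top_scores, dim=0)
retrieved = W_out[:, top_idx] @ weights    # (d_model,)

print(top_idx.tolist(), retrieved.shape)

This sketch stops at retrieval; the cross-attention reranking step mentioned in the abstract would operate over the selected slots before aggregation.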
Paper Type: Long
Research Area: Information Retrieval and Text Mining
Research Area Keywords: retrieval, language model, key memory
Languages Studied: English
Keywords: internal knowledge retrieval, LLM
Submission Number: 4680