Abstract: Retrieval-Augmented Generation (RAG) frameworks mitigate hallucinations in Large Language Models (LLMs) by integrating external knowledge, yet face two critical challenges: (1) the distribution gap between user queries and knowledge bases, and (2) incomplete coverage of the knowledge required for complex queries. Existing solutions either require task-specific annotations or neglect the inherent connections among the query, the retrieved context, and the missing knowledge. We propose a Missing Knowledge RAG Framework that resolves both issues synergistically through Chain-of-Thought reasoning. Leveraging open-source LLMs, our method generates structured missing-knowledge queries in a single inference pass while aligning the query and knowledge-base distributions, and integrates the reasoning traces into answer generation. Experiments on open-domain medical and general QA datasets demonstrate significant improvements in context recall and answer accuracy. The framework achieves effective knowledge supplementation without additional training, offering enhanced interpretability and robustness for real-world question answering applications.
Paper Type: Long
Research Area: Generation
Research Area Keywords: retrieval-augmented generation, biomedical QA, open-domain QA
Contribution Types: NLP engineering experiment
Languages Studied: English, Chinese
Submission Number: 6982