Abstract: Retrieval-Augmented Generation (RAG) frameworks mitigate hallucinations in Large Language Models (LLMs) by integrating external knowledge, yet face two critical challenges: (1) the distribution gap between user queries and knowledge bases, and (2) incomplete coverage of the knowledge required for complex queries. Existing solutions either require task-specific annotations or neglect the inherent connections among the query, the retrieved context, and the missing knowledge. We propose a Missing Knowledge RAG Framework that resolves both issues synergistically through Chain-of-Thought reasoning. Leveraging open-source LLMs, our method generates structured missing-knowledge queries in a single inference pass while aligning the query and knowledge-base distributions, and integrates the reasoning traces into answer generation. Experiments on open-domain medical and general QA datasets demonstrate significant improvements in context recall and answer accuracy. The framework achieves effective knowledge supplementation without additional training, offering enhanced interpretability and robustness for real-world question answering applications.
Paper Type: Long
Research Area: Generation
Research Area Keywords: retrieval-augmented generation, biomedical QA, open-domain QA
Contribution Types: NLP engineering experiment
Languages Studied: English, Chinese
Submission Number: 6982