Abstract: Retrieval-Augmented Generation (RAG) has proven effective in mitigating hallucinations in Large Language Models (LLMs), particularly for domain-specific question answering tasks. However, policy documents present unique challenges due to their complex structure, professional terminology, and high information density. Traditional embedding-based RAG methods often lose fine-grained information during semantic compression, leading to suboptimal retrieval performance. To address these challenges, we propose a novel Policy-oriented LLM-Enhanced Retrieval and Reranking Framework (P-LRR). Our framework integrates three key innovations: (1) query representation enhancement through LLM-generated hypothetical answers; (2) a multi-agent keyword extraction system for sparse retrieval; and (3) a weighted fusion strategy for multi-route retrieval results. Extensive experiments on the Policy-Corpus dataset demonstrate that P-LRR significantly outperforms baseline methods in retrieval performance, validating its effectiveness for policy document retrieval tasks.
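The abstract does not give the exact fusion formulation, but the third component (weighted fusion of multi-route retrieval results) can be illustrated with a minimal sketch: each retrieval route (e.g., dense embedding search and sparse keyword search) returns per-document scores, which are normalized and combined by a weighted sum. The route names, weights, and min-max normalization below are illustrative assumptions, not the paper's method.

```python
# Minimal sketch of weighted fusion over multi-route retrieval results.
# Route names, weights, and normalization are assumptions for illustration.
from typing import Dict

def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale one route's scores into [0, 1] so routes are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}

def weighted_fusion(routes: Dict[str, Dict[str, float]],
                    weights: Dict[str, float]) -> Dict[str, float]:
    """Combine per-route document scores into one ranking by weighted sum."""
    fused: Dict[str, float] = {}
    for route_name, scores in routes.items():
        w = weights.get(route_name, 0.0)
        for doc_id, s in min_max_normalize(scores).items():
            fused[doc_id] = fused.get(doc_id, 0.0) + w * s
    return fused

if __name__ == "__main__":
    # Hypothetical scores from a dense (embedding) route and a sparse
    # (keyword-based) route over overlapping candidate documents.
    routes = {
        "dense":  {"doc1": 0.82, "doc2": 0.75, "doc3": 0.40},
        "sparse": {"doc2": 11.3, "doc4": 9.8, "doc1": 4.1},
    }
    weights = {"dense": 0.6, "sparse": 0.4}  # assumed route weights
    ranked = sorted(weighted_fusion(routes, weights).items(),
                    key=lambda kv: kv[1], reverse=True)
    for doc_id, score in ranked:
        print(f"{doc_id}: {score:.3f}")
```

In this sketch, normalizing each route's scores before fusion keeps dense similarity scores (typically in [0, 1]) and sparse scores (unbounded) on a common scale; the weights then control how much each route contributes to the final ranking.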
External IDs: dblp:conf/icic/QiYWCY25