Keywords: retrieval-augmented generation, passage retrieval
Abstract: Retrieval-Augmented Generation (RAG) remains unreliable in specialized domains due to semantic and lexical mismatch between lay queries and professional terminology, and existing generative expansion often introduces redundancy or hallucinations that cause semantic drift. We propose Generative Query Condensation (GQC), a query rewriting strategy that reframes rewriting as semantic condensation rather than expansion. To operationalize GQC, we introduce Query-to-Entity Inference (Q2EI), an entity-centric rewriting method that realizes semantic condensation through explicit inference of the underlying target entity. By moving semantic alignment from retrieval-time vector matching to the rewriting stage, Q2EI produces information-dense query representations. Experimental results on medical and legal benchmarks show that Q2EI consistently outperforms strong baselines across retrievers, improving retrieval effectiveness while substantially reducing rewriting token consumption compared to generative expansion methods. Further analysis confirms that these gains primarily arise from accurate entity inference, and that Q2EI's semantic condensation design limits error amplification when inference is imperfect, leading to more stable and interpretable retrieval behavior.
Paper Type: Long
Research Area: Retrieval-Augmented Language Models
Research Area Keywords: retrieval-augmented generation, passage retrieval
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 4371