Abstract: Large Language Models (LLMs) have shown impressive performance across numerous tasks but often produce hallucinated or inaccurate responses, reducing their reliability. Retrieval-Augmented Generation (RAG) mitigates this issue by incorporating external knowledge into the generation process, yet retrieval effectiveness depends heavily on the search queries, so query rewriting techniques are typically adopted to improve retrieval quality. However, current rewriting methods rely on indirect feedback or on costly direct feedback that requires annotated labels, limiting their practicality and effectiveness. We introduce DynQR, an annotation-free query rewriting framework that uses the reader LLM’s uncertainty as direct feedback, effectively bridging the gap between input queries and the knowledge needed from retrieval. DynQR follows a three-stage approach to train a rewriter that reduces uncertainty in the reader’s responses. Additionally, DynQR employs an active rewriting mechanism and a post-verification process to minimize unnecessary rewriting and avoid potential noise. Experiments on five datasets across three QA tasks show that DynQR consistently outperforms existing baselines.
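To make the active rewriting and post-verification mechanism concrete, here is a minimal inference-time sketch. Everything in it is an illustrative assumption rather than the paper's actual implementation: the function names (`dynqr_inference`, `rewrite`, `read`, `retrieve`), the entropy-based uncertainty measure, and the threshold value are all hypothetical.

```python
import math
from typing import Callable, List, Tuple

def mean_token_entropy(token_logprob_dists: List[List[float]]) -> float:
    """Average Shannon entropy of the reader's per-token output
    distributions; one plausible uncertainty proxy (our assumption,
    not necessarily the measure DynQR uses)."""
    entropies = []
    for dist in token_logprob_dists:
        probs = [math.exp(lp) for lp in dist]
        entropies.append(-sum(p * math.log(p + 1e-12) for p in probs))
    return sum(entropies) / max(len(entropies), 1)

def dynqr_inference(
    query: str,
    retrieve: Callable[[str], List[str]],                 # query -> passages
    read: Callable[[str, List[str]], Tuple[str, float]],  # -> (answer, uncertainty)
    rewrite: Callable[[str], str],                        # trained rewriter
    threshold: float = 1.5,                               # hypothetical cutoff
) -> str:
    """Illustrative sketch of active rewriting with post-verification."""
    # Answer the original query and measure the reader's uncertainty.
    answer, u_orig = read(query, retrieve(query))
    # Active rewriting: skip rewriting when the reader is already confident.
    if u_orig <= threshold:
        return answer
    # Otherwise rewrite the query and answer again.
    new_query = rewrite(query)
    new_answer, u_new = read(new_query, retrieve(new_query))
    # Post-verification: keep the rewrite only if it lowered uncertainty;
    # otherwise fall back to the original answer to avoid added noise.
    return new_answer if u_new < u_orig else answer
```

Under these assumptions, the sketch mirrors the abstract's two safeguards: rewriting is triggered only for queries the reader is uncertain about, and a rewrite that fails verification is discarded rather than allowed to introduce noise.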
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: Query Rewriting, Retrieval Augmented Generation
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 413