Abstract: The advent of Large Language Models (LLMs) has made knowledge acquisition through interactive question answering considerably easier for users. However, when confronted with domain-specific knowledge, users may struggle to formulate relevant follow-up questions that build on their initial query.
To address this challenge, we introduce a novel task termed Knowledge-aware Follow-up Query Generation (KQG), which aims to generate a sequence of follow-up queries covering diverse knowledge to aid users in progressive knowledge acquisition from LLMs. To facilitate this task, we first construct a new dataset tailored specifically to KQG, sourced from an online knowledge-sharing community. We then propose a novel end-to-end training framework for KQG, named ReSAG (Retrieval-then-Selecting Augmented Generation), which extends typical Retrieval-Augmented Generation (RAG) methods by incorporating a selecting policy network. Specifically, ReSAG comprises three main components: a knowledge retriever, a selecting policy network, and a T5-based query generator. The selecting policy network connects the query generator to the knowledge retriever, allowing the two components to interact and thereby improving overall system performance. To train the framework end to end, we introduce a novel variant of policy optimization that integrates neural dense retrieval and selecting into a T5-based sequence-to-sequence generation model, using only the ground-truth target output as supervision. Finally, extensive experiments demonstrate that our approach significantly outperforms existing state-of-the-art methods, including generative models, RAG models, and LLM-based methods.
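The retrieve-then-select-then-generate flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the token-overlap retriever stands in for the neural dense retriever, the softmax sampler stands in for the selecting policy network, and the `generator` callable stands in for the T5-based query generator. All function names, the temperature parameter, and the choice to drop an already-selected passage (to encourage diverse knowledge) are assumptions made for illustration.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def retrieve(query: str, corpus: Sequence[str], k: int = 3) -> List[Tuple[str, float]]:
    """Toy retriever: score passages by token overlap with the query.

    Stand-in (assumption) for a neural dense retriever.
    """
    q_tokens = set(query.lower().split())
    scored = [(p, float(len(q_tokens & set(p.lower().split())))) for p in corpus]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


def select(
    candidates: Sequence[Tuple[str, float]],
    rng: random.Random,
    temperature: float = 1.0,
) -> Tuple[str, float]:
    """Toy selecting policy: sample one passage from a softmax over scores.

    Returns the passage and the log-probability of the sampled action,
    which is the quantity a policy-gradient method would typically use.
    """
    logits = [score / temperature for _, score in candidates]
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - max_logit) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    idx = rng.choices(range(len(candidates)), weights=probs, k=1)[0]
    return candidates[idx][0], math.log(probs[idx])


def generate_follow_ups(
    query: str,
    corpus: Sequence[str],
    generator: Callable[[str, str], str],
    steps: int = 2,
    seed: int = 0,
) -> List[str]:
    """Produce a sequence of follow-up queries, one selected passage per step."""
    rng = random.Random(seed)
    remaining = list(corpus)
    follow_ups: List[str] = []
    for _ in range(steps):
        if not remaining:
            break
        candidates = retrieve(query, remaining)
        passage, _log_prob = select(candidates, rng)
        # Assumption: drop the used passage so later queries draw on new knowledge.
        remaining.remove(passage)
        follow_ups.append(generator(query, passage))
    return follow_ups
```

In this sketch, a caller would pass any function mapping `(query, passage)` to a follow-up query string as `generator`; in a trained system that role would be played by a sequence-to-sequence model, and the log-probability returned by `select` would feed the policy-optimization objective.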
Paper Type: Long
Research Area: Information Retrieval and Text Mining
Research Area Keywords: Query generation, Retrieval-augmented generation, Information acquisition
Contribution Types: Data resources
Languages Studied: Chinese
Submission Number: 193