R2DQG: A Quality Meets Diversity Framework for Question Generation over Knowledge Bases

Published: 01 Jan 2025, Last Modified: 08 Oct 2025 | IJCAI 2025 | CC BY-SA 4.0
Abstract: The task of Knowledge-Based Question Generation (KBQG) involves generating natural language questions from structured knowledge sources, posing unique challenges in balancing linguistic diversity and semantic relevance. Existing models often focus on maximizing surface-level similarity to ground-truth questions, neglecting the need for diverse syntactic forms and leading to semantic drift during generation. To overcome these challenges, we propose Refine-Reinforced Diverse Question Generation (R2DQG), a two-phase framework leveraging a generation-then-refinement paradigm. The Generator first constructs a diverse set of expressive templates using dependency parse tree similarity, capturing a wide range of syntactic patterns and styles. These templates guide the creation of question drafts, ensuring both diversity and semantic relevance. In the second phase, a Corrector module refines the drafts to mitigate semantic drift and enhance overall coherence and quality. Experiments on public datasets show that R2DQG outperforms state-of-the-art models in generating diverse, contextually accurate questions. Moreover, synthetic datasets generated by R2DQG enhance downstream QA performance, underscoring the practical utility of our approach.
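To make the Generator's first step concrete, the sketch below shows one way to select syntactically diverse templates by comparing dependency parses. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes spaCy's en_core_web_sm parser and substitutes a simple Jaccard similarity over labeled dependency edges for whatever tree-similarity measure R2DQG actually uses; the template strings and the select_diverse_templates helper are hypothetical.

    # Minimal sketch: pick syntactically diverse question templates.
    # Assumptions (not from the paper): spaCy's en_core_web_sm parser, and
    # Jaccard similarity over labeled dependency edges as a cheap stand-in
    # for a full parse-tree similarity measure.
    import spacy

    nlp = spacy.load("en_core_web_sm")

    def dep_edges(text: str) -> set[tuple[str, str]]:
        """Represent a question's dependency parse as a set of labeled edges."""
        doc = nlp(text)
        return {(tok.head.dep_, tok.dep_) for tok in doc if tok.head is not tok}

    def jaccard(a: set, b: set) -> float:
        """Jaccard similarity of two edge sets; identical empty sets count as 1."""
        return len(a & b) / len(a | b) if a | b else 1.0

    def select_diverse_templates(candidates: list[str], k: int) -> list[str]:
        """Greedily pick k templates whose parse trees overlap least."""
        parses = [dep_edges(c) for c in candidates]
        chosen = [0]  # seed with the first candidate
        while len(chosen) < min(k, len(candidates)):
            remaining = [i for i in range(len(candidates)) if i not in chosen]
            # farthest-point step: take the candidate whose closest
            # already-chosen template is as dissimilar as possible
            nxt = min(
                remaining,
                key=lambda i: max(jaccard(parses[i], parses[j]) for j in chosen),
            )
            chosen.append(nxt)
        return [candidates[i] for i in chosen]

    if __name__ == "__main__":
        templates = [
            "What is the capital of <entity>?",
            "Which city serves as the capital of <entity>?",
            "The capital of <entity> is what?",
            "Name the capital city of <entity>.",
        ]
        print(select_diverse_templates(templates, k=2))

Greedy farthest-point selection of this kind favors templates whose parse structures overlap least with those already chosen, which mirrors the abstract's stated goal of covering a wide range of syntactic patterns and styles.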