Abstract: The task of Knowledge-Based Question Generation (KBQG) involves generating natural language questions from structured knowledge sources, posing unique challenges in balancing linguistic diversity and semantic relevance. Existing models often focus on maximizing surface-level similarity to ground-truth questions, neglecting the need for diverse syntactic forms and leading to semantic drift during generation. To overcome these challenges, we propose Refine-Reinforced Diverse Question Generation (R2DQG), a two-phase framework leveraging a generation-then-refinement paradigm. The Generator first constructs a diverse set of expressive templates using dependency parse tree similarity, capturing a wide range of syntactic patterns and styles. These templates guide the creation of question drafts, ensuring both diversity and semantic relevance. In the second phase, a Corrector module refines the drafts to mitigate semantic drift and enhance overall coherence and quality. Experiments on public datasets show that R2DQG outperforms state-of-the-art models in generating diverse, contextually accurate questions. Moreover, synthetic datasets generated by R2DQG enhance downstream QA performance, underscoring the practical utility of our approach.
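The abstract does not spell out how dependency parse tree similarity is used to build the template pool, so the following is a minimal illustrative sketch under one plausible reading: each candidate template is represented by the set of dependency triples in its parse, and a syntactically diverse subset is chosen greedily by max-min distance. The helper names (`parse_signature`, `select_diverse_templates`), the Jaccard-over-triples distance, and the spaCy pipeline are assumptions for illustration, not the paper's implementation.

```python
# Illustrative sketch only; names and the similarity measure are hypothetical.
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")


def parse_signature(text: str) -> set[tuple[str, str, str]]:
    """Represent a template's dependency parse as a set of
    (head POS, dependency label, child POS) triples."""
    doc = nlp(text)
    return {(tok.head.pos_, tok.dep_, tok.pos_) for tok in doc}


def parse_distance(a: set, b: set) -> float:
    """1 - Jaccard similarity between two parse signatures
    (0 = identical syntactic shape, 1 = fully disjoint)."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def select_diverse_templates(candidates: list[str], k: int) -> list[str]:
    """Greedy max-min selection: repeatedly add the candidate whose
    minimum distance to the already-selected templates is largest."""
    if not candidates:
        return []
    sigs = [parse_signature(c) for c in candidates]
    selected = [0]  # seed with the first candidate
    while len(selected) < min(k, len(candidates)):
        best_idx, best_score = None, -1.0
        for i in range(len(candidates)):
            if i in selected:
                continue
            score = min(parse_distance(sigs[i], sigs[j]) for j in selected)
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
    return [candidates[i] for i in selected]


if __name__ == "__main__":
    pool = [
        "Who directed the film <X>?",
        "Which person was the director of <X>?",
        "The film <X> was directed by whom?",
        "Who is <X>'s director?",
    ]
    print(select_diverse_templates(pool, k=2))
```

A greedy max-min pass is one common way to trade pool size against pairwise syntactic dissimilarity; the actual R2DQG Generator may rely on a different tree similarity (e.g., tree edit distance or a tree kernel) and a different selection strategy.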
External IDs: dblp:conf/ijcai/0001YLS0Y25