Cross-lingual Transfer for Automatic Question Generation by Learning Interrogative Structures in Target Languages

ACL ARR 2024 April Submission687 Authors

16 Apr 2024 (modified: 02 May 2024) · ACL ARR 2024 April Submission · CC BY 4.0
Abstract: Automatic question generation (QG) is used for various purposes, such as building question answering (QA) corpora, creating educational materials, and developing chatbots. Despite its significance, however, most existing datasets focus on English, leaving a notable gap in data availability for other languages. Cross-lingual transfer for QG (XLT-QG) addresses this limitation by enabling models trained on source-language data to be applied to other languages. In this paper, we introduce a straightforward and efficient XLT-QG approach that enables the QG model to learn interrogative structures in the target language during inference. Our model is trained, using only English QA data, to leverage the interrogative patterns found in given question exemplars when generating questions. Experimental results demonstrate that the proposed method surpasses various XLT-QG baselines and achieves performance comparable to GPT-3.5-turbo. Moreover, the synthetic data generated by our models proves beneficial for training multilingual QA models. With significantly fewer parameters than large language models and no need for additional training on new languages, our method offers an effective solution for performing QG and QA tasks across diverse languages.
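A minimal sketch of the exemplar-conditioned inference described in the abstract, assuming a multilingual encoder-decoder (e.g., an mT5-style model) fine-tuned only on English QA data; the checkpoint name, prompt format, and exemplars below are illustrative assumptions, not the authors' exact setup:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical checkpoint: a multilingual seq2seq QG model fine-tuned only on English QA data.
MODEL_NAME = "google/mt5-base"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def generate_question(context: str, answer: str, exemplars: list[str]) -> str:
    """Generate a target-language question, conditioning on question exemplars
    that expose the target language's interrogative structures (assumed prompt format)."""
    prompt = (
        "exemplars: " + " </s> ".join(exemplars)
        + " answer: " + answer
        + " context: " + context
    )
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Usage: Korean exemplars supply the interrogative form at inference time,
# even though the model was trained only on English QA pairs.
korean_exemplars = ["가장 큰 행성은 무엇입니까?", "그 조약은 언제 체결되었습니까?"]
print(generate_question(context="...", answer="목성", exemplars=korean_exemplars))
```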
Paper Type: Long
Research Area: Multilingualism and Cross-Lingual NLP
Research Area Keywords: cross-lingual transfer, question generation
Contribution Types: Approaches to low-resource settings, Approaches to low compute settings (efficiency)
Languages Studied: English, Bengali, Finnish, Indonesian, Korean, Swahili, Telugu, German, Hindi, Chinese
Submission Number: 687