Cross-lingual Transfer for Automatic Question Generation by Learning Interrogative Structures in Target Languages

ACL ARR 2024 April Submission687 Authors

16 Apr 2024 (modified: 02 May 2024) · ACL ARR 2024 April Submission · CC BY 4.0
Abstract: Automatic question generation (QG) is used for various purposes, such as building question answering (QA) corpora, creating educational materials, and developing chatbots. Despite its significance, however, most existing datasets focus on English, leaving a notable gap in data availability for other languages. Cross-lingual transfer for QG (XLT-QG) addresses this limitation by enabling models trained on source-language data to be applied to other languages. In this paper, we introduce a straightforward and efficient XLT-QG approach that enables the QG model to learn interrogative structures in the target language during inference. Our model is trained, using only English QA data, to leverage the interrogative patterns found in given question exemplars when generating questions. Experimental results demonstrate that the proposed method surpasses various XLT-QG baselines and achieves performance comparable to GPT-3.5-turbo. Moreover, the synthetic data generated by our models proves beneficial for training multilingual QA models. With significantly fewer parameters than large language models and no need for additional training on new languages, our method offers an effective solution for performing QG and QA tasks across diverse languages.
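A minimal sketch of the exemplar-conditioned inference described in the abstract, assuming a multilingual encoder-decoder (e.g., an mT5-style model) fine-tuned only on English QA data; the checkpoint name, prompt format, and exemplars below are illustrative assumptions, not the authors' exact setup:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical checkpoint: a multilingual seq2seq QG model fine-tuned only on English QA data.
MODEL_NAME = "google/mt5-base"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def generate_question(context: str, answer: str, exemplars: list[str]) -> str:
    """Generate a target-language question, conditioning on question exemplars
    that expose the target language's interrogative structures (assumed prompt format)."""
    prompt = (
        "exemplars: " + " </s> ".join(exemplars)
        + " answer: " + answer
        + " context: " + context
    )
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Usage: Korean exemplars supply the interrogative form at inference time,
# even though the model was trained only on English QA pairs.
korean_exemplars = ["가장 큰 행성은 무엇입니까?", "그 조약은 언제 체결되었습니까?"]
print(generate_question(context="...", answer="목성", exemplars=korean_exemplars))
```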
Paper Type: Long
Research Area: Multilingualism and Cross-Lingual NLP
Research Area Keywords: cross-lingual transfer, question generation
Contribution Types: Approaches to low-resource settings, Approaches to low compute settings (efficiency)
Languages Studied: English, Bengali, Finnish, Indonesian, Korean, Swahili, Telugu, German, Hindi, Chinese
Submission Number: 687