Abstract: This paper presents a systematic and large-scale study of the Vietnamese question generation task. Unlike prior work, which investigates the task on only a small number (1–2) of datasets, this study reports the performance of question generation models on a wide range of Vietnamese machine reading comprehension corpora across different settings and scenarios. To this end, several methods are implemented, ranging from traditional neural networks to pre-trained language models. Experimental results on five benchmark datasets show that pre-trained language models, i.e., BARTPho and ViT5, obtain promising results in both automatic and human evaluation. This work advances Vietnamese question generation by establishing benchmark results for future studies. The code used for the experiments is publicly accessible.1

1https://github.com/Shaun-le/ViQG.git