Multi-source inverse-curriculum-based training for low-resource dialogue generation

Published: 01 Jan 2023, Last Modified: 27 Jun 2023. Appl. Intell. 2023.
Abstract: An effective dialogue system requires a large amount of training data, but existing training data are often insufficient. Although pre-trained models have made great progress in recent years and can alleviate the low-resource dialogue problem to some extent, they are large and difficult to deploy. How to improve the performance of a dialogue model without additional annotated data and without enlarging the model has become a new challenge. We propose a multi-source data augmentation method for low-resource dialogue generation that utilizes inverse curriculum learning (inverse CL). First, we adopt three data augmentation methods, round-trip translation, paraphrasing, and generation with a pre-trained model, to produce augmented data. Next, we propose a new training strategy based on inverse CL that exploits the different sources of augmented data. Our method outperforms the baselines on all evaluation metrics, which shows the effectiveness of the proposed training strategy for dialogue generation. To the best of our knowledge, this is the first systematic investigation of data augmentation for dialogue generation.
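
For concreteness, here is a minimal sketch of the round-trip-translation augmentation source named in the abstract, using the public Helsinki-NLP MarianMT checkpoints from Hugging Face. The pivot language (German), the decoding settings, and the helper names are illustrative assumptions, not the paper's actual setup.

```python
# Round-trip (back-)translation: English -> pivot -> English produces a noisy
# paraphrase of each dialogue turn. Pivot language and decoding settings are
# assumptions for illustration; the paper's configuration may differ.
from transformers import MarianMTModel, MarianTokenizer

def load_mt(name: str):
    return MarianTokenizer.from_pretrained(name), MarianMTModel.from_pretrained(name)

en_de_tok, en_de = load_mt("Helsinki-NLP/opus-mt-en-de")
de_en_tok, de_en = load_mt("Helsinki-NLP/opus-mt-de-en")

def translate(text: str, tok, model) -> str:
    batch = tok([text], return_tensors="pt", padding=True)
    ids = model.generate(**batch, max_length=128)
    return tok.batch_decode(ids, skip_special_tokens=True)[0]

def round_trip(utterance: str) -> str:
    """English -> German -> English: yields a paraphrased training turn."""
    return translate(translate(utterance, en_de_tok, en_de), de_en_tok, de_en)

print(round_trip("How was your day at work?"))
```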
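And a self-contained sketch of how an inverse-CL schedule might sequence the sources, under one plausible reading of the abstract: the model sees the noisier augmented data first and finishes on the original clean dialogues. The Stage layout, run_stage stub, and epoch counts are hypothetical, not the authors' released code.

```python
# Inverse-curriculum schedule over multiple augmentation sources (assumed
# ordering: augmented data first, original corpus last).
from dataclasses import dataclass

@dataclass
class Stage:
    source: str   # provenance of the (context, response) pairs
    pairs: list   # list of (context, response) tuples
    epochs: int   # passes over this stage before moving on

def run_stage(model, stage: Stage) -> None:
    # Stand-in for an ordinary seq2seq fine-tuning loop over stage.pairs.
    print(f"train on {stage.source}: {len(stage.pairs)} pairs x {stage.epochs} epochs")

def inverse_curriculum(original, round_trip, paraphrase, lm_generated):
    # Augmented data first, original clean data last (the "inverse" ordering).
    return [
        Stage("pre-trained-model generations", lm_generated, epochs=1),
        Stage("round-trip translations", round_trip, epochs=1),
        Stage("paraphrases", paraphrase, epochs=1),
        Stage("original dialogues", original, epochs=2),
    ]

model = None  # placeholder for the dialogue generation model
for stage in inverse_curriculum([("hi", "hello there")], [], [], []):
    run_stage(model, stage)
```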