CECADA: Cause-Effect Conjunctive Adverb-based Data Augmentation Method in Low-Resource Knowledge-Grounded Dialogue
Abstract: A large body of research has investigated how to hold interesting and engaging conversations with users, and one such effort is incorporating knowledge into response generation. Accordingly, the need for knowledge-incorporated dialogue datasets has gained attention. However, pairing a response with knowledge in a context-specific manner is laborious and challenging, so the amount of data collected is often insufficient. In this light, this study proposes a simple but effective data augmentation method that leverages the linguistic features of cause-effect conjunctive adverbs in natural language: we reformulate a plain document containing a cause-effect conjunctive adverb as a knowledge-grounded dialogue data instance. With the proposed data augmentation technique, we observe a marked gain in model generalization in both knowledge selection and knowledge-grounded dialogue generation. In particular, the proposed method proves effective in the low-resource setting, in which dialogue systems generally struggle.
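The reformulation described in the abstract can be illustrated with a minimal sketch. The abstract does not specify which adverbs are used or how the text around them is mapped to dialogue fields, so everything below is an assumption made for illustration: the adverb list, the function name `augment`, and the choice to treat the sentence before the adverb as the knowledge and the sentence after it as the response.

```python
import re
from typing import Dict, List

# Hypothetical adverb list; the abstract does not enumerate the adverbs used.
CAUSE_EFFECT_ADVERBS = (
    "therefore", "thus", "hence", "consequently", "as a result", "accordingly",
)

# Naive sentence splitter: break on whitespace that follows ., ! or ?
_SENTENCE_SPLIT = re.compile(r"(?<=[.!?])\s+")

# Matches a sentence that opens with one of the adverbs, plus an optional comma.
_ADVERB_PREFIX = re.compile(
    r"^(" + "|".join(re.escape(a) for a in CAUSE_EFFECT_ADVERBS) + r")\b,?\s*",
    re.IGNORECASE,
)


def augment(document: str) -> List[Dict[str, str]]:
    """Turn adverb-linked sentence pairs into (knowledge, response) instances.

    Assumed mapping (not taken from the paper): the sentence preceding the
    adverb is the knowledge, and the adverb-led sentence, with the adverb
    stripped, is the response.
    """
    sentences = [s.strip() for s in _SENTENCE_SPLIT.split(document) if s.strip()]
    instances = []
    for i in range(1, len(sentences)):
        match = _ADVERB_PREFIX.match(sentences[i])
        if match is None:
            continue
        effect = sentences[i][match.end():]
        if not effect:
            continue
        instances.append({
            "knowledge": sentences[i - 1],
            "adverb": match.group(1).lower(),
            "response": effect[0].upper() + effect[1:],
        })
    return instances


if __name__ == "__main__":
    doc = "The road was icy. Therefore, the bus arrived late. Nobody complained."
    for instance in augment(doc):
        print(instance)
```

Only sentences that begin with an adverb are considered here; adverbs in mid-sentence position and multi-sentence causes would need a real sentence segmenter and a richer matching rule.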