Learning Low-frequency Patterns with a Pre-trained Document-Grounded Conversation Model

Anonymous

17 Sept 2021 (modified: 05 May 2023), ACL ARR 2021 September Blind Submission
Abstract: Owing to its capability of recognizing high-frequency patterns appearing in large corpora, the Generative Pre-trained Transformer model (GPT-2) has demonstrated remarkable performance in document-grounded dialogue generation. Capturing low-frequency patterns, however, remains challenging. Here we consider an extension of the GPT-2 model that improves its ability to grasp low-frequency patterns, especially in task-specific dialogues. The extension consists of a semantic-oriented encoder and a GPT-2 decoder, the latter equipped with a knowledge-aware classifier. The proposed encoder-decoder framework strengthens GPT-2 in two task-specific respects: first, selecting, at the semantic level, the crucial information from the dialogue context and the corresponding historical knowledge in the documents; second, determining the appropriate time to generate a response using knowledge from the documents. With this enhanced capability to learn not only high-frequency but also low-frequency patterns, the proposed extension is shown to outperform state-of-the-art generative models.
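The abstract does not specify the architecture in detail, but the components it names (a semantic-oriented encoder, a GPT-2 decoder, and a knowledge-aware classifier that decides when to ground a response in document knowledge) suggest a shape like the following minimal sketch. It assumes a BERT-style encoder, HuggingFace's cross-attention option for GPT-2, and a binary "use knowledge now?" gate; the class name `KnowledgeAwareDGC`, the gate head, and the joint objective are illustrative assumptions, not the paper's actual implementation.

```python
import torch.nn as nn
from transformers import (BertModel, BertTokenizer,
                          GPT2Config, GPT2LMHeadModel, GPT2Tokenizer)

class KnowledgeAwareDGC(nn.Module):
    """Hypothetical sketch of the described encoder-decoder extension.

    A semantic encoder reads the dialogue context together with candidate
    document knowledge; a GPT-2 decoder with cross-attention generates the
    response; a small classification head decides whether the current turn
    should be grounded in document knowledge at all.
    """

    def __init__(self):
        super().__init__()
        # Semantic-oriented encoder (BERT is an assumption here; the
        # abstract does not name the encoder).
        self.encoder = BertModel.from_pretrained("bert-base-uncased")
        # GPT-2 decoder; enabling cross-attention lets it attend to the
        # encoder states (both models use 768-d hidden states).
        dec_cfg = GPT2Config.from_pretrained("gpt2", add_cross_attention=True)
        self.decoder = GPT2LMHeadModel.from_pretrained("gpt2", config=dec_cfg)
        # Knowledge-aware classifier: "ground this turn in the documents?"
        self.knowledge_gate = nn.Linear(self.encoder.config.hidden_size, 2)

    def forward(self, enc_ids, enc_mask, dec_ids):
        enc_out = self.encoder(input_ids=enc_ids, attention_mask=enc_mask)
        # The [CLS] vector summarizes context + candidate knowledge.
        gate_logits = self.knowledge_gate(enc_out.last_hidden_state[:, 0])
        lm_out = self.decoder(input_ids=dec_ids,
                              encoder_hidden_states=enc_out.last_hidden_state,
                              encoder_attention_mask=enc_mask,
                              labels=dec_ids)
        # A joint objective (LM loss + gate loss) is one plausible training
        # setup; gate labels would come from knowledge annotations.
        return gate_logits, lm_out.loss

if __name__ == "__main__":
    bert_tok = BertTokenizer.from_pretrained("bert-base-uncased")
    gpt2_tok = GPT2Tokenizer.from_pretrained("gpt2")
    enc = bert_tok("user: any good sci-fi films? [SEP] doc: Blade Runner ...",
                   return_tensors="pt")
    dec = gpt2_tok("You might enjoy Blade Runner.", return_tensors="pt")
    model = KnowledgeAwareDGC()
    gate_logits, lm_loss = model(enc.input_ids, enc.attention_mask,
                                 dec.input_ids)
    print(gate_logits.shape, lm_loss.item())
```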