Gated dynamic convolutions with deep layer fusion for abstractive document summarization

Hong-Seok Kwon, Byung-Hyun Go, Juhong Park, Wonkee Lee, Yewon Jeong, Jong-Hyeok Lee

Comput. Speech Lang. 2021 (modified: 16 Jan 2022)

Abstract: We present a novel abstractive document summarization model based on the recently proposed dynamic convolutional encoder-decoder architectures. We address several aspects of summarization that are not well modeled by the basic architecture by integrating multiple layers of the encoder, controlling information flow in the hierarchy, and exploiting external knowledge. First, we propose a simple and efficient deep layer fusion to extract salient information from the encoder layers. Second, we propose a gating mechanism to control and maintain important contextual information as it passes through the encoder-decoder layers into the dynamic convolutions. Lastly, we incorporate part-of-speech information into the model as external knowledge to better predict filters for the dynamic convolutions. We evaluate our model using ROUGE metrics on three datasets: CNN-DM, NEWSROOM-ABS, and XSUM. Experimental results show that the proposed model outperforms state-of-the-art abstractive models on NEWSROOM-ABS and XSUM and achieves comparable scores on CNN-DM.
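To make the two central ideas of the abstract concrete, the following is a minimal illustrative sketch in plain Python. It is not the authors' implementation. It assumes a single-head dynamic convolution in which each position predicts its own softmax-normalized kernel from its input vector, and a hypothetical scalar sigmoid gate for fusing encoder layers. The names `dynamic_conv`, `fuse_layers`, `W_filter`, and `v` are invented for illustration, and the part-of-speech features mentioned in the abstract are omitted.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def dynamic_conv(X, W_filter):
    """Single-head dynamic convolution (illustrative assumption).

    X is a sequence of T vectors of size d. W_filter has shape k x d and
    maps each position's vector to k kernel logits, so the kernel is
    predicted per time step. The kernel is centered on the position,
    shared across channels, and out-of-range positions are zero padding.
    """
    k = len(W_filter)
    half = k // 2
    T, d = len(X), len(X[0])
    out = []
    for i in range(T):
        weights = softmax(matvec(W_filter, X[i]))  # kernel for position i
        y = [0.0] * d
        for j, w in enumerate(weights):
            pos = i + j - half
            if 0 <= pos < T:
                for c in range(d):
                    y[c] += w * X[pos][c]
        out.append(y)
    return out


def fuse_layers(layers, v):
    """Hypothetical gated fusion of encoder layers.

    layers is a list of L layer outputs, each a sequence of T vectors.
    For every layer and position, a scalar gate sigmoid(v . h) scales
    that layer's vector before the layers are summed.
    """
    T, d = len(layers[0]), len(layers[0][0])
    fused = []
    for t in range(T):
        y = [0.0] * d
        for layer in layers:
            h = layer[t]
            g = sigmoid(sum(vi * hi for vi, hi in zip(v, h)))
            for c in range(d):
                y[c] += g * h[c]
        fused.append(y)
    return fused
```

In this sketch the fused representation would feed the decoder, and the gate vector `v` and filter matrix `W_filter` would be learned parameters; here they are simply passed in as fixed values.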
