Improving Transformer with Sequential Context Representations for Abstractive Text Summarization

Tian Cai, Mengjun Shen, Huailiang Peng, Lei Jiang, Qiong Dai

Published: 2019, Last Modified: 25 Jan 2026, Venue: NLPCC (1) 2019, License: CC BY-SA 4.0

Abstract: Recent dominant approaches to abstractive text summarization are mainly based on RNN encoder-decoder frameworks, which usually suffer from poor semantic representations of long sequences. In this paper, we propose a new abstractive summarization model, called RC-Transformer (RCT). The model is not only capable of learning long-term dependencies, but also addresses the Transformer's inherent insensitivity to word order. We extend the Transformer with an additional RNN-based encoder to capture sequential context representations. To extract salient information effectively, we further construct a convolution module that filters the sequential context by local importance. Experimental results on the Gigaword and DUC-2004 datasets show that the proposed model achieves state-of-the-art performance, even without introducing external information. In addition, our model has a speed advantage over RNN-based models.
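
Illustrative sketch (not the authors' implementation): the abstract does not give the equations for the RNN encoder or the convolution module, so the following toy example shows only one plausible reading of the idea, in plain Python with scalar states. A recurrent pass produces order-sensitive states, and a small 1D convolution followed by a sigmoid produces a gate that scales each state by its local importance. The function names, the scalar weights, and the kernel values are all assumptions made for illustration.

```python
import math


def simple_rnn(inputs, w_in=0.5, w_rec=0.5):
    """Toy scalar Elman RNN: h_t = tanh(w_in * x_t + w_rec * h_{t-1}).

    The weights are arbitrary illustrative constants, not learned values.
    """
    h, states = 0.0, []
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states


def local_gate(states, kernel=(0.25, 0.5, 0.25)):
    """Zero-padded 1D convolution over the states, squashed by a sigmoid.

    Each gate lies strictly between 0 and 1 and depends only on a local
    window of neighbouring states (an assumed form of "local importance").
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(states) + [0.0] * pad
    gates = []
    for i in range(len(states)):
        window = padded[i:i + len(kernel)]
        score = sum(k * v for k, v in zip(kernel, window))
        gates.append(1.0 / (1.0 + math.exp(-score)))
    return gates


def filtered_context(inputs):
    """Scale each recurrent state by its local gate."""
    states = simple_rnn(inputs)
    gates = local_gate(states)
    return [g * s for g, s in zip(gates, states)]
```

Under these assumptions, feeding the same tokens in a different order yields different states, which is the word-order sensitivity that the abstract says a plain Transformer lacks; the gate can only shrink a state, never amplify it.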