Abstract: Text summarization is a classical task in natural language generation that aims to produce a concise summary of an original article. Neural networks based on the Encoder-Decoder architecture have made great progress in recent years in generating abstractive summaries with high fluency. However, due to the randomness of the abstractive model during generation, its summaries risk missing important information from the article. To address this challenge, this paper proposes a jointly trained text summarization model that combines abstractive and extractive summarization. On the one hand, extractive models achieve higher ROUGE scores but poorer readability; on the other hand, abstractive models can produce more fluent summaries but suffer from omitting important information in the original text. Therefore, we share the encoder of both models and train them jointly to obtain a text representation that benefits from regularisation. We also feed document-level information obtained from the extractive model into the decoder of the abstractive model to improve the abstractive summary. Experiments on the CNN/Daily Mail, PubMed, and arXiv datasets demonstrate the effectiveness of the proposed model.
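The following is a minimal PyTorch sketch of the idea described above, not the paper's actual implementation: a single encoder is shared by an extractive scoring head and an abstractive decoder, a document-level vector derived from the extractive scores is passed to the decoder, and the two objectives are combined in one joint loss. All names (JointSummarizer, joint_loss, alpha) and architectural details (Transformer layers, salience-weighted pooling) are illustrative assumptions.

```python
import torch
import torch.nn as nn


class JointSummarizer(nn.Module):
    """Shared-encoder model trained jointly on extractive and abstractive objectives."""

    def __init__(self, vocab_size, d_model=512, nhead=8, num_layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # One encoder shared by both the extractive and abstractive branches.
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers)
        # Extractive head: a salience score per encoder position.
        self.extract_head = nn.Linear(d_model, 1)
        # Abstractive decoder; a document-level vector from the extractive branch
        # is prepended to its memory so generation is conditioned on it.
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers)
        self.generator = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, tgt_ids):
        # Shared text representation (attention masks omitted for brevity).
        memory = self.encoder(self.embed(src_ids))                 # (B, S, d)
        extract_logits = self.extract_head(memory).squeeze(-1)     # (B, S)
        # Document-level vector: salience-weighted pooling of encoder states.
        weights = torch.softmax(extract_logits, dim=-1).unsqueeze(-1)
        doc_vec = (weights * memory).sum(dim=1, keepdim=True)      # (B, 1, d)
        dec_out = self.decoder(self.embed(tgt_ids),
                               torch.cat([doc_vec, memory], dim=1))
        return extract_logits, self.generator(dec_out)             # (B, T, V)


def joint_loss(extract_logits, gen_logits, extract_labels, tgt_ids, alpha=0.5):
    """Joint objective: extractive binary labels + abstractive token-level NLL."""
    ext = nn.functional.binary_cross_entropy_with_logits(extract_logits, extract_labels)
    abs_nll = nn.functional.cross_entropy(gen_logits.transpose(1, 2), tgt_ids)
    return alpha * ext + (1 - alpha) * abs_nll
```

Because both losses back-propagate through the same encoder, each objective regularises the representation used by the other, which is the intended benefit of the shared encoder.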