- TL;DR: A method to leverage lead bias in large-scale pretraining for abstractive news summarization
- Abstract: Lead bias is a common phenomenon in news summarization: the early parts of an article often contain the most salient information. While many algorithms exploit this fact in summary generation, relying on it hampers a model's ability to discriminate and extract important information from the full article. We propose leveraging lead bias in our favor in a simple and effective way to pretrain abstractive news summarization models on large-scale unlabeled corpora: predicting the leading sentences from the rest of an article. With careful data cleaning and filtering, our transformer-based pretrained model achieves remarkable results on various news summarization tasks without any finetuning. With further finetuning, our model outperforms many competitive baselines. For example, the pretrained model without finetuning outperforms the pointer-generator network on the CNN/DailyMail dataset, and the finetuned model obtains 3.2% higher ROUGE-1, 1.6% higher ROUGE-2, and 2.1% higher ROUGE-L scores than the best baseline model on the XSum dataset.
- Code: https://www.dropbox.com/s/3qbcpfwtzfowzmo/PretrainAbsSum.zip?dl=0
- Keywords: Summarization, Pretraining
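The pretraining objective described in the abstract, predicting the leading sentences from the rest of an article, can be sketched as a data-construction step. This is a hypothetical illustration: the function name `lead_bias_pair`, the choice of three lead sentences, and the lexical-overlap filter are assumptions for demonstration; the paper's exact cleaning and filtering rules are not specified in the abstract.

```python
import re


def lead_bias_pair(article: str, lead_k: int = 3, min_words: int = 5):
    """Split an article into a (source, target) pair for lead-bias pretraining.

    The first `lead_k` sentences become the pseudo-summary target, and the
    remainder of the article becomes the model input. Returns None when the
    article is too short or fails a simple quality filter.
    """
    # Naive sentence splitter on end punctuation; a real pipeline would use
    # a proper sentence tokenizer.
    sents = [s.strip() for s in re.split(r"(?<=[.!?])\s+", article.strip()) if s.strip()]
    if len(sents) <= lead_k:
        return None

    target = " ".join(sents[:lead_k])   # leading sentences = pseudo-summary
    source = " ".join(sents[lead_k:])   # rest of the article = model input

    # Assumed filter: require some lexical overlap between lead and body so the
    # target is predictable from the source, mimicking "data cleaning and
    # filtering" mentioned in the abstract.
    tgt_words = {w.lower() for w in target.split()}
    src_words = {w.lower() for w in source.split()}
    if len(target.split()) < min_words or not (tgt_words & src_words):
        return None

    return source, target
```

Each resulting (source, target) pair would then be fed to a sequence-to-sequence transformer exactly like a supervised (article, summary) example, turning unlabeled news articles into pretraining data.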