Improving Multi-documents Summarization by Sentence Compression based on Expanded Constituent Parse TreesDownload PDF

2014 (modified: 19 Oct 2020)EMNLP 2014Readers: Everyone
Abstract: In this paper, we focus on the problem of using sentence compression techniques to improve multi-document summarization. We propose an innovative sentence compression method by considering every node in the constituent parse tree and deciding its status – remove or retain. Integer liner programming with discriminative training is used to solve the problem. Under this model, we incorporate various constraints to improve the linguistic quality of the compressed sentences. Then we utilize a pipeline summarization framework where sentences are first compressed by our proposed compression model to obtain top-n candidates and then a sentence selection module is used to generate the final summary. Compared with state-ofthe-art algorithms, our model has similar ROUGE-2 scores but better linguistic quality on TAC data.
0 Replies

Loading