Learning to Prioritize: Precision-Driven Sentence Filtering for Long Text Summarization

Anonymous

16 Oct 2021 (modified: 05 May 2023), ACL ARR 2021 October Blind Submission, Readers: Everyone

Abstract: Neural text summarization has shown great potential in recent years. However, current state-of-the-art summarization models are limited by their maximum input length, which makes it difficult to summarize longer texts comprehensively. As part of a layered summarization architecture, we introduce PureText, a simple yet effective precision-driven sentence filtering layer that learns to remove low-quality sentences from texts before they reach an existing summarization model. Evaluated on the popular WikiHow and Reddit TIFU datasets, PureText yields absolute ROUGE-1 improvements of up to 3 points on the full test set and up to 8 points on the long-article subset for state-of-the-art summarization models such as BertSum and BART. Our approach provides downstream models with higher-quality sentences for summarization, improving overall model performance, especially on long articles.
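The abstract does not specify how the filtering layer scores sentences, so the following is only a minimal sketch of the general idea of sentence filtering ahead of a summarizer, not the authors' PureText implementation. The names `split_sentences`, `filter_sentences`, and `toy_score` are hypothetical, and `toy_score` is a stand-in heuristic where a learned classifier would presumably be used.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def filter_sentences(
    sentences: List[str],
    score: Callable[[str], float],
    threshold: float = 0.5,
) -> List[str]:
    """Keep sentences whose score is at least `threshold`, in original order.

    Raising the threshold favors precision over recall: fewer sentences
    pass, but those that do are more likely to be worth summarizing.
    """
    return [s for s in sentences if score(s) >= threshold]


def toy_score(sentence: str) -> float:
    """Hypothetical stand-in for a learned scorer (assumption, not from the paper).

    Returns the fraction of whitespace-separated tokens longer than 4 characters.
    """
    words = sentence.split()
    if not words:
        return 0.0
    return sum(len(w) > 4 for w in words) / len(words)


if __name__ == "__main__":
    article = "Um ok so yeah. Preheat the oven before mixing flour."
    kept = filter_sentences(split_sentences(article), toy_score, threshold=0.5)
    # The shortened text would then be passed to a downstream summarizer.
    print(" ".join(kept))
```

In a layered setup like the one described, the filtered text, rather than the full article, would be handed to the downstream model, which is one way a long input could be brought within a fixed input-length limit.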
