LipKey: A Large-Scale News Dataset with Abstractive Keyphrases and Their Benefits for Summarization

Anonymous

16 Nov 2021 (modified: 05 May 2023), ACL ARR 2021 November Blind Submission, Readers: Everyone
Abstract: Summaries, keyphrases, and titles are different ways of concisely capturing the content of a document. While most previous work has addressed them separately, we use all three jointly for document summarization, both through multi-task training and by supplying them as joint structured inputs. We release LipKey, the largest news corpus with human-written summaries, titles, and keyphrases, and the first large-scale Indonesian keyphrase dataset. We find that including keyphrases and titles as additional context to the source document improves transformer-based summarization models.