LipKey: A Large-Scale News Dataset with Abstractive Keyphrases and Their Benefits for Summarization

Anonymous

16 Nov 2021 (modified: 05 May 2023) | ACL ARR 2021 November Blind Submission | Readers: Everyone
Abstract: Summaries, keyphrases, and titles are different ways of concisely capturing the content of a document. While most previous work has addressed them separately, we use all three jointly for document summarization, both through multi-task training and by supplying them as joint structured inputs. We release LipKey, the largest news corpus with human-written summaries, titles, and keyphrases, and the first large-scale Indonesian keyphrase dataset. We find that including keyphrases and titles as additional context to the source document improves transformer-based summarization models.
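
As a rough illustration of what "joint structured inputs" could look like, the sketch below prepends a title and keyphrases to the source document before it is passed to a summarization model. The abstract does not specify the input format, so the function name `build_structured_input`, the tag strings, and the `;` separator are all hypothetical choices made for this example, not the authors' scheme.

```python
from typing import Optional, Sequence


def build_structured_input(
    document: str,
    title: Optional[str] = None,
    keyphrases: Optional[Sequence[str]] = None,
    title_tag: str = "[TITLE]",          # hypothetical tag, not from the paper
    key_tag: str = "[KEYPHRASES]",       # hypothetical tag, not from the paper
    doc_tag: str = "[DOCUMENT]",         # hypothetical tag, not from the paper
) -> str:
    """Concatenate title, keyphrases, and document into one tagged string."""
    parts = []
    if title and title.strip():
        parts.append(f"{title_tag} {title.strip()}")
    if keyphrases:
        # Drop empty entries and surrounding whitespace before joining.
        cleaned = [k.strip() for k in keyphrases if k.strip()]
        if cleaned:
            parts.append(f"{key_tag} " + " ; ".join(cleaned))
    # The source document always comes last, after any extra context.
    parts.append(f"{doc_tag} {document.strip()}")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_structured_input(
        "Heavy rain flooded parts of the city on Monday.",
        title="Floods hit the capital",
        keyphrases=["flood", "heavy rain"],
    ))
```

The resulting string would then be tokenized and fed to the encoder in place of the bare document. Whether the extra context is marked with plain-text tags or dedicated special tokens is a design choice this sketch leaves open.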