Generating Wikipedia by Summarizing Long Sequences

15 Feb 2018, 21:29 (modified: 10 Feb 2022, 11:29) · ICLR 2018 Conference Blind Submission · Readers: Everyone
Keywords: abstractive summarization, Transformer, long sequences, natural language processing, sequence transduction, Wikipedia, extractive summarization
TL;DR: We generate Wikipedia articles abstractively conditioned on source document text.
Abstract: We show that generating English Wikipedia articles can be approached as a multi-document summarization of source documents. We use extractive summarization to coarsely identify salient information and a neural abstractive model to generate the article. For the abstractive model, we introduce a decoder-only architecture that can scalably attend to very long sequences, much longer than typical encoder-decoder architectures used in sequence transduction. We show that this model can generate fluent, coherent multi-sentence paragraphs and even whole Wikipedia articles. When given reference documents, we show it can extract relevant factual information as reflected in perplexity, ROUGE scores and human evaluations.
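The abstract's key architectural idea is to drop the encoder entirely: source documents and the target article are concatenated into one long token sequence, and a decoder applies causal self-attention over the whole thing. Below is a minimal NumPy sketch of that causal self-attention step, assuming a single head and toy dimensions; the paper's actual model additionally uses memory-compressed and local attention to scale to very long sequences, which this sketch does not implement. All names here are illustrative, not from the paper's code.

```python
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head causal self-attention over one long sequence.

    In a decoder-only setup, the source documents and the target
    article are concatenated into a single sequence x of shape (n, d),
    and each position may attend only to itself and earlier positions.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d = q.shape[-1]
    scores = (q @ k.T) / np.sqrt(d)                 # (n, n) attention logits
    mask = np.triu(np.ones_like(scores), k=1)       # 1s strictly above the diagonal
    scores = np.where(mask == 1, -1e9, scores)      # block attention to future tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v

# Toy example: 6 tokens, model dimension 4 (hypothetical sizes).
rng = np.random.default_rng(0)
n, d = 6, 4
x = rng.normal(size=(n, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
out = causal_self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (6, 4)
```

Because of the causal mask, the first position attends only to itself, so its output is just its own value projection; later positions mix information from everything before them, which is what lets the model condition article generation on the extracted source text.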
Code: tensorflow/tensor2tensor (GitHub) + 3 community implementations (Papers with Code)
Data: WikiSum, Wikipedia Generation