Generating Wikipedia by Summarizing Long Sequences

15 Feb 2018, 21:29 (modified: 10 Feb 2022, 11:29) · ICLR 2018 Conference Blind Submission · Readers: Everyone
Keywords: abstractive summarization, Transformer, long sequences, natural language processing, sequence transduction, Wikipedia, extractive summarization
TL;DR: We generate Wikipedia articles abstractively conditioned on source document text.
Abstract: We show that generating English Wikipedia articles can be approached as a multi-document summarization of source documents. We use extractive summarization to coarsely identify salient information and a neural abstractive model to generate the article. For the abstractive model, we introduce a decoder-only architecture that can scalably attend to very long sequences, much longer than typical encoder-decoder architectures used in sequence transduction. We show that this model can generate fluent, coherent multi-sentence paragraphs and even whole Wikipedia articles. When given reference documents, we show it can extract relevant factual information as reflected in perplexity, ROUGE scores and human evaluations.
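The abstract's key architectural idea is to drop the encoder entirely: source documents and the target article are concatenated into one long token sequence, and a decoder applies causal self-attention over the whole thing. Below is a minimal NumPy sketch of that causal self-attention step, assuming a single head and toy dimensions; the paper's actual model additionally uses memory-compressed and local attention to scale to very long sequences, which this sketch does not implement. All names here are illustrative, not from the paper's code.

```python
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head causal self-attention over one long sequence.

    In a decoder-only setup, the source documents and the target
    article are concatenated into a single sequence x of shape (n, d),
    and each position may attend only to itself and earlier positions.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d = q.shape[-1]
    scores = (q @ k.T) / np.sqrt(d)                 # (n, n) attention logits
    mask = np.triu(np.ones_like(scores), k=1)       # 1s strictly above the diagonal
    scores = np.where(mask == 1, -1e9, scores)      # block attention to future tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v

# Toy example: 6 tokens, model dimension 4 (hypothetical sizes).
rng = np.random.default_rng(0)
n, d = 6, 4
x = rng.normal(size=(n, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
out = causal_self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (6, 4)
```

Because of the causal mask, the first position attends only to itself, so its output is just its own value projection; later positions mix information from everything before them, which is what lets the model condition article generation on the extracted source text.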
Code: tensorflow/tensor2tensor (GitHub) + 3 community implementations (Papers with Code)
Data: WikiSum, Wikipedia Generation