Mem2Mem: Learning to Summarize Long Texts with Memory Compression and Transfer

Jonathan Pilault; Jaehong Park; Christopher Pal

28 Sept 2020 (modified: 05 May 2023) | ICLR 2021 Conference Withdrawn Submission | Readers: Everyone

Keywords: Natural Language Processing, Summarization, Abstractive Summarization, Memory Compression, Hierarchical models

Abstract: We introduce Mem2Mem, a memory-to-memory mechanism for hierarchical recurrent neural network-based encoder-decoder architectures, and we explore its use for abstractive document summarization. Mem2Mem transfers memories via readable/writable external memory modules that augment both the encoder and decoder. Our memory regularization compresses an encoded input article into a more compact set of sentence representations. Most importantly, the memory compression step performs implicit extraction without labels, sidestepping issues with suboptimal ground-truth data and the exposure bias of hybrid extractive-abstractive summarization techniques. By allowing the decoder to read and write over the encoded input memory, the model learns to read salient information about the input article while keeping track of what has been generated. Our Mem2Mem approach yields results that are competitive with state-of-the-art transformer-based summarization methods, but with 16 times fewer parameters.
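
Illustrative sketch: the abstract describes a decoder that reads from and writes to a memory of encoded sentence representations, but it does not give the equations. The Python below is therefore only a generic content-based memory read with a soft write, not the authors' formulation. The function names (`read_memory`, `write_memory`), the dot-product scoring, and the interpolation-style write rule are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def read_memory(memory, query):
    """Hypothetical read: attention-weighted average of memory slots.

    memory: list of slots, each a list of floats (e.g. sentence vectors).
    query:  list of floats (e.g. a decoder state) of the same dimension.
    """
    scores = [sum(q * v for q, v in zip(query, slot)) for slot in memory]
    weights = softmax(scores)
    dim = len(query)
    read = [sum(w * slot[i] for w, slot in zip(weights, memory))
            for i in range(dim)]
    return read, weights

def write_memory(memory, weights, update):
    """Hypothetical soft write: each slot moves toward `update`
    in proportion to how strongly it was just read."""
    return [[(1.0 - w) * v + w * u for v, u in zip(slot, update)]
            for w, slot in zip(weights, memory)]

# Toy memory of three 2-d "sentence" slots (assumed values).
memory = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [2.0, 0.0]

read, weights = read_memory(memory, query)
new_memory = write_memory(memory, weights, update=[0.0, 0.0])
```

In this toy setup the first slot best matches the query, so it receives the largest read weight and is also the slot most changed by the write. That is one simple way a decoder could keep track of what it has already attended to.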

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

One-sentence Summary: We introduce Mem2Mem, a memory-to-memory mechanism for hierarchical recurrent neural network-based encoder-decoder architectures, and we explore its use for abstractive document summarization.

Reviewed Version (pdf): https://openreview.net/references/pdf?id=06CZIF_4Vt
