- Abstract: We present a new approach to defining a sequence-level loss function for training a summarizer: a secondary encoder-decoder serves as the loss function, alleviating a shortcoming of word-level training for sequence outputs. The technique is based on the intuition that a good summary contains the most essential information from the original article, and therefore should itself be a good input sequence, in lieu of the original, from which a summary can be generated. We present experimental results applying this additional loss function to a general abstractive summarizer on a news summarization dataset. The result is an improvement in the ROUGE metric and an especially large improvement in human evaluations, suggesting performance competitive with specialized state-of-the-art models.
- Code: https://github.com/iclr2020recoder/code_for_paper
- Keywords: encoder-decoder, summarization, loss functions
- TL;DR: We present the use of a secondary encoder-decoder as a loss function to help train a summarizer.
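The core idea in the abstract can be sketched as follows: the primary summarizer is trained with its usual word-level loss, and a secondary encoder-decoder (the "recoder") re-summarizes the generated summary; its loss against the same reference is added as an auxiliary term. This is a minimal, hypothetical illustration with the model internals stubbed out as precomputed token probabilities; the function names and the weighting scheme (`lam`) are assumptions, not the paper's exact formulation.

```python
import numpy as np

def cross_entropy(pred_probs, target_ids):
    """Mean negative log-likelihood of the target tokens under the
    model's per-step probability distributions."""
    return float(-np.mean(np.log([p[t] for p, t in zip(pred_probs, target_ids)])))

def combined_loss(primary_probs, recoder_probs, reference_ids, lam=0.5):
    """Word-level loss on the summarizer's output, plus a weighted loss
    on the recoder's attempt to summarize the generated summary.
    Both terms are scored against the same reference summary."""
    primary = cross_entropy(primary_probs, reference_ids)
    secondary = cross_entropy(recoder_probs, reference_ids)
    return primary + lam * secondary

# Toy example: vocabulary of 3 tokens, reference summary = tokens [0, 2].
reference = [0, 2]
# Per-step output distributions from the (stubbed) summarizer and recoder.
primary_probs = np.array([[0.7, 0.2, 0.1],
                          [0.1, 0.2, 0.7]])
recoder_probs = np.array([[0.5, 0.3, 0.2],
                          [0.2, 0.3, 0.5]])

loss = combined_loss(primary_probs, recoder_probs, reference)
```

With `lam=0` this reduces to ordinary word-level training; raising `lam` increasingly rewards summaries that remain good summarization inputs in their own right.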