Encoder-decoder Network as Loss Function for Summarization

Sep 25, 2019 Blind Submission
  • Abstract: We present a new approach to defining a sequence loss function for training a summarizer: a secondary encoder-decoder serves as the loss function, alleviating a shortcoming of word-level training for sequence outputs. The technique rests on the intuition that a good summary contains the most essential information from the original article and should therefore itself be a good input sequence, in lieu of the original, from which a summary can be generated. We report experiments in which this additional loss function is applied to a general abstractive summarizer on a news summarization dataset. The result is an improvement in the ROUGE metric and an especially large improvement in human evaluations, suggesting performance that is competitive with specialized state-of-the-art models.
  • Code: https://github.com/iclr2020recoder/code_for_paper
  • Keywords: encoder-decoder, summarization, loss functions
  • TL;DR: We present the use of a secondary encoder-decoder as a loss function to help train a summarizer.
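
The intuition in the abstract can be illustrated with a small sketch. It is not the authors' implementation (see the linked code for that); it only shows one way an additional loss from a secondary model might be combined with a word-level loss. Everything here is an assumption made for illustration: the names `word_level_nll`, `toy_recoder_nll`, and `combined_loss`, the `weight` parameter, the choice of the reference summary as the secondary model's target, and above all the toy scoring rule, which stands in for a trained encoder-decoder by rewarding reference tokens that can be recovered from the candidate summary.

```python
import math
from typing import Callable, Sequence

Tokens = Sequence[str]


def word_level_nll(ref_token_probs: Sequence[float]) -> float:
    """Mean negative log-likelihood the summarizer assigns to each reference token."""
    return -sum(math.log(p) for p in ref_token_probs) / len(ref_token_probs)


def toy_recoder_nll(summary: Tokens, reference: Tokens, miss_prob: float = 0.1) -> float:
    """Hypothetical stand-in for a secondary encoder-decoder.

    It reads the candidate summary as its input and scores how well the
    reference could be regenerated from it: a reference token found in the
    summary gets probability 1 - miss_prob, any other token gets miss_prob.
    A real secondary model would be a trained network, not this lookup.
    """
    seen = set(summary)
    probs = [(1.0 - miss_prob) if tok in seen else miss_prob for tok in reference]
    return -sum(math.log(p) for p in probs) / len(probs)


def combined_loss(
    ref_token_probs: Sequence[float],
    summary: Tokens,
    reference: Tokens,
    weight: float = 1.0,  # assumed mixing coefficient, not taken from the paper
    recoder: Callable[[Tokens, Tokens], float] = toy_recoder_nll,
) -> float:
    """Word-level loss plus a weighted sequence-level loss from the secondary model."""
    return word_level_nll(ref_token_probs) + weight * recoder(summary, reference)


reference = "storm closes major airport".split()
faithful = "storm closes airport".split()
off_topic = "weather was mentioned today".split()
probs = [0.5, 0.5, 0.5, 0.5]  # identical word-level scores for both candidates

loss_faithful = combined_loss(probs, faithful, reference)
loss_off_topic = combined_loss(probs, off_topic, reference)
print(loss_faithful < loss_off_topic)
```

With the word-level term held equal, the candidate that preserves more of the essential content receives the lower combined loss. In an actual training setup the secondary model would need to be differentiable with respect to the summarizer's output so that its loss can contribute gradients; this sketch does not attempt that.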