Keywords: Abstractive Summarization, Loss Function, Paraphrases
TL;DR: Loss term based on paraphrases for abstractive summarization
Abstract: Fine-tuned models for conditional generation yield state-of-the-art results for abstractive summarization. They achieve high scores by leveraging large amounts of training data and practically "unlimited" training time, coupled with the simple cross-entropy loss. We argue that, as in the computer vision domain, natural language processing tasks should be solved with more complex, task-specific losses. Such losses are more robust and improve results without requiring more training data, augmentation approaches, or additional training steps. In this work, we propose a new loss term based on paraphrases of the reference summaries, used alongside cross-entropy, to train models for abstractive summarization, improving state-of-the-art results without increasing the number of training steps.
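A minimal sketch of how such a combined objective might look, assuming PyTorch and a single pre-computed paraphrase per reference summary. The abstract does not specify the exact form of the paraphrase-based term, so the weighting factor `lambda_para`, the helper name `summarization_loss`, and the choice of a simple cross-entropy against the paraphrase are illustrative assumptions, not the authors' formulation.

```python
# Illustrative sketch only: combines standard cross-entropy against the
# reference summary with a weighted cross-entropy against a paraphrase of it.
# The exact loss term used in the paper may differ.
import torch
import torch.nn.functional as F


def summarization_loss(logits, reference_ids, paraphrase_ids,
                       lambda_para=0.5, pad_token_id=0):
    """Cross-entropy on the reference plus a paraphrase-based auxiliary term.

    logits:         (batch, seq_len, vocab) decoder outputs
    reference_ids:  (batch, seq_len) gold summary token ids
    paraphrase_ids: (batch, seq_len) token ids of a paraphrased gold summary
                    (assumed here to be padded to the same length)
    """
    vocab = logits.size(-1)
    flat_logits = logits.view(-1, vocab)

    # Standard cross-entropy against the reference summary.
    ce_ref = F.cross_entropy(flat_logits, reference_ids.view(-1),
                             ignore_index=pad_token_id)

    # Auxiliary term: cross-entropy against a paraphrase of the reference,
    # encouraging outputs that match the meaning rather than a single
    # surface form of the gold summary.
    ce_para = F.cross_entropy(flat_logits, paraphrase_ids.view(-1),
                              ignore_index=pad_token_id)

    return ce_ref + lambda_para * ce_para
```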
Submission Number: 53