- Keywords: NLP, Evaluation Metrics, Summarization, Translation, BLEU, ROUGE, Transformers
- TL;DR: Introduces JAUNE: a methodology to replace BLEU and ROUGE scores with multidimensional, model-based evaluators for assessing summaries.
- Abstract: We review the limitations of BLEU and ROUGE, the most popular metrics used to assess hypothesis summaries against reference summaries, and introduce JAUNE: a set of criteria for how a good metric should behave. We also propose concrete ways to use recent Transformer-based language models to assess hypothesis summaries against reference summaries.