JAUNE: Justified And Unified Neural language Evaluation

25 Sep 2019 (modified: 24 Dec 2019) | ICLR 2020 Conference Blind Submission | Readers: Everyone
  • Keywords: NLP, Evaluation Metrics, Summarization, Translation, BLEU, ROUGE, Transformers
  • TL;DR: Introduces JAUNE, a methodology for replacing BLEU and ROUGE scores with multidimensional, model-based evaluators for assessing summaries
  • Abstract: We review the limitations of BLEU and ROUGE, the most popular metrics for assessing hypothesis summaries against reference summaries, and introduce JAUNE: a set of criteria for how a good metric should behave. We also propose concrete ways to use recent Transformer-based language models to assess hypothesis summaries against reference summaries.
