DialSummEval: Revisiting Summarization Evaluation for Dialogues

Anonymous

08 Mar 2022 (modified: 05 May 2023) | NAACL 2022 Conference Blind Submission | Readers: Everyone

Paper Link: https://openreview.net/forum?id=sRmw2pIirH7

Paper Type: Long paper (up to eight pages of content + unlimited references and appendices)

Abstract: Dialogue summarization is receiving increasing attention from researchers due to its extraordinary difficulty and unique application value. We observe that current dialogue summarization models have flaws that may not be well exposed by frequently used metrics such as ROUGE. In this paper, we re-evaluate 18 categories of metrics along four dimensions (coherence, consistency, fluency, and relevance) and, for the first time, conduct a unified human evaluation of various models. We identify some noteworthy trends that differ from those in conventional summarization tasks. We will release DialSummEval, a multi-faceted dataset of human judgments containing the outputs of 14 models on SAMSum.

Dataset: zip

Copyright Consent Signature (type Name Or NA If Not Transferrable): Mingqi Gao

Copyright Consent Name And Address: Peking University, No.5 Yiheyuan Road, Haidian District, Beijing 100871, P.R.China

Presentation Mode: This paper will be presented virtually

Virtual Presentation Timezone: UTC+8
