A Methodology for Extrinsic Evaluation of Text Summarization: Does ROUGE Correlate?

2005 (modified: 16 Jul 2019), IEEvaluation@ACL 2005
Abstract: This paper demonstrates the usefulness of summaries in an extrinsic task of relevance judgment based on a new method for measuring agreement, Relevance-Prediction, which compares subjects’ judgments on summaries with their own judgments on full text documents. We demonstrate that, because this measure is more reliable than previous gold-standard measures, we are able to make stronger statistical statements about the benefits of summarization. We found positive correlations between ROUGE scores and two different summary types, where only weak or negative correlations were found using other agreement measures. However, we show that ROUGE may be sensitive to the choice of summarization style. We discuss the importance of these results and the implications for future summarization evaluations.
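As a rough illustration of the idea behind Relevance-Prediction, the sketch below scores one subject by the fraction of documents on which the judgment made from the summary matches that same subject's judgment made from the full text. The abstract does not give the exact formula, so the function name `relevance_prediction`, the boolean encoding of judgments, and the simple match ratio are all assumptions made for illustration, not the paper's definition.

```python
from typing import Sequence


def relevance_prediction(
    summary_judgments: Sequence[bool],
    fulltext_judgments: Sequence[bool],
) -> float:
    """Fraction of documents where a subject's judgment on the summary
    matches the same subject's judgment on the full text.

    Hypothetical helper: the match-ratio form is an assumption, not a
    formula taken from the paper.
    """
    if len(summary_judgments) != len(fulltext_judgments):
        raise ValueError("judgment lists must have the same length")
    if not summary_judgments:
        raise ValueError("at least one judgment pair is required")

    # Count document-level agreements between the two judgment passes.
    matches = sum(
        s == f for s, f in zip(summary_judgments, fulltext_judgments)
    )
    return matches / len(summary_judgments)


# Toy data (invented for illustration): True = judged relevant.
on_summaries = [True, False, True, True]
on_full_text = [True, True, True, False]
score = relevance_prediction(on_summaries, on_full_text)
```

Here the subject's own full-text judgment serves as the reference, in place of an external gold standard, which is the contrast the abstract draws with earlier agreement measures.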