Toward Preference-Aware Story Evaluation via Ranking, Rating and Reasoning

Anonymous

08 Mar 2022 (modified: 05 May 2023) · NAACL 2022 Conference Blind Submission · Readers: Everyone
Paper Link: https://openreview.net/forum?id=zejCew2cMr0
Paper Type: Long paper (up to eight pages of content + unlimited references and appendices)
Abstract: Existing automatic story evaluation methods place a premium on story coherence, deviating from human preference. We go beyond such restrictions by presenting the more challenging task of \textbf{preference-aware story evaluation}. Given either a machine-generated or a human-written story, the task requires the machine to output a preference score that corresponds to human preference, along with specific ratings and comments on various aspects (e.g., opening, character-shaping). To support this novel task, we introduce a new dataset, \textbf{StoR3}, comprising (i) 100k ranked story pairs and (ii) 46k ratings and comments on various aspects of the stories. To move toward preference-aware evaluation, we propose a model that uses the \textit{upvote count} as the criterion. Experiments show that the scores produced by our model correlate highly with human preference. Additionally, we find that combining aspect ratings and comments further improves performance. Our dataset and benchmarks are publicly available to advance research on story evaluation tasks.