TREU: A Trainable Evaluation Metric for Natural Language Rationales

Anonymous

03 Sept 2022 (modified: 05 May 2023) ACL ARR 2022 September Blind Submission
Abstract: Explainable AI (XAI) and Natural Language Processing (NLP) researchers often rely on humans to annotate both labels and natural language rationales (explanations), so that models can use these rationales to improve prediction performance or to generate human-understandable explanations. However, as recent work has shown, human-annotated rationales are subjective and can be of low quality. This raises a vital question: how can we evaluate the quality of human-annotated natural language rationales? In this paper, we propose TREU, a trainable evaluation metric that measures how much natural language rationales help models' prediction performance across a wide range of NLP tasks and model architectures, using a unified data structure. Our evaluation experiments on five popular datasets with two different model architectures demonstrate that TREU evaluates rationale quality coherently and faithfully across datasets, whereas the Simulatability metric fails to do so. The TREU score can also reveal rationale quality with respect to specific classes in a multi-class classification task.
Paper Type: long
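As an illustration of the kind of unified data structure the abstract refers to, the sketch below shows one hypothetical way a rationale-annotated example could be represented so that a single trainable metric can consume inputs, labels, and rationales across different classification tasks. The RationaleExample class, its field names, and the sample values are assumptions made for illustration only and are not taken from the paper.

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class RationaleExample:
        """Hypothetical unified record for one annotated instance of any classification task."""
        text: str                     # task input (e.g., a review, or a premise/hypothesis pair)
        label: str                    # human-annotated gold label
        rationale: str                # human-written natural language rationale
        candidate_labels: List[str]   # the task's label set, so the same record works across datasets

    # Illustrative usage: a sentiment-classification example with its rationale.
    example = RationaleExample(
        text="The plot dragged, but the acting was superb.",
        label="positive",
        rationale="The reviewer praises the acting, which outweighs the complaint about the plot.",
        candidate_labels=["positive", "negative"],
    )

Bundling the label set into each record is one way such a structure could stay task-agnostic, letting a single trainable metric be applied to datasets with different numbers of classes.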