A Fine-grained Interpretability Evaluation Benchmark for Pre-trained Language Models

Anonymous

03 Sept 2022 (modified: 05 May 2023) · ACL ARR 2022 September Blind Submission
Abstract: While pre-trained language models (PLMs) have brought great improvements to many NLP tasks, there is growing interest in exploring the capabilities of PLMs and interpreting their predictions. However, existing works usually focus on a single capability of PLMs by testing them on certain downstream tasks. There is a lack of datasets for directly evaluating the masked word prediction performance and the interpretability of PLMs. To fill this gap, we propose a novel evaluation benchmark that provides both English and Chinese annotated data. To comprehensively evaluate the capabilities of PLMs, it provides evaluation data across five dimensions: grammar, semantics, factual knowledge, reasoning, and computation. In addition, it provides carefully annotated token-level rationales for evaluating the interpretability of PLM predictions. We conduct experiments on several widely used PLMs. The results show that they perform very poorly in the knowledge and computation dimensions, and that the rationales they provide to support predictions are less plausible, especially when the rationales are short. We will release this benchmark at \url{http://xyz}, hoping it can facilitate research progress on PLMs.
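To make the evaluation setting concrete, below is a minimal sketch (not taken from the paper) of the kind of masked-word prediction probe such a benchmark scores, assuming a HuggingFace-style fill-mask pipeline; the model name and probe sentence are illustrative assumptions, not the benchmark's actual data.

```python
# Illustrative sketch only: probe a PLM's masked-word prediction,
# as in a factual-knowledge item. Model and sentence are assumptions.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# A hypothetical factual-knowledge probe; the benchmark would compare
# the top predictions against the annotated gold answer.
probe = "The capital of France is [MASK]."
for candidate in fill_mask(probe, top_k=3):
    print(f"{candidate['token_str']!r}: {candidate['score']:.3f}")
```

In the benchmark described above, each such probe would additionally carry token-level rationale annotations, against which the rationales extracted from the model are compared for plausibility.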
Paper Type: long