Explanations for explainability: towards an annotated corpus

ACL ARR 2024 June Submission2344 Authors

15 Jun 2024 (modified: 02 Jul 2024), ACL ARR 2024 June Submission, CC BY 4.0
Abstract: Providing good explanations plays a pivotal role in enhancing human understanding. First, we organize explanations into categories based on a framework inspired by scientific and philosophical discussions on the nature of explanations. We then focus on developing retrieval techniques for single-sentence explanations, aiming to lay the groundwork for an open-source corpus of scientific articles annotated with explanations. A user study was conducted to label 100 sentences according to our classification categories. This collection of annotated examples, balanced with topic-related non-explanatory sentences, was used to refine three large language models (LLMs) via the Cohere API, enabling them to perform (a) semantic search, (b) binary classification and (c) single-label classification. Models (b) and (c) outperformed base Llama 3 8B and performed on par with GPT-4, with model (b) showing balanced results and outperforming GPT-4 by 12% in accuracy.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: corpus creation, NLP datasets, evaluation, reproducibility
Contribution Types: Publicly available software and/or pre-trained models, Data resources, Data analysis
Languages Studied: English
Submission Number: 2344