Explanations explained. Influence of free-text explanations on LLMs and the role of implicit knowledge.

ACL ARR 2024 June Submission3293 Authors

15 Jun 2024 (modified: 02 Jul 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: Despite their remarkable performance, LLMs still struggle to provide transparent and faithful explanations for their predictions. We investigate the influence of different types of natural language explanations on LLM predictions, focusing on four datasets whose tasks require leveraging implicit knowledge. We conduct experiments with three SOTA LLMs on eight types of explanations, either human-written or machine-generated, the latter produced through three generation methods: label-agnostic, label-aware, and counterfactual (label-contradicting) explanation generation. Our results consistently demonstrate that providing explanations significantly improves the accuracy of LLM predictions, even when the models are not explicitly trained to generate explanations. We also propose a method to study the relationship between implicitness and explanation effectiveness.
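As a rough illustration of how the three generation setups named in the abstract could be instantiated, here is a minimal Python sketch assuming an NLI-style task; the prompt wording, label set, and helper names are illustrative assumptions, not the authors' actual templates.

```python
from typing import Optional

LABELS = ("entailment", "contradiction", "neutral")  # assumed label set for illustration


def build_explanation_prompt(premise: str, hypothesis: str, mode: str,
                             gold_label: Optional[str] = None) -> str:
    """Build a prompt for one of the three hypothetical generation modes."""
    base = f"Premise: {premise}\nHypothesis: {hypothesis}\n"
    if mode == "label_agnostic":
        # No label is revealed; the model explains the relation freely.
        return base + "Explain the relationship between the premise and the hypothesis."
    if mode == "label_aware":
        # The gold label is revealed and the model is asked to justify it.
        return base + f"The correct label is '{gold_label}'. Explain why this label holds."
    if mode == "counterfactual":
        # A label contradicting the gold one is revealed (label-contradicting explanation).
        wrong = next(label for label in LABELS if label != gold_label)
        return base + f"The correct label is '{wrong}'. Explain why this label holds."
    raise ValueError(f"unknown mode: {mode}")


def build_prediction_prompt(premise: str, hypothesis: str,
                            explanation: Optional[str] = None) -> str:
    """Prediction prompt, optionally augmented with a free-text explanation."""
    prompt = f"Premise: {premise}\nHypothesis: {hypothesis}\n"
    if explanation:
        prompt += f"Explanation: {explanation}\n"
    return prompt + "Label (entailment/contradiction/neutral):"
```

In such a setup, the explanation returned by the generation prompt would be fed into the prediction prompt, and accuracy with and without explanations could be compared across the three modes.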
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: explanations, LLMs, implicitness, NLI, causality, explainability
Contribution Types: Model analysis & interpretability, Data resources, Data analysis
Languages Studied: Italian, English
Submission Number: 3293