Improving Explainability of Sentence-level Metrics via Edit-level Attribution for Grammatical Error Correction

ACL ARR 2025 February Submission7153 Authors

16 Feb 2025 (modified: 09 May 2025), ACL ARR 2025 February Submission, CC BY 4.0

Abstract: Various evaluation metrics have been proposed for Grammatical Error Correction (GEC), but many, particularly reference-free metrics, lack explainability. This lack of explainability hinders researchers from analyzing the strengths and weaknesses of GEC models and limits the ability to provide detailed feedback for users. To address this issue, we propose attributing sentence-level scores to individual edits, providing insight into how specific corrections contribute to the overall performance. For the attribution method, we use Shapley values, from cooperative game theory, to compute the contribution of each edit. Experiments with existing sentence-level metrics demonstrate high consistency across different edit granularities and show approximately 70% alignment with human evaluations. In addition, we analyze biases in the metrics based on the attribution results, revealing trends such as the tendency to ignore orthographic edits.
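To illustrate the idea of attributing a sentence-level score to individual edits, the following is a minimal sketch of exact Shapley value computation, not the authors' implementation. Each edit is treated as a player, and the value of a coalition is the score of the sentence with only those edits applied. The function name `shapley_attribution` and the `toy_score` metric, including its numeric values, are hypothetical and chosen only for illustration; a real setup would presumably call a sentence-level GEC metric here.

```python
from itertools import combinations
from math import factorial
from typing import Callable, FrozenSet, List


def shapley_attribution(
    n_edits: int, score: Callable[[FrozenSet[int]], float]
) -> List[float]:
    """Exact Shapley value of each edit under a sentence-level score function.

    `score` maps a set of applied edit indices to a sentence-level score.
    Cost grows as 2^n_edits, so this is only practical for a few edits.
    """
    players = range(n_edits)
    values = []
    for i in players:
        others = [j for j in players if j != i]
        phi = 0.0
        for size in range(len(others) + 1):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = (
                factorial(size) * factorial(n_edits - size - 1) / factorial(n_edits)
            )
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of edit i when added to coalition s.
                phi += weight * (score(s | {i}) - score(s))
        values.append(phi)
    return values


def toy_score(applied: FrozenSet[int]) -> float:
    """Hypothetical stand-in for a sentence-level metric (assumption)."""
    base = {0: 0.5, 1: 0.2, 2: -0.1}  # made-up per-edit effects
    total = sum(base[e] for e in applied)
    # Made-up interaction: edits 0 and 1 are worth more together.
    if 0 in applied and 1 in applied:
        total += 0.2
    return total


attributions = shapley_attribution(3, toy_score)
```

In this toy case the interaction bonus is split evenly between edits 0 and 1, and the attributions sum to the difference between the score with all edits applied and the score with none, which is the efficiency property that makes Shapley values a natural fit for decomposing a sentence-level score.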

Paper Type: Long

Research Area: NLP Applications

Research Area Keywords: GEC

Contribution Types: Model analysis & interpretability

Languages Studied: English

Submission Number: 7153
