Investigating Neurons and Heads in Transformer-based LLMs for Typographical Errors

ACL ARR 2025 May Submission1275 Authors

17 May 2025 (modified: 03 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0

Abstract: This paper investigates how LLMs encode inputs with typos. We hypothesize that specific neurons and attention heads recognize typos and fix them internally using local and global contexts. We introduce a method to identify typo neurons and typo heads, that is, neurons and heads that are active when inputs contain typos. Our experimental results suggest the following: 1) LLMs can fix typos using local contexts when the typo neurons in either the early or the late layers are activated, even if those in the other layers are not. 2) Typo neurons in the middle layers are the core of typo-fixing with global contexts. 3) Typo heads fix typos by attending broadly to the context rather than focusing on specific tokens. 4) Typo neurons and typo heads contribute not only to typo-fixing but also to understanding general contexts.

Paper Type: Long

Research Area: Interpretability and Analysis of Models for NLP

Research Area Keywords: robustness

Contribution Types: Model analysis & interpretability

Languages Studied: English

Submission Number: 1275
