Induction Heads as a Primary Mechanism for Pattern Matching in In-context Learning

ACL ARR 2024 June Submission5184 Authors

16 Jun 2024 (modified: 02 Jul 2024) · CC BY 4.0
Abstract: Large language models (LLMs) have shown a remarkable ability to learn and perform complex tasks through in-context learning (ICL). However, a comprehensive understanding of its internal mechanisms is still lacking. This paper explores the role of induction heads in a few-shot ICL setting. We analyse two state-of-the-art models, Llama-3-8B and InternLM2-20B, on abstract pattern recognition and NLP tasks. Our results show that even a minimal ablation of induction heads leads to ICL performance decreases of up to ~32% on abstract pattern recognition tasks, bringing performance close to random. For NLP tasks, this ablation substantially decreases the model's ability to benefit from examples, bringing few-shot ICL performance close to that of zero-shot prompts. We further use attention knockout to disable specific induction patterns, and present fine-grained evidence for the role that the induction mechanism plays in ICL.
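To make the ablation idea in the abstract concrete, the following is a minimal sketch of zero-ablating one attention head in a toy causal multi-head attention layer. This is an illustrative assumption of what "ablation of induction heads" can look like mechanically (replacing a head's output with zeros before the heads are combined); the paper's exact ablation procedure, models, and head-selection criteria are not reproduced here.

```python
import numpy as np

def causal_mha(x, Wq, Wk, Wv, ablate_heads=()):
    """Toy causal multi-head attention with optional zero-ablation.

    x: (seq, d_model); Wq/Wk/Wv: (n_heads, d_model, d_head).
    Heads listed in `ablate_heads` have their output zeroed
    (one simple form of head ablation; mean-ablation is another).
    """
    n_heads, d_model, d_head = Wq.shape
    seq = x.shape[0]
    out = np.zeros((seq, n_heads * d_head))
    for h in range(n_heads):
        q, k, v = x @ Wq[h], x @ Wk[h], x @ Wv[h]
        scores = q @ k.T / np.sqrt(d_head)
        # Causal mask: each position attends only to itself and earlier tokens.
        scores[np.triu(np.ones_like(scores, dtype=bool), k=1)] = -np.inf
        attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
        attn /= attn.sum(axis=-1, keepdims=True)
        head_out = attn @ v
        if h in ablate_heads:
            head_out = np.zeros_like(head_out)  # zero-ablation of this head
        out[:, h * d_head:(h + 1) * d_head] = head_out
    return out
```

Comparing `causal_mha(x, Wq, Wk, Wv)` with `causal_mha(x, Wq, Wk, Wv, ablate_heads=(0,))` shows how removing a single head's contribution changes the layer output, which is the kind of intervention whose downstream effect on ICL accuracy the paper measures.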
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: prompting, few-shot learning, knowledge tracing/discovering/inducing
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 5184