Investigating Internal Operations for Syntactic Dependencies in Language Models

ACL ARR 2026 January Submission 10636 Authors

06 Jan 2026 (modified: 20 Mar 2026) · CC BY 4.0
Keywords: Language models, mechanistic interpretability, causal intervention, long-distance dependency
Abstract: Prior work has demonstrated that language models encode syntactic structure, but the operations by which they form syntactic dependencies remain poorly understood. This paper investigates the internal procedures underlying long-distance dependency formation using activation patching. Analyzing four dependency types across model sizes, we find that small models rely on broadly similar attention-based heuristics, whereas larger models exhibit differentiated operational pipelines: non-displacement dependencies involve attention-based marking of structurally illicit positions, while displacement dependencies do not. These patterns are robust to dependency length. Our results suggest that increasing model size leads to a human-like distinction between displacement and non-displacement dependencies, implemented via different internal operations.
Paper Type: Long
Research Area: Linguistic theories, Cognitive Modeling and Psycholinguistics
Research Area Keywords: linguistic theories, computational psycholinguistics
Contribution Types: Model analysis & interpretability, Data analysis
Languages Studied: English
Submission Number: 10636