Keywords: Large Language Models, Syntax, Constituency Parsing, Parsing Strategy
Abstract: Transformer-based language models (LMs) are trained purely for next-word prediction, yet they exhibit sensitivity to syntax.
However, little is known about how they internally parse syntactic structures.
Recent work has probed autoregressive LMs with an arc-standard shift-reduce dependency parser, revealing incremental syntactic states in LM representations. However, that methodology is limited to a single dependency parsing strategy and offers no insight into which of the many possible parsing strategies is most compatible with autoregressive LM representations.
In this paper, we extend the incremental probing methodology to constituency structures and investigate which of three parsing strategies — top-down, bottom-up, and left-corner — best explains the internal parsing process of autoregressive LMs.
Our empirical results suggest that LMs implicitly learn different parsing strategies for different languages, with top-down being most prevalent in English and left-corner in Japanese.
Paper Type: Short
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: probing
Contribution Types: Model analysis & interpretability
Languages Studied: English, Japanese
Submission Number: 2872