Abstract: We consider syntactic center embedding, in which an embedding phrase contains material on both sides of the embedded phrase. While a single center embedding is easily understood by human language users, multiple center embeddings generally are not. Nevertheless, it is a standard view in linguistic theory that multiple center embeddings are grammatically acceptable: human linguistic competence includes this ability, but performance limitations obscure it. We construct sentences with center embeddings at depths ranging from 1 to 4 and find that GPT-4 achieves nearly perfect accuracy even at depths 3 and 4, whereas other LLMs show a sharp drop in accuracy above depth 1. We suggest that this is because GPT-4 has successfully learned the same underlying linguistic competence as humans while not being subject to the same performance limitations. This would mean that human linguistic competence is observed more clearly in GPT-4 than in humans themselves.
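To make the construction concrete, the following is a minimal sketch of how center-embedded stimuli of increasing depth could be generated. It is not the paper's actual stimulus-generation procedure; the noun and verb lists and the matrix verb "died" are illustrative assumptions.

```python
# Minimal sketch (not the paper's stimulus-generation code): building English
# center-embedded sentences of a given depth. Word lists are illustrative.

NOUNS = ["the rat", "the cat", "the dog", "the fox", "the owl"]
VERBS = ["chased", "bit", "saw", "feared"]

def center_embed(depth: int) -> str:
    """Return a sentence with `depth` levels of center embedding.

    depth=1: "the rat the cat chased died."
    depth=2: "the rat the cat the dog bit chased died."
    Each added level nests another subject-verb pair inside the previous one.
    """
    assert 1 <= depth <= len(VERBS)
    subjects = NOUNS[: depth + 1]      # outermost subject first
    embedded_verbs = VERBS[:depth]     # verbs of the embedded relative clauses
    # Subjects stack left to right, the embedded verbs unwind innermost-first,
    # and the matrix verb closes the sentence.
    return " ".join(subjects + list(reversed(embedded_verbs)) + ["died"]) + "."

if __name__ == "__main__":
    for d in range(1, 5):
        print(f"depth {d}: {center_embed(d)}")
```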
Paper Type: Short
Research Area: Linguistic theories, Cognitive Modeling and Psycholinguistics
Research Area Keywords: linguistic theories, cognitive modeling, computational psycholinguistics
Contribution Types: Model analysis & interpretability, Theory
Languages Studied: English
Submission Number: 90