Exploring the impact of dependency length on learnability

Published: 03 Oct 2025, Last Modified: 13 Nov 2025 · CPL 2025 Poster · CC BY 4.0
Keywords: language acquisition, length generalization, data complexity, LSTM
TL;DR: LSTM simulations fail to provide computational support for the "less is more" hypothesis, likely due to fundamental architectural limitations that challenge the analogy between network training and human language acquisition.
Abstract: This study investigates the role of dependency length in the learnability of syntactic structures by Long Short-Term Memory (LSTM) networks, testing the "less is more" hypothesis in language acquisition. Traditionally, this hypothesis posits that children’s limited cognitive resources confer an advantage in language learning. While early computational support relied on artificial grammars and simple recurrent neural networks, conflicting replication results and advances in neural network architectures prompt a re-evaluation of these claims using more sophisticated models and naturalistic data. Prior to the ubiquity of LLMs, the LSTM architecture was shown to be potentially well-suited for testing psycholinguistic hypotheses. I show that, due to the nature of the architecture, LSTMs fail to generalize effectively from short to long dependencies, thereby challenging their suitability as models of human language learning in this context.
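For concreteness, the sketch below illustrates the kind of short-to-long dependency probe the abstract describes: an LSTM is trained on number-agreement dependencies spanning only short gaps and then evaluated on strictly longer ones. The toy token inventory, gap ranges, and hyperparameters are illustrative assumptions, not the paper's actual materials or code.

```python
# Minimal, hypothetical sketch of a dependency-length generalization probe.
# All design choices here (toy grammar, gap ranges, model size) are assumptions
# made for illustration; they do not reproduce the paper's experiments.
import random
import torch
import torch.nn as nn

SG_SUBJ, PL_SUBJ, FILLER = range(3)  # toy vocabulary: two subject types plus a filler

def make_example(gap):
    """Subject followed by `gap` filler tokens; the label is the subject's number
    (0=singular, 1=plural), which the model must carry across the gap."""
    plural = random.random() < 0.5
    subj = PL_SUBJ if plural else SG_SUBJ
    return [subj] + [FILLER] * gap, int(plural)

def make_batch(gap, n=64):
    seqs, labels = zip(*(make_example(gap) for _ in range(n)))
    return torch.tensor(seqs), torch.tensor(labels)  # one gap length per batch

class AgreementLSTM(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.emb = nn.Embedding(3, 16)
        self.lstm = nn.LSTM(16, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 2)

    def forward(self, x):
        h, _ = self.lstm(self.emb(x))
        return self.out(h[:, -1])  # predict verb number from the final state

model = AgreementLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Train only on short dependencies (gaps of 1-3 fillers)...
for step in range(500):
    x, y = make_batch(gap=random.choice([1, 2, 3]))
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()

# ...then evaluate on strictly longer dependencies (gaps of 10-20 fillers).
with torch.no_grad():
    accs = []
    for gap in range(10, 21):
        x, y = make_batch(gap, n=256)
        accs.append((model(x).argmax(-1) == y).float().mean().item())
print(f"mean accuracy on long dependencies: {sum(accs) / len(accs):.2f}")
```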
Submission Number: 32