Collocational bootstrapping: A hypothesis about the learning of subject-verb agreement in humans and neural networks

Published: 18 May 2026, Last Modified: 18 May 2026CoNLL 2026 ArchivalEveryoneRevisionsBibTeXCC BY 4.0
Keywords: subject-verb agreement, bootstrapping, neural networks
TL;DR: Particular levels of variability in word co-occurrence can support the acquisition of subject-verb agreement.
Abstract: In what ways might statistical signals in linguistic input assist with the acquisition of syntax? Here we hypothesize a mechanism called collocational bootstrapping, in which regularities in word co-occurrence patterns can provide cues to syntactic dependencies. We investigate whether this mechanism can support the acquisition of English subject-verb agreement. First, we simulate language acquisition by training neural networks on synthetic datasets that vary in how predictable their subject-verb pairings are. We find that there is a range of variability levels at which these statistical learners robustly learn subject-verb agreement. We then analyze the variability of subject-verb pairings in child-directed language, and we find that the variability in such data falls within the range that supported robust generalization in our computational simulations. Taken together, these results suggest that collocational bootstrapping is a viable learning strategy for the type of input that children receive.
Scope Confirmation: To the best of my judgment, this submission falls within the scope of CoNLL.
Primary Area Selection: Language Acquisition, Learning, Emergence, and Evolution
Secondary Area Selection: Computational Psycholinguistics, Cognition and Linguistics, Syntax and Morphology
Use Of Generative Artificial Intelligence Tools: Yes, for editing/proofreading the manuscript, Yes, for writing code
Data Collection From Human Subjects: No
Submission Type: Archival: I certify that the submission has not been previously published, nor is the material in it under review by another journal or conference. Further, no material in it will be submitted for review at another conference or journal while under review by CoNLL 2026.
Submission Number: 158
Loading