Dependency Length, Syntactic Complexity & Memory: A Reading Time Benchmark for Sentence Processing Modeling

Nina Nusbaumer; Corentin Bel; Iria de-Dios-Flores; Guillaume Wisniewski; Benoit Crabbé

Published: 03 Oct 2025, Last Modified: 13 Nov 2025. Venue: CPL 2025 Spotlight Poster. License: CC BY 4.0

Keywords: sentence processing, dependency length, working memory, benchmark, dataset, model evaluation, generalization

TL;DR: We introduce a controlled English reading-time dataset that systematically varies dependency length and syntactic complexity, enabling the modeling of sentence processing and working memory effects in both humans and language models.

Abstract: We present a self-paced reading dataset designed to evaluate language model sensitivity to human-like syntactic difficulty under controlled manipulations of dependency length, syntactic complexity, and working memory load. The corpus includes 360 English sentences (6 conditions × 60 sets), varying in thematic domain and subject number. Reading times were collected from 510 native English speakers using a self-paced reading paradigm, followed by an operation span task to assess working memory capacity. Preliminary mixed-effects analyses show that reading times increase with dependency length and are modulated by syntactic complexity and individual memory capacity. This resource bridges psycholinguistics and NLP by providing a benchmark for modeling human sentence processing mechanisms within a lexically controlled, structurally systematic framework.
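
To make the design arithmetic concrete, the following is a minimal sketch of how a 6 × 60 item grid could be enumerated and split into presentation lists. Only the two counts (6 conditions, 60 sets) come from the abstract. The integer condition labels, the function names `build_items` and `latin_square_lists`, and the Latin-square counterbalancing itself are illustrative assumptions; the abstract does not say how sentences were assigned to participants.

```python
from itertools import product

N_CONDITIONS = 6  # from the abstract
N_SETS = 60       # from the abstract


def build_items(n_conditions=N_CONDITIONS, n_sets=N_SETS):
    """Return one (set_id, condition_id) pair per sentence in the corpus."""
    return list(product(range(n_sets), range(n_conditions)))


def latin_square_lists(n_conditions=N_CONDITIONS, n_sets=N_SETS):
    """Hypothetical counterbalancing (an assumption, not the authors' stated method).

    List k presents set s in condition (s + k) % n_conditions, so each list
    contains every set exactly once and every condition equally often.
    """
    return [
        [(s, (s + k) % n_conditions) for s in range(n_sets)]
        for k in range(n_conditions)
    ]


items = build_items()
lists = latin_square_lists()
print(len(items), len(lists), len(lists[0]))
```

Under this assumed scheme, a participant assigned to one list would read 60 sentences, 10 per condition, and the six lists together would cover all 360 sentences exactly once.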

Submission Number: 44
