Information-Theoretic Storage Cost in Sentence Comprehension

Published: 18 May 2026, Last Modified: 18 May 2026, Venue: CoNLL 2026 Archival, Readers: Everyone, License: CC BY 4.0
Keywords: sentence comprehension, information theory, memory, prediction, reading-time analysis
TL;DR: An information-theoretic metric of online processing memory cost is proposed.
Abstract: Real-time sentence comprehension imposes a significant load on working memory, because comprehenders must maintain contextual information to anticipate future input. Measures of this load have played an important role in psycholinguistic theories, but they have largely been formalized using symbolic grammars, which assign discrete, uniform costs to syntactic predictions. This study proposes an information-theoretic measure of processing storage cost, defined as the amount of information that previous words carry about future context under uncertainty. Unlike previous discrete, grammar-based metrics, this measure is continuous, probabilistic, and theory-neutral, and it can be estimated from pre-trained neural language models. We demonstrate the validity of this approach through three analyses of English: the measure (i) recovers well-known processing asymmetries in center embeddings and relative clauses, (ii) correlates with a grammar-based storage cost in a syntactically annotated corpus, and (iii) predicts reading-time variance in two large-scale naturalistic datasets over and above baseline models with traditional information-based predictors. Our code is available at https://github.com/kohei-kaji/info-storage.
Scope Confirmation: To the best of my judgment, this submission falls within the scope of CoNLL.
Primary Area Selection: Computational Psycholinguistics, Cognition and Linguistics
Use Of Generative Artificial Intelligence Tools: Yes, for editing/proofreading the manuscript, Yes, for writing code
Data Collection From Human Subjects: No
Submission Type: Archival: I certify that the submission has not been previously published, nor is the material in it under review by another journal or conference. Further, no material in it will be submitted for review at another conference or journal while under review by CoNLL 2026.
Submission Number: 130
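
Illustrative note: The abstract describes the measure as the amount of information previous words carry about future context. One standard information-theoretic quantity with that shape is the mutual information between a context and upcoming material. Whether the paper's measure is exactly this quantity is not stated on this page, so the sketch below is only an assumption for illustration. The function name `mutual_information` and the toy distributions are hypothetical and are not taken from the authors' repository; a real estimate would come from language-model probabilities rather than hand-written tables.

```python
import math


def mutual_information(joint):
    """Return I(X; Y) in bits for a discrete joint distribution.

    `joint` maps (x, y) pairs to probabilities that sum to 1.
    Here x stands in for "previous words" and y for "future context";
    this pairing is an illustrative assumption, not the paper's definition.
    """
    # Marginal distributions of the context (x) and the continuation (y).
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p

    # Sum p(x, y) * log2( p(x, y) / (p(x) p(y)) ) over cells with mass.
    mi = 0.0
    for (x, y), p in joint.items():
        if p > 0.0:
            mi += p * math.log2(p / (px[x] * py[y]))
    return mi


# Hypothetical toy case 1: the context fully determines the continuation.
dependent = {("the", "dog"): 0.5, ("a", "cat"): 0.5}

# Hypothetical toy case 2: the context says nothing about the continuation.
independent = {(c, w): 0.25 for c in ("the", "a") for w in ("dog", "cat")}

print(mutual_information(dependent))
print(mutual_information(independent))
```

In the first toy case the context carries one full bit about the continuation, while in the second it carries none. A continuous quantity of this kind contrasts with the discrete, uniform costs that the abstract attributes to grammar-based metrics.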