Neural Correlates of Language Models Are Specific to Human Language

Published: 23 Sept 2025, Last Modified: 17 Nov 2025 · UniReps 2025 · CC BY 4.0
Track: Proceedings Track
Keywords: language models, brain, fmri, language, topology
TL;DR: The paper shows that correlations between transformer language models and brain activity are genuine, specific to models trained on human language, and crucially dependent on positional encoding, ruling out generic statistical artifacts.
Abstract: Previous work has shown correlations between the hidden states of large language models and fMRI brain responses on language tasks. These correlations have been taken as evidence of the representational similarity between these models and brain states. This study tests whether those previous results are robust to several possible concerns. Specifically, this study shows: (i) that the previous results still hold after dimensionality reduction, and thus are not attributable to the curse of dimensionality; (ii) that the previous results are confirmed when using new measures of similarity; (iii) that correlations between brain representations and those from models are specific to models trained on human language; and (iv) that the results depend on the presence of positional encoding in the models. These results confirm and strengthen previous research and contribute to the debate on the biological plausibility and interpretability of state-of-the-art large language models.
Submission Number: 57
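
For readers unfamiliar with how model–brain correlations of this kind are typically computed, the sketch below illustrates one standard similarity measure, representational similarity analysis (RSA), comparing model hidden states with fMRI responses to the same stimuli. The array shapes, variable names, and the Spearman comparison are illustrative assumptions, not the paper's specific pipeline or measures.

```python
# Minimal RSA sketch (illustrative only, not the paper's actual analysis).
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr


def rdm(features: np.ndarray) -> np.ndarray:
    """Condensed representational dissimilarity matrix using correlation distance.

    `features` has shape (n_stimuli, n_dims): one row per stimulus presentation.
    """
    return pdist(features, metric="correlation")


def rsa_score(model_states: np.ndarray, brain_responses: np.ndarray) -> float:
    """Spearman correlation between the two condensed RDMs."""
    rho, _ = spearmanr(rdm(model_states), rdm(brain_responses))
    return rho


# Hypothetical usage: 200 stimuli, 768-d hidden states, 5000 voxels.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(200, 768))    # stand-in for transformer hidden states
voxels = rng.normal(size=(200, 5000))   # stand-in for fMRI responses to the same stimuli
print(f"RSA (Spearman rho): {rsa_score(hidden, voxels):.3f}")
```

With random stand-in data the score hovers near zero; a reliably positive score on matched stimuli is the kind of model–brain correlation the paper probes for robustness.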