Divergences between Language Models and Human Brains

Published: 25 Sept 2024, Last Modified: 06 Nov 2024
Venue: NeurIPS 2024 poster
Readers: Everyone
License: CC BY 4.0
Keywords: Natural Language Processing, NLP, Brain Imaging, Neuroimaging, Magnetoencephalography, MEG, Neuroscience, Cognitive Science, Interpretability, Deep Learning
TL;DR: Language models differ from human brains in social/emotional intelligence and physical commonsense. Fine-tuning language models on these domains improves their alignment with human understanding.
Abstract: Do machines and humans process language in similar ways? Recent research has hinted at the affirmative, showing that human neural activity can be effectively predicted using the internal representations of language models (LMs). Although such results are thought to reflect shared computational principles between LMs and human brains, there are also clear differences in how LMs and humans represent and use language. In this work, we systematically explore the divergences between human and machine language processing by examining the differences between LM representations and human brain responses to language as measured by Magnetoencephalography (MEG) across two datasets in which subjects read and listened to narrative stories. Using an LLM-based data-driven approach, we identify two domains that LMs do not capture well: social/emotional intelligence and physical commonsense. We validate these findings with human behavioral experiments and hypothesize that the gap is due to insufficient representations of social/emotional and physical knowledge in LMs. Our results show that fine-tuning LMs on these domains can improve their alignment with human brain responses.
Primary Area: Neuroscience and cognitive science (neural coding, brain-computer interfaces)
Flagged For Ethics Review: true
Submission Number: 13619