Discovering Divergences between Language Models and Human Brains

23 Sept 2023 (modified: 11 Feb 2024). Submitted to ICLR 2024
Primary Area: applications to neuroscience & cognitive science
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Natural Language Processing, NLP, Brain Imaging, Magnetoencephalography, MEG, Neuroscience, Cognitive Science, Interpretability, Deep Learning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: Fine-tuning language models on emotion and figurative language tasks improves their alignment with human brain responses, revealing systematic differences between LMs and human brains in language processing.
Abstract: Do machines and humans process language in similar ways? A recent line of research suggests that they do, demonstrating that human brain signals can be effectively predicted from the internal representations of language models (LMs). This is thought to reflect shared computational principles between LMs and human language processing. However, there are also clear differences in how LMs and humans acquire and use language, even when the final task they perform is the same. Despite this, little work has used brain data to explore systematic differences between human and machine language processing. To address this gap, we examine the differences between LM representations and the human brain's responses to language, using a dataset of magnetoencephalography (MEG) responses to a written narrative. In doing so, we identify three phenomena that, according to prior work, LMs do not capture well: emotional understanding, figurative language processing, and physical commonsense. We then fine-tune models on datasets related to these three phenomena and find that LMs fine-tuned on tasks related to emotion and figurative language show improved alignment with brain responses. We emphasize the importance of understanding not just the similarities between human and machine language processing, but also the differences. Our work takes first steps toward this goal in the context of narrative reading.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6567