Learning and Interpreting Multiple Representations of Semantics in a Neurobiological System

27 Sept 2024 (modified: 15 Nov 2024) · ICLR 2025 Conference Withdrawn Submission · CC BY 4.0
Keywords: Pruning, probing, representation, explainableAI, semantics, brain
TL;DR: We present an objective method for interpreting the semantic dimensions that structure the representation of a set of lexical items in the human brain.
Abstract: A defining feature of computation in the human brain is that different regions can manifest different representations of the same object set. Here we introduce a novel method to learn and interpret multiple neural representations of lexical objects within specific, topographically defined brain areas. Our approach fine-tunes a pre-trained language model (LM) for each brain region of interest, resulting in better alignment of the LM’s representational space with that of the corresponding brain area. This alignment is achieved through supervised structural pruning of LM features, which selects the subset of features most relevant to the target brain region. We then interpret the retained features using a linear probing task to identify the semantic information they encode. Both the pruning and probing steps are validated through out-of-sample testing, with pruning significantly improving the prediction of brain representations. This method improves on existing approaches by (i) eliminating the reliance on hand-crafted encoders, which reduces potential biases; (ii) optimizing the alignment process via data-driven learning; and (iii) providing interpretability of the semantic features in a black-box LM. From a neurobiological perspective, we find that brain regions encoding social and cognitive aspects of lexical items consistently also represent their sensory-motor features, though the reverse does not hold.
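The prune-then-probe pipeline described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes that alignment is measured as the Pearson correlation between representational dissimilarity matrices (RDMs), and it swaps in a simple greedy backward elimination for the supervised structural pruning. All function names (`rdm`, `alignment`, `prune`, `probe_r2`) and the synthetic data are hypothetical.

```python
import numpy as np

def rdm(X):
    """Upper-triangle vector of pairwise squared Euclidean distances between rows."""
    diff = X[:, None, :] - X[None, :, :]
    d = (diff ** 2).sum(axis=-1)
    iu = np.triu_indices(X.shape[0], k=1)
    return d[iu]

def alignment(X, brain_rdm, mask):
    """Pearson correlation between the RDM of the retained features and the brain RDM."""
    return np.corrcoef(rdm(X[:, mask]), brain_rdm)[0, 1]

def prune(X, brain_rdm):
    """Greedy backward elimination (a stand-in for the paper's pruning):
    drop one feature at a time for as long as alignment strictly improves."""
    mask = np.ones(X.shape[1], dtype=bool)
    best = alignment(X, brain_rdm, mask)
    while mask.sum() > 1:
        scores = {}
        for j in np.flatnonzero(mask):
            trial = mask.copy()
            trial[j] = False
            scores[j] = alignment(X, brain_rdm, trial)
        j_best = max(scores, key=scores.get)
        if scores[j_best] <= best:
            break  # no single removal helps any further
        mask[j_best] = False
        best = scores[j_best]
    return mask, best

def probe_r2(X_train, y_train, X_test, y_test):
    """Fit a linear probe (least squares with intercept); return held-out R^2."""
    A_train = np.column_stack([X_train, np.ones(len(X_train))])
    A_test = np.column_stack([X_test, np.ones(len(X_test))])
    w, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    resid = y_test - A_test @ w
    return 1.0 - (resid ** 2).sum() / ((y_test - y_test.mean()) ** 2).sum()

# Synthetic stand-ins: 40 "words" with 8 LM features; the toy "brain region"
# depends only on the first 3 features (an assumption made for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 8))
brain_rdm = rdm(X[:, :3])

before = alignment(X, brain_rdm, np.ones(8, dtype=bool))
mask, after = prune(X, brain_rdm)

# Toy semantic attribute that is exactly linear in one retained feature,
# probed on held-out words (rows 30 onward).
Xk = X[:, mask]
y = 2.0 * Xk[:, 0] + 1.0
r2 = probe_r2(Xk[:30], y[:30], Xk[30:], y[30:])
```

In this sketch, `after` can never be lower than `before`, because a feature is removed only when doing so strictly raises the alignment. The paper additionally validates the pruning step out of sample; a fuller version would learn the mask on one set of words and score alignment on another.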
Primary Area: applications to neuroscience & cognitive science
Submission Number: 11783