MIND THE GAP: ALIGNING THE BRAIN WITH LANGUAGE MODELS REQUIRES A NONLINEAR AND MULTIMODAL APPROACH

Danny Dongyeop Han; Yunju Cho; Jiook Cha; Jay-Yoon Lee

28 Sept 2024 (modified: 05 Feb 2025) | Submitted to ICLR 2025 | CC BY 4.0

Keywords: fMRI language encoding, brain LLM alignment, neuroscience, large language models, neurolinguistics

Abstract: Speech comprehension involves complex, nonlinear, and multimodal processes within the brain, integrating auditory signals with linguistic and semantic information across widespread brain networks. Traditional brain encoding models, often relying on linear mappings from unimodal features, fall short in representing these intricate mechanisms. In this study, we introduce a nonlinear, multimodal encoding model that combines audio and linguistic features extracted from pre-trained deep learning models (e.g., LLAMA and Whisper). These nonlinear architectures and early fusion mechanisms significantly enhance cross-modal integration, achieving a 14.4% increase in average normalized correlation coefficient and 7.7% increase in average single-story correlation compared to the previous state-of-the-art model relying on weighted averaging of linear unimodal predictions. Moreover, this improved performance reveals novel insights into the brain's functional organization, demonstrating how auditory and semantic information are nonlinearly fused within regions linked to motor control, somatosensory processing, and higher-level semantic representation. Our findings provide empirical support for foundational neurolinguistic theories, including the Motor Theory of Speech Perception, embodied semantic memory, and the Convergence Zone model, revealing novel insights into neural mechanisms otherwise impossible with simpler encoder models. By emphasizing the critical role of nonlinearity and multimodality in brain encoding models, our work bridges neural mechanisms and computational modeling, paving the way for the development of more biologically inspired, brain-aligned artificial intelligence systems.
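To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation, of the two fusion strategies it names: a weighted average of linear unimodal predictions (late fusion) versus a nonlinear model over concatenated audio and language features (early fusion). All function names, dimensions, the single hidden layer, the ReLU nonlinearity, and the random weights are assumptions for illustration. Real features would come from models such as Whisper and LLAMA, and the weights would be fitted to fMRI responses. The score shown is plain per-voxel Pearson correlation, not the normalized correlation coefficient reported in the abstract.

```python
import numpy as np

def late_fusion_linear(x_audio, x_text, w_audio, w_text, alpha):
    """Weighted average of two linear unimodal predictions (baseline style)."""
    return alpha * (x_audio @ w_audio) + (1.0 - alpha) * (x_text @ w_text)

def early_fusion_mlp(x_audio, x_text, w1, b1, w2, b2):
    """Concatenate both modalities first, then apply a one-hidden-layer MLP."""
    x = np.concatenate([x_audio, x_text], axis=1)  # early fusion step
    h = np.maximum(x @ w1 + b1, 0.0)               # ReLU nonlinearity (assumed)
    return h @ w2 + b2

def voxelwise_correlation(y_true, y_pred):
    """Pearson correlation computed independently for each voxel (column)."""
    yt = y_true - y_true.mean(axis=0)
    yp = y_pred - y_pred.mean(axis=0)
    denom = np.sqrt((yt ** 2).sum(axis=0) * (yp ** 2).sum(axis=0))
    return (yt * yp).sum(axis=0) / denom

# Hypothetical sizes: time points, audio dims, text dims, hidden units, voxels.
n_trs, d_audio, d_text, d_hidden, n_voxels = 50, 8, 16, 32, 5
rng = np.random.default_rng(0)

x_audio = rng.standard_normal((n_trs, d_audio))  # stand-in for audio features
x_text = rng.standard_normal((n_trs, d_text))    # stand-in for language features
y = rng.standard_normal((n_trs, n_voxels))       # stand-in for fMRI responses

# Random, unfitted weights: this only demonstrates the data flow and shapes.
w_audio = rng.standard_normal((d_audio, n_voxels))
w_text = rng.standard_normal((d_text, n_voxels))
w1 = rng.standard_normal((d_audio + d_text, d_hidden))
b1 = np.zeros(d_hidden)
w2 = rng.standard_normal((d_hidden, n_voxels))
b2 = np.zeros(n_voxels)

pred_late = late_fusion_linear(x_audio, x_text, w_audio, w_text, alpha=0.5)
pred_early = early_fusion_mlp(x_audio, x_text, w1, b1, w2, b2)

r_late = voxelwise_correlation(y, pred_late)
r_early = voxelwise_correlation(y, pred_early)
```

In the late-fusion function the two modalities only meet in the final weighted sum, so no feature interaction across modalities can be modeled. In the early-fusion function the hidden layer sees both feature sets at once, which is what allows cross-modal, nonlinear interactions to be learned.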

Primary Area: applications to neuroscience & cognitive science

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 13040
