Track: Extended Abstract Track
Keywords: Brain-model alignment, Speech ANNs, Speech recognition
TL;DR: This paper demonstrates how domain-specific in-silico fMRI experiments on speech models can provide an insightful, complementary metric for measuring brain–model alignment.
Abstract: Speech recognition is central to human communication, yet the neural computations that support it are not fully understood. Artificial neural networks (ANNs) have shown promise as models of sensory systems and could provide a way to generate candidate hypotheses about the neural representations and mechanisms underlying speech recognition. However, speech-specific ANNs have not been systematically evaluated for this purpose. Here we assess subword-based, word-based, and self-supervised speech models using in-silico simulations of auditory fMRI experiments that probe domain-specific response signatures in human auditory cortex. We find that models optimized for subword units (e.g., at the phoneme level) best recapitulate the characteristic patterns of cortical responses, whereas word-level and self-supervised models show weaker alignment. These results show how simulations of neuroimaging experiments can reveal facets of model–brain correspondence, providing a complementary diagnostic for refining both speech models and benchmarks of brain–model alignment.
Submission Number: 102