Bridging Brains and Concepts: Interpretable Visual Decoding from fMRI with Semantic Bottlenecks

Published: 18 Sept 2025, Last Modified: 29 Oct 2025 · NeurIPS 2025 poster · CC BY 4.0
Keywords: Brain Decoding, Interpretability, Vision, Neuroscience
TL;DR: We introduce a brain decoding framework that maps fMRI data to interpretable semantic concepts for image reconstruction. The pipeline maintains high reconstruction quality while exposing voxel-level concept activations, enabling neuroscientific insight.
Abstract: Decoding visual stimuli from noninvasive neuroimaging techniques such as functional magnetic resonance imaging (fMRI) has advanced rapidly in recent years; yet most high-performing brain decoding models rely on complex, non-interpretable latent spaces. In this study we present an interpretable brain decoding framework that inserts a semantic bottleneck into BrainDiffuser, a well-established, simple, and linear decoding pipeline. We first construct a $214$-dimensional binary interpretable space $\mathcal{L}$ for images, in which each dimension answers a specific question about the image (e.g., "Is there a person?", "Is it outdoors?"). A first ridge regression maps voxel activity to this semantic space. Because this mapping is linear, its weight matrix can be visualized as a voxel-importance map for each dimension of $\mathcal{L}$, revealing which cortical regions most influence each semantic dimension. A second regression then transforms these concept vectors into the CLIP embeddings that condition the BrainDiffuser model to produce the final decoded image. We found that voxel-wise weight maps for individual questions are highly consistent with canonical category-selective regions in the visual cortex (faces, bodies, places, words), while also revealing that activation distributions, not merely locations, carry semantic meaning in the brain. Visual decoding performance is only slightly lower than that of the original BrainDiffuser (e.g., CLIP similarity decreases by $\leq 4\%$ for all four subjects), while offering substantial gains in interpretability and neuroscientific insight. These results show that our interpretable brain decoding pipeline enables voxel-level analysis of semantic representations in the human brain without sacrificing decoding accuracy.
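The two-stage linear pipeline described in the abstract can be sketched as follows. All shapes, variable names, and the use of scikit-learn's `Ridge` are illustrative assumptions; the paper's exact regularization, preprocessing, and CLIP/BrainDiffuser interfaces may differ, and synthetic data stands in for fMRI recordings.

```python
# Hypothetical sketch of the semantic-bottleneck decoding pipeline:
# voxels -> 214-dim binary concept space -> CLIP embeddings.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_train, n_voxels, n_concepts, clip_dim = 200, 1000, 214, 768

# Synthetic stand-ins for voxel activity, binary concept labels
# (answers to the 214 yes/no questions), and CLIP image embeddings.
X_voxels = rng.standard_normal((n_train, n_voxels))
Y_concepts = rng.integers(0, 2, (n_train, n_concepts)).astype(float)
Y_clip = rng.standard_normal((n_train, clip_dim))

# Stage 1: ridge regression from voxel activity to the semantic space L.
stage1 = Ridge(alpha=1e3).fit(X_voxels, Y_concepts)

# Because stage 1 is linear, its weight matrix (n_concepts x n_voxels)
# can be read directly as a voxel-importance map per semantic question.
voxel_maps = stage1.coef_

# Stage 2: ridge regression from predicted concept vectors to the CLIP
# embeddings that would condition BrainDiffuser (generation not shown).
concepts_pred = stage1.predict(X_voxels)
stage2 = Ridge(alpha=1.0).fit(concepts_pred, Y_clip)
clip_pred = stage2.predict(concepts_pred)
```

Keeping both stages linear is what makes the bottleneck interpretable: each row of `voxel_maps` can be projected back onto the cortical surface to localize the regions driving one semantic question.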
Supplementary Material: zip
Primary Area: Neuroscience and cognitive science (e.g., neural coding, brain-computer interfaces)
Submission Number: 12412