One Question at a Time: A Semantic Bottleneck for Interpretable Visual Brain Decoding from fMRI

Published: 23 Sept 2025 · Last Modified: 17 Nov 2025 · UniReps 2025 · CC BY 4.0
Supplementary Material: zip
Track: Extended Abstract Track
Keywords: Brain Decoding, Interpretability, Vision, Neuroscience
TL;DR: We propose an interpretable brain decoding framework that maps fMRI signals to interpretable concepts before reconstructing images, preserving reconstruction accuracy while exposing voxel-level concept activations and offering new neuroscientific insights.
Abstract: Decoding of visual stimuli from noninvasive neuroimaging techniques such as functional magnetic resonance imaging (fMRI) has advanced rapidly in recent years; yet most high-performing brain decoding models rely on complex, non-interpretable latent spaces (e.g., CLIP). In this study, we present an interpretable brain decoding framework; our key innovation is the insertion of a semantic bottleneck into BrainDiffuser, a well-established brain decoding pipeline. Specifically, we build a $214$-dimensional binary interpretable space $\mathcal{L}$ for images, in which each dimension answers a specific question about the image (e.g., "Is there a person?", "Is it outdoors?"). A first ridge regression maps voxel activity to this semantic space. Because this mapping is linear, its weight matrix can be visualized as voxel-importance maps, revealing which cortical regions most influence each dimension of $\mathcal{L}$. A second regression then transforms these concept vectors into the CLIP embeddings required to produce the final decoded image, conditioning the BrainDiffuser model. We find that voxel-wise weight maps for individual questions are highly consistent with canonical category-selective regions in the visual cortex (faces, bodies, places, words), while also revealing that activation distributions, not merely locations, carry semantic meaning in the brain. Visual brain decoding performance is only slightly lower than that of the original BrainDiffuser (e.g., CLIP similarity decreases by $\leq 4\%$), while offering substantial gains in interpretability and neuroscientific insight. These results show that our interpretable brain decoding pipeline enables voxel-level analysis of semantic representations in the human brain without sacrificing decoding accuracy.
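The two-stage pipeline described above can be sketched in a few lines. The following is a minimal illustration, not the authors' released code: all array shapes, ridge penalties (`alpha`), and variable names are assumptions, and random arrays stand in for real fMRI responses, concept labels, and CLIP embeddings.

```python
# Minimal sketch of the semantic-bottleneck pipeline (assumed shapes and alphas).
import numpy as np
from sklearn.linear_model import Ridge

n_samples, n_voxels, n_concepts, clip_dim = 1000, 15000, 214, 768  # assumed sizes

rng = np.random.default_rng(0)
X = rng.standard_normal((n_samples, n_voxels))            # fMRI voxel responses
Q = (rng.random((n_samples, n_concepts)) > 0.5).astype(float)  # binary answers to the 214 questions
E = rng.standard_normal((n_samples, clip_dim))            # CLIP embeddings of the stimulus images

# Stage 1: linear (ridge) map from voxel activity to the semantic space L.
stage1 = Ridge(alpha=1e3).fit(X, Q)
# Because this map is linear, each row of coef_ is a voxel-importance map
# for one question (e.g., "Is there a person?").
voxel_maps = stage1.coef_  # shape: (n_concepts, n_voxels)

# Stage 2: ridge map from concept vectors to the CLIP embeddings that
# condition the BrainDiffuser image generator.
stage2 = Ridge(alpha=1.0).fit(Q, E)

# Decoding a new scan: voxels -> interpretable concepts -> CLIP embedding.
x_new = rng.standard_normal((1, n_voxels))
concepts = stage1.predict(x_new)           # 214-dim interpretable vector
clip_embedding = stage2.predict(concepts)  # would be fed to BrainDiffuser
```

Note that because both stages are linear, the composed map from voxels to CLIP space factors through $\mathcal{L}$, which is what makes the per-question voxel maps directly inspectable.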
Submission Number: 38