Seeing the imagined: a latent functional alignment in visual imagery decoding from fMRI data

Published: 02 Mar 2026, Last Modified: 06 Mar 2026ICLR 2026 Re-Align WorkshopEveryoneRevisionsBibTeXCC BY 4.0
Track: long paper (up to 10 pages)
Domain: cognitive science
Abstract: Recent progress in visual brain decoding from fMRI has been enabled by large-scale datasets such as the Natural Scenes Dataset (NSD) and powerful diffusion-based generative models. While current pipelines are primarily optimized for perception, their performance under mental-imagery remains less well understood. In this work, we study how a state-of-the-art (SOTA) perception decoder (DynaDiff) can be adapted to reconstruct imagined content from the Imagery-NSD benchmark. We propose a latent functional alignment approach that maps imagery-evoked activity into the pretrained model’s conditioning space, while keeping the remaining components frozen. To mitigate the limited amount of matched imagery–perception supervision, we further introduce a retrieval-based augmentation strategy that selects semantically related NSD perception trials. Across four subjects, latent functional alignment consistently improves high-level semantic reconstruction metrics relative to the frozen pretrained baseline and a voxel-space ridge alignment baseline, and enables above-chance decoding from multiple cortical regions. These results suggest that semantic structure learned from perception can be leveraged to stabilize and improve visual imagery decoding under out-of-distribution conditions.
Presenter: ~Fabrizio_Spera1
Submission Number: 48
Loading