Keywords: brain-computer interfaces, neural decoding, 3D reconstruction, 3D generation, neural encoding
Abstract: Brain-computer interfaces (BCIs) have enabled breakthroughs such as translating fMRI signals into images or videos. However, human perception operates in a dynamic 3D world, processing information across both spatial and temporal dimensions. In this work, we introduce 4D Mind Reading, a novel BCI function that generates 4D visuals—combining video and 3D structure—directly from fMRI signals. Building such a system is challenging: training a model to generate 4D scenes from fMRI data requires paired fMRI–4D mappings, which are infeasible to collect because brain responses are instantaneous and multi-view stimuli cannot be presented simultaneously. To address this, we propose Mind4D, a brain-inspired, fMRI-conditioned 4D generation framework that learns asymmetric hierarchical representations from fMRI signals in a weakly supervised manner. Our approach captures both high-level and low-level representations, along with a decomposition of scene backgrounds and object foregrounds. By conditioning and integrating multiple generative priors for the foreground and background, Mind4D produces high-quality, semantically meaningful 4D visuals. Extensive experiments show that Mind4D generates immersive 4D visuals semantically aligned with brain activity. Even when constrained to the reference view—the view the subject watched—our model outperforms the best fMRI-to-video approaches in CLIP-T and SSIM, achieving a 50% improvement in ICS-50 for semantic classification. We further highlight Mind4D's potential for advancing neuroscience and clinical diagnosis. Our source code will be released.
Supplementary Material: zip
Primary Area: applications to neuroscience & cognitive science
Submission Number: 3044