BrainDiffNet: Unified Semantic Encoders for Diffusion-based EEG-to-Image Generation

Published: 19 Aug 2025 · Last Modified: 24 Sept 2025 · BSN 2025 · CC BY 4.0
Confirmation: I have read and agree with the IEEE BSN 2025 conference submission's policy on behalf of myself and my co-authors.
Keywords: Diffusion, Masked Autoencoders, EEG decoder, Image Reconstruction
Abstract: Reconstructing what we see from brain activity offers valuable insight into the brain’s visual representations. Although fMRI and MEG achieve high-quality image reconstruction and classification, their cost and size restrict broader real-world applications, particularly outside clinical settings. Electroencephalography (EEG), in contrast, is a cost-effective, non-invasive tool producing signals with high temporal resolution, yet it remains less explored, primarily due to its susceptibility to noise and its complex spatio-temporal characteristics. To address these challenges, we propose BrainDiffNet, an effective EEG-to-Image generation model that leverages a subject’s contextual and EEG spatio-temporal information to guide a fine-tuned Stable Diffusion model, producing high-quality, semantically relevant images from brain activity. A robust Temporal Masked Autoencoder, designed for high-resolution EEG, enables the model to extract features effectively and to handle noisy or incomplete EEG query representations. An in-depth evaluation on the large-scale EEG-ImageNet dataset demonstrates that BrainDiffNet outperforms state-of-the-art methods on both tasks, object classification and image reconstruction, achieving 15–20% higher classification accuracy across all granularity levels and a 7–12% improvement in all feature-specific two-way identification metrics for image reconstruction.
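
The Temporal Masked Autoencoder described in the abstract follows a general recipe that lends itself to a compact illustration: patchify a multi-channel EEG recording along time, mask a random subset of patches, encode the visible ones, and reconstruct the masked ones. The PyTorch sketch below shows this recipe only; the class name `TemporalMaskedAutoencoder`, all module sizes, and the masking ratio are placeholder assumptions for illustration, not the paper’s implementation.

```python
import torch
import torch.nn as nn

class TemporalMaskedAutoencoder(nn.Module):
    """Illustrative temporal MAE for multi-channel EEG (not the paper's model).

    Splits each recording (channels x time) into fixed-length temporal
    patches, masks a random subset, encodes the visible patches with a
    Transformer, and reconstructs the masked ones.
    """

    def __init__(self, n_channels=128, patch_len=16, d_model=256,
                 n_heads=8, n_layers=4, mask_ratio=0.5):
        super().__init__()
        self.patch_len = patch_len
        self.mask_ratio = mask_ratio
        # Each patch is the slice of all channels over patch_len samples.
        self.patch_embed = nn.Linear(n_channels * patch_len, d_model)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, d_model))
        self.pos_embed = nn.Parameter(torch.zeros(1, 512, d_model))  # up to 512 patches
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True), n_layers)
        self.decoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True), 2)
        self.head = nn.Linear(d_model, n_channels * patch_len)

    def patchify(self, x):
        # x: (batch, channels, time) -> (batch, n_patches, channels * patch_len)
        b, c, t = x.shape
        n = t // self.patch_len
        x = x[:, :, :n * self.patch_len]
        x = x.reshape(b, c, n, self.patch_len).permute(0, 2, 1, 3)
        return x.reshape(b, n, c * self.patch_len)

    def forward(self, x):
        patches = self.patchify(x)
        b, n, _ = patches.shape
        tokens = self.patch_embed(patches) + self.pos_embed[:, :n]
        # Randomly keep a subset of patches; the rest are masked out.
        n_keep = max(1, int(n * (1 - self.mask_ratio)))
        perm = torch.rand(b, n, device=x.device).argsort(dim=1)
        keep = perm[:, :n_keep]
        visible = torch.gather(
            tokens, 1, keep.unsqueeze(-1).expand(-1, -1, tokens.size(-1)))
        latent = self.encoder(visible)
        # Scatter encoded tokens back; fill masked slots with the mask token.
        full = self.mask_token.expand(b, n, -1).clone()
        full = full.scatter(
            1, keep.unsqueeze(-1).expand(-1, -1, latent.size(-1)), latent)
        recon = self.head(self.decoder(full + self.pos_embed[:, :n]))
        # Reconstruction loss is computed on masked patches only.
        mask = torch.ones(b, n, device=x.device).scatter(1, keep, 0.0)
        loss = (((recon - patches) ** 2).mean(-1) * mask).sum() / mask.sum().clamp(min=1)
        return loss, latent

# Example: 4 recordings, 128 channels, 512 time samples.
model = TemporalMaskedAutoencoder()
loss, latent = model(torch.randn(4, 128, 512))
```

The encoder’s `latent` output is the kind of EEG representation that could condition a downstream generator; how BrainDiffNet injects it into Stable Diffusion is not specified by the abstract and is not shown here.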
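
The two-way identification metric mentioned for image reconstruction also has a standard form: a reconstruction counts as correctly identified when its features are closer to those of its own ground-truth image than to a distractor’s. The sketch below computes the exhaustive-pairing variant of this protocol, assuming both feature matrices come from the same fixed pretrained backbone; the function name and feature source are illustrative, not the paper’s exact setup.

```python
import torch

def two_way_identification(recon_feats, gt_feats):
    """Mean fraction of distractors each reconstruction beats.

    recon_feats, gt_feats: (N, D) feature matrices extracted from
    reconstructions and ground-truth images by the same backbone.
    Returns a score in [0, 1]; chance level is 0.5.
    """
    r = torch.nn.functional.normalize(recon_feats, dim=1)
    g = torch.nn.functional.normalize(gt_feats, dim=1)
    sim = r @ g.T                       # (N, N) cosine similarities
    correct = sim.diag().unsqueeze(1)   # similarity to own ground truth
    n = sim.size(0)
    # Diagonal entries compare equal, so they never count as wins.
    wins = (correct > sim).float().sum(1) / (n - 1)
    return wins.mean().item()

# Example: 100 reconstructions with hypothetical 512-dim features.
score = two_way_identification(torch.randn(100, 512), torch.randn(100, 512))
```

“Feature-specific” variants simply repeat this computation with features from different layers or backbones (e.g., a CNN classifier or CLIP), reporting one score per feature space.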
Track: 13. General sensing and systems
NominateReviewer: Sreyasee Das Bhattacharjee, email: sreyasee@buffalo.edu
Submission Number: 94