Keywords: Brain Computer Interface, EEG, VQ-VAE, Visual Decoding, Generative AI
Abstract: Reconstructing visual stimuli from non-invasive Electroencephalography (EEG) is an interesting but challenging task in brain decoding that involves translating noisy neural signals into images via fine-grained generative control. In this work, we introduce a novel and efficient framework that guides a visual token generator by conditioning the generation process on a high-level semantic understanding of the EEG signal. Our method leverages a pre-trained LaBraM-based architecture to derive a robust class prediction from the neural data.
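A minimal sketch of the first stage described above, assuming a PyTorch setup: an EEG window is encoded and distilled into class logits. The encoder here is only a simple stand-in for a pre-trained LaBraM-based backbone, and names such as `EEGClassifier` and the tensor shapes are illustrative assumptions rather than the authors' implementation.

```python
import torch
import torch.nn as nn

class EEGClassifier(nn.Module):
    def __init__(self, n_channels=64, n_samples=512, embed_dim=256, n_classes=40):
        super().__init__()
        # Stand-in temporal encoder; in practice this would be a
        # pre-trained LaBraM-based backbone producing a signal embedding.
        self.encoder = nn.Sequential(
            nn.Conv1d(n_channels, embed_dim, kernel_size=25, stride=4, padding=12),
            nn.GELU(),
            nn.AdaptiveAvgPool1d(1),   # pool over time -> (B, embed_dim, 1)
            nn.Flatten(),              # -> (B, embed_dim)
        )
        # Linear head that distills the embedding into class logits.
        self.head = nn.Linear(embed_dim, n_classes)

    def forward(self, eeg):            # eeg: (B, n_channels, n_samples)
        z = self.encoder(eeg)
        return self.head(z)            # (B, n_classes)

# The predicted class later conditions the token generator.
eeg = torch.randn(2, 64, 512)          # dummy batch of EEG windows
logits = EEGClassifier()(eeg)
class_id = logits.argmax(dim=-1)       # (B,) hard class labels
```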
In contrast to recent works based on diffusion models, which require substantial computational resources and long inference times, our approach uses a lightweight token generator built upon the bidirectional, parallel decoding capabilities of MaskGIT. This choice avoids the heavy computation typical of large-scale diffusion processes, making our approach not only easier to train but also more viable for BCI applications where real-time feedback is crucial.
The core of our method is a straightforward yet powerful two-stage process. First, the EEG classifier distills the complex input signal into a class label. In the second stage, this label serves as a direct condition for the pre-trained token generator. The generator, guided by this class information, then produces a sequence of discrete latent codes that are semantically consistent with the original stimulus. This neurally-guided token sequence is finally rendered into a high-fidelity image by a pre-trained decoder, completing an efficient pathway from brain activity to visual representation.
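The second stage can be sketched as class-conditional, MaskGIT-style iterative parallel decoding over discrete latent codes. The sketch below is a simplified illustration under assumed hyperparameters; `TokenGenerator`, `generate_tokens`, and the `vq_decoder` referenced in the final comment are hypothetical placeholders, not the paper's released code.

```python
import math
import torch
import torch.nn as nn

class TokenGenerator(nn.Module):
    """Predicts codebook logits for every (possibly masked) token,
    conditioned on the EEG-derived class label."""
    def __init__(self, vocab_size=1024, seq_len=256, n_classes=40, dim=256):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size + 1, dim)   # +1 for [MASK]
        self.cls_emb = nn.Embedding(n_classes, dim)
        self.pos_emb = nn.Parameter(torch.zeros(1, seq_len, dim))
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)
        self.to_logits = nn.Linear(dim, vocab_size)
        self.mask_id = vocab_size

    def forward(self, tokens, class_id):
        x = self.tok_emb(tokens) + self.pos_emb + self.cls_emb(class_id)[:, None]
        return self.to_logits(self.backbone(x))            # (B, L, vocab)

@torch.no_grad()
def generate_tokens(model, class_id, seq_len=256, steps=8):
    """MaskGIT-style parallel decoding: start fully masked, keep the most
    confident predictions each step, re-mask the rest on a cosine schedule."""
    B = class_id.shape[0]
    tokens = torch.full((B, seq_len), model.mask_id, dtype=torch.long)
    for step in range(steps):
        logits = model(tokens, class_id)
        conf, pred = logits.softmax(-1).max(-1)
        still_masked = tokens == model.mask_id
        # Positions filled in earlier steps are never re-masked.
        conf = conf.masked_fill(~still_masked, float("inf"))
        tokens = torch.where(still_masked, pred, tokens)
        # Cosine schedule: how many tokens stay masked for the next step.
        n_mask = int(seq_len * math.cos(math.pi / 2 * (step + 1) / steps))
        if n_mask > 0:
            lowest = conf.topk(n_mask, dim=-1, largest=False).indices
            tokens.scatter_(1, lowest, model.mask_id)
    return tokens                                          # (B, seq_len) code indices

# Condition on the EEG-predicted classes, then map codes to pixels with a
# pre-trained VQ decoder (assumed to exist; not shown here).
codes = generate_tokens(TokenGenerator(), class_id=torch.tensor([3, 17]))
# image = vq_decoder(codes)   # final (B, 3, H, W) reconstruction
```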
Serve As Reviewer: ~Arnav_Bhavsar1
Submission Number: 49