Abstract: Visual decoding seeks to identify or reconstruct the visual stimuli an individual perceives from their neural activity. Although functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG) have achieved remarkable success in visual decoding, their high cost, limited portability, and difficulty with real-time processing motivate alternative approaches. Electroencephalography (EEG) is a promising alternative owing to its cost-effectiveness, high temporal resolution, and suitability for real-time applications. However, conventional EEG-based encoders often rely on simplistic architectures, which limits their ability to fully capture the spatial, spectral, and temporal (SST) features of EEG signals and leads to suboptimal performance. In this study, we propose a novel brain decoding framework built on a state-of-the-art EEG encoder specifically designed to capture the SST characteristics of EEG signals. We evaluated the proposed framework on the THINGS-EEG dataset, where it achieved a mean top-1 accuracy of 19.7% and a top-5 accuracy of 50.7% in zero-shot retrieval tasks, outperforming conventional EEG encoders. These results demonstrate the potential of our method to advance EEG-based visual decoding.