Keywords: EEG, Contrastive learning, Dynamic channel clustering, Semantic alignment
TL;DR: EVA is the first unified framework to decode brain signals evoked by images, videos, and 3D objects, achieving state-of-the-art performance through novel frequency-domain processing and adaptive channel clustering.
Abstract: Decoding semantic information from electroencephalography (EEG) signals elicited by diverse visual stimuli remains a critical challenge in brain-computer interfaces and cognitive neuroscience. Existing approaches typically align EEG with single-modality visual stimuli but struggle to generalize across multiple modalities and temporal scales. We propose EVA (EEG-Vision Alignment), the first framework that unifies multi-scale EEG alignment with heterogeneous visual stimuli, including rapid image presentations, continuous video sequences, and 3D object rotations, within a single contrastive learning-based architecture. EVA’s Universal EEG Encoder features two key innovations: (1) a Frequency-Aware Dynamic Encoding (FADE) module that transforms EEG signals into the frequency domain via real-valued fast Fourier transform, enabling compact, adaptive representations through adjustable band-pass filtering; and (2) an Adaptive Channel Clustering (ACC) module that dynamically updates channel groupings using cross-attention and gradient-based optimization, capturing inter-channel synergies while mitigating noise. By optimizing EEG features to achieve both discriminative power for robust classification and semantic fidelity for high-quality reconstruction from brain signals, our framework achieves state-of-the-art performance across diverse tasks, including image retrieval, video classification, and 3D object recognition, on multiple datasets. Notably, our zero-shot reconstruction of 200 object categories from the THINGS-EEG dataset, using only aligned EEG features without textual or low-level cues, surpasses prior state-of-the-art by a significant margin. These results underscore EVA’s capability to extract robust, generalizable representations from EEG signals, demonstrating the superiority of our unified framework. Code will be released upon publication.
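The abstract's FADE module maps EEG into the frequency domain via a real-valued FFT and keeps a compact, band-limited representation. A minimal sketch of that idea is below; the function name `fade_encode`, the sampling rate, and the fixed band limits are illustrative assumptions (the paper describes the filter as adjustable/learned), not the authors' implementation.

```python
import numpy as np

def fade_encode(eeg, fs=250.0, band=(4.0, 40.0)):
    """Frequency-domain encoding sketch (hypothetical): rFFT per channel,
    then a band-pass mask keeps only the selected frequency bins.
    `fs` and the `band` limits are illustrative choices, standing in for
    the adjustable band-pass filtering described in the abstract.

    eeg: array of shape (channels, timesteps)
    returns: complex spectrum restricted to the pass band
    """
    spec = np.fft.rfft(eeg, axis=-1)                    # (C, T//2 + 1) complex bins
    freqs = np.fft.rfftfreq(eeg.shape[-1], d=1.0 / fs)  # bin frequencies in Hz
    mask = (freqs >= band[0]) & (freqs <= band[1])      # fixed stand-in for a learned filter
    return spec[:, mask]

# Toy usage: 4 channels, 1 second at 250 Hz -> 37 retained bins (4-40 Hz)
x = np.random.randn(4, 250)
z = fade_encode(x)
print(z.shape)
```

Restricting to a pass band shrinks the representation from 126 spectral bins per channel to 37 here, which is the "compact, adaptive representation" the abstract alludes to.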
Supplementary Material: zip
Primary Area: applications to neuroscience & cognitive science
Submission Number: 12400