Keywords: Visual Decoding, Brain-Computer Interface, EEG, Contrastive Learning
Abstract: Decoding visual representations from brain signals has attracted significant attention in both neuroscience and artificial intelligence. However, the extent to which EEG signals contain genuine visual information remains unclear. Existing approaches typically attempt to align EEG and image representations directly, yet the substantial knowledge gap between the two modalities prevents effective alignment and limits our understanding of the human visual system. In this paper, we propose an EEG–image multi-fusion strategy that leverages multiple pre-trained visual encoders with distinct inductive biases to capture multi-scale, hierarchical visual representations, and employs a contrastive learning objective to align EEG and visual embeddings. Furthermore, we introduce a Fusion Prior, which learns a stable mapping on large-scale visual data and subsequently matches EEG features to this pre-trained prior, thereby enhancing distributional consistency across modalities. Both quantitative and qualitative experiments demonstrate that our method achieves a strong balance between retrieval and reconstruction capabilities.
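To make the two training signals described above concrete, the sketch below shows one plausible instantiation: multi-encoder feature fusion by concatenation, a CLIP-style symmetric InfoNCE loss aligning EEG and fused visual embeddings, and a regression loss matching EEG features to a frozen pre-trained prior. The function names, the concatenation-based fusion, the shared embedding dimension, and the temperature value are all illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def fuse_visual_embeddings(embeddings):
    """Concatenate L2-normalized embeddings from several visual encoders.

    `embeddings` is a list of (batch, dim_i) tensors, e.g. from encoders
    with different inductive biases (the specific encoders are assumptions).
    A projection head (not shown) would map the result to the EEG space.
    """
    return torch.cat([F.normalize(e, dim=-1) for e in embeddings], dim=-1)

def contrastive_alignment_loss(eeg_emb, vis_emb, temperature=0.07):
    """Symmetric InfoNCE loss between EEG and visual embeddings.

    Assumes both inputs are already projected to a shared dimension;
    matched EEG/image pairs sit on the diagonal of the similarity matrix.
    """
    eeg = F.normalize(eeg_emb, dim=-1)
    vis = F.normalize(vis_emb, dim=-1)
    logits = eeg @ vis.t() / temperature              # (batch, batch)
    targets = torch.arange(eeg.size(0), device=eeg.device)
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

def prior_matching_loss(eeg_feat, prior, vis_feat):
    """Match EEG features to the output of a frozen visual prior.

    `prior` is assumed to be a module pre-trained on large-scale visual
    data; the EEG branch regresses onto its (stable) latent targets.
    """
    with torch.no_grad():
        target = prior(vis_feat)
    return F.mse_loss(eeg_feat, target)

# Usage example with random tensors (dimensions are assumptions):
eeg = torch.randn(32, 768)   # projected EEG features
vis = torch.randn(32, 768)   # fused-and-projected visual features
loss = contrastive_alignment_loss(eeg, vis)
```

In this reading, the contrastive term shapes the EEG embedding space for retrieval, while regressing onto a frozen visual prior gives the reconstruction pathway a distributionally consistent target; the actual loss weighting between the two is not specified in the abstract.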
Primary Area: applications to neuroscience & cognitive science
Submission Number: 18298