Self-supervised EEG pretraining amplifies individual differences in neural representations

Published: 02 Mar 2026, Last Modified: 14 May 2026
Venue: ICLR 2026 Re-Align Workshop
License: CC BY 4.0
Track: tiny / short paper (up to 5 pages)
Domain: neuroscience
Abstract: Self-supervised learning can be leveraged to build generalizable representations from massive, heterogeneous datasets. For neural data, models trained this way have been shown to generalize across domains and are therefore sometimes called foundation models. Generalization is typically measured by downstream decoding of behavior, such as movements. Here, we replicate these benchmarks for NeuroGPT, a foundation model of electroencephalography (EEG), and provide a complementary evaluation. Using simultaneous recordings of EEG and BOLD signals from 18 human subjects, we test (1) the extent to which the model captures subject-specific idiosyncrasies, and (2) whether the model's EEG embeddings improve the prediction of the simultaneously recorded BOLD signals. We find that NeuroGPT embeddings reliably amplify individual idiosyncrasies and improve BOLD prediction relative to raw EEG and randomly initialized models. These results suggest that self-supervised pretraining enriches, rather than homogenizes, subject-level representational structure that can be leveraged to align different data modalities.
Presenter: ~Joao_Barbosa1
Submission Number: 73