Learning Robust EEG Representations with a Large Spatiotemporal Transformer as a Foundation Model

26 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: Brain-computer interfaces, EEG representations, foundation model, masked autoencoder, spatiotemporal transformer
TL;DR: A novel approach to training a large EEG-BCI foundation model with self-supervised learning jointly on diverse public datasets.
Abstract: Electroencephalography (EEG)-based brain-computer interfaces (BCIs) serve many control paradigms by relying on a variety of active brain regions and EEG features. Developing a universal EEG foundation model has been challenging due to the large variety of recording setups and experimental tasks. Additionally, researchers often contend with limited labeled data, making it difficult to use large deep-learning models effectively. While there have been successful attempts to develop EEG foundation models, few studies have systematically evaluated their adaptability across diverse BCI control paradigms. To address this gap, we propose a novel yet simple spatiotemporal EEG transformer (ST-EEGFormer) that projects segments (“patches”) of raw EEG data into an embedding space enriched with spatial and temporal embeddings, allowing the model to handle EEG data with varying channel setups and recording lengths. To improve data efficiency, we first pre-trained the ST-EEGFormer with a masked autoencoder (MAE) task, in a self-supervised manner, on a corpus combining six public motor imagery (MI) datasets, a public P300 dataset, and a public steady-state visual evoked potential (SSVEP) dataset. Next, we fine-tuned the pre-trained model and benchmarked it on diverse downstream classification tasks. To evaluate generalization, we conducted additional experiments on two public datasets not used for pre-training: a seizure classification dataset and an online MI BCI dataset. We compared performance against a simple linear model, EEGNet (a classic CNN-based benchmark), the state-of-the-art supervised EEG Conformer, and two foundation models, BIOT and the Large Brain Model (LaBraM). The pre-trained ST-EEGFormer learned robust EEG representations, achieving higher classification accuracies than the benchmark models across all eight pre-training datasets and generalizing well to new datasets with limited training data. Finally, we present several visualizations of the model, including the features on which its predictions are based.
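
To make the patching scheme concrete, the following PyTorch sketch shows one way raw EEG segments could be projected into a shared embedding space and enriched with spatial (channel) and temporal (patch-position) embeddings, as the abstract describes. This is an illustrative reconstruction, not the authors' code: the module name and all sizes (patch_len, d_model, n_channels, max_patches) are assumptions.

```python
import torch
import torch.nn as nn

class PatchEmbedding(nn.Module):
    """Tokenize raw EEG (batch, channels, time) into patch embeddings.
    A learned spatial embedding (one per electrode) and temporal embedding
    (one per patch index) let one model handle recordings with different
    channel setups and lengths. All sizes below are illustrative."""

    def __init__(self, patch_len=200, d_model=256, n_channels=64, max_patches=32):
        super().__init__()
        self.patch_len = patch_len
        self.proj = nn.Linear(patch_len, d_model)            # raw patch -> token
        self.spatial = nn.Embedding(n_channels, d_model)     # electrode identity
        self.temporal = nn.Embedding(max_patches, d_model)   # patch position

    def forward(self, x, channel_ids):
        # x: (B, C, T) raw EEG; T must be a multiple of patch_len.
        # channel_ids: (C,) integer IDs mapping rows of x to electrodes.
        B, C, T = x.shape
        P = T // self.patch_len
        patches = x.unfold(-1, self.patch_len, self.patch_len)   # (B, C, P, patch_len)
        tokens = self.proj(patches)                              # (B, C, P, d_model)
        tokens = tokens + self.spatial(channel_ids)[None, :, None, :]
        pos = torch.arange(P, device=x.device)
        tokens = tokens + self.temporal(pos)[None, None, :, :]
        return tokens.flatten(1, 2)                              # (B, C*P, d_model)
```

Because the spatial embedding is indexed by electrode identity rather than row order, datasets with different montages can share the same model simply by passing the appropriate channel_ids.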
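
The MAE pre-training described above can likewise be sketched as masking a random subset of patch tokens with a learned mask token and reconstructing the raw signal at the masked positions. Continuing from the PatchEmbedding sketch, the mask ratio, mask-token mechanism, and reconstruction target below are assumptions; the paper defines the actual objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def mae_step(tokens, encoder, mask_token, head, target_patches, mask_ratio=0.5):
    """One masked-autoencoder step (illustrative; 50% mask ratio is assumed).
    tokens:         (B, N, D) patch tokens from PatchEmbedding
    target_patches: (B, N, patch_len) raw patch values to reconstruct
    """
    B, N, D = tokens.shape
    mask = torch.rand(B, N, device=tokens.device) < mask_ratio       # True = hidden
    corrupted = torch.where(mask.unsqueeze(-1), mask_token.expand(B, N, D), tokens)
    decoded = encoder(corrupted)                                     # (B, N, D)
    recon = head(decoded)                                            # (B, N, patch_len)
    return F.mse_loss(recon[mask], target_patches[mask])             # masked positions only

# Minimal usage with assumed sizes: 2 trials, 64 channels, 8 patches of 200 samples.
embed = PatchEmbedding()
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=256, nhead=8, batch_first=True), num_layers=4)
head = nn.Linear(256, 200)
mask_token = nn.Parameter(torch.zeros(1, 1, 256))
x = torch.randn(2, 64, 1600)
tokens = embed(x, torch.arange(64))
targets = x.unfold(-1, 200, 200).flatten(1, 2)                       # (2, 512, 200)
loss = mae_step(tokens, encoder, mask_token, head, targets)
loss.backward()
```

After pre-training, the reconstruction head would be discarded and a classification head fine-tuned on the encoder outputs for each downstream task.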
Primary Area: applications to neuroscience & cognitive science
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7085