# HEAR: An EEG Foundation Model with Heterogeneous Electrode Adaptive Representation

**HEAR** is a foundation model framework designed for electroencephalography (EEG) with support for **arbitrary and
unseen electrode layouts**.  
It enables scalable learning across heterogeneous datasets, handling up to **1,132 electrodes** in both pretraining and
finetuning stages.

This initial code release includes preprocessing and finetuning scripts for the **BCIC IV 2b** and **SHUDB** datasets.

---

## 📄 Framework Overview

Electroencephalography (EEG) is a critical modality in neuroscience and brain-computer interface research. However,
heterogeneity in electrode layouts impedes cross-dataset generalization and obstructs the scaling of unified,
large-scale foundation models. We propose HEAR, the first EEG foundation model that explicitly supports arbitrary and
unseen electrode layouts (up to 1,132 electrodes) in both pretraining and finetuning stages.

We first construct a comprehensive heterogeneous EEG dataset spanning 8,782 hours of data, 150+ channel configurations,
and 20 datasets across 12 paradigms. Built on this dataset, HEAR uses a learnable coordinate-based soft spatial
embedding that projects electrodes into a shared canonical space. It combines temporal-slice channel attention with a
spatially-guided bias transformer to capture dynamic spatiotemporal dependencies across heterogeneous layouts and
tasks.
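
The coordinate-based idea can be illustrated with a minimal, framework-free sketch. This is not the code in
`models.py`: the anchor points, the Gaussian-kernel soft assignment, the distance-based bias, the `sigma` and `scale`
parameters, and all function names are assumptions chosen for illustration. The embeddings that HEAR learns are
replaced here by fixed toy vectors.

```python
import math


def soft_spatial_weights(coord, anchors, sigma=0.5):
    """Softmax over negative squared distances from one electrode to each anchor."""
    logits = [
        -sum((c - a) ** 2 for c, a in zip(coord, anchor)) / (2 * sigma ** 2)
        for anchor in anchors
    ]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_spatial_embedding(coord, anchors, anchor_embeddings, sigma=0.5):
    """Weighted sum of anchor embeddings, defined for any 3D coordinate."""
    weights = soft_spatial_weights(coord, anchors, sigma)
    dim = len(anchor_embeddings[0])
    return [
        sum(w * emb[d] for w, emb in zip(weights, anchor_embeddings))
        for d in range(dim)
    ]


def distance_attention_bias(coords, scale=1.0):
    """Additive attention bias: nearby electrode pairs are penalized less."""
    return [[-scale * math.dist(p, q) for q in coords] for p in coords]


# Hypothetical canonical anchors and toy embeddings (learned in a real model).
anchors = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
anchor_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

# A toy layout whose electrodes do not coincide with any anchor.
layout = [(-0.9, 0.1, 0.0), (0.8, 0.0, 0.2)]
embeddings = [soft_spatial_embedding(c, anchors, anchor_embeddings) for c in layout]
bias = distance_attention_bias(layout)
```

Because the embedding depends only on an electrode's coordinates, a layout that was never seen during training still
maps into the same space. In this sketch the bias matrix would be added to the attention logits between channels.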

We evaluate HEAR on seven downstream tasks across nine EEG datasets, benchmarking it against four state-of-the-art EEG
foundation models with public weights. Our results demonstrate consistent and significant improvements in
generalization, adaptability, and overall performance across highly heterogeneous electrode settings and paradigms.
Cross-layout evaluations confirm that HEAR effectively generalizes to previously unseen electrode configurations.

![HEAR Framework Part 1](figures/part1.jpg)


---

## 📁 File Overview

| File/Folder                   | Description                             |
|-------------------------------|-----------------------------------------|
| `combined_montage.json`       | Unified electrode coordinate definition |
| `finetuning.py`               | Model finetuning script                 |
| `datasets/make_bcic_iv_2b.py` | Preprocessing pipeline for BCIC IV 2b   |
| `datasets/make_shudb.py`      | Preprocessing pipeline for SHUDB        |
| `models.py`                   | HEAR model architecture                 |

---

## ⚙️ Requirements

Python 3.8+ is recommended. Install dependencies via:

```bash
pip install timm==0.4.12 \
            Pillow \
            blobfile \
            mypy \
            numpy \
            pytest \
            requests \
            einops \
            deepspeed==0.4.0 \
            scipy \
            pyhealth==1.1.4 \
            h5py \
            mne==1.4.2
```
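
Several of these packages are pinned to exact versions, so it can help to confirm what is actually installed before
running the scripts. The snippet below is a small sketch using only the standard library (`importlib.metadata`,
available from Python 3.8). The `PINNED` dictionary and the `check_pins` helper are illustrative names and are not part
of this repository.

```python
from importlib import metadata

# Pins copied from the pip command above; unpinned packages are omitted.
PINNED = {
    "timm": "0.4.12",
    "deepspeed": "0.4.0",
    "pyhealth": "1.1.4",
    "mne": "1.4.2",
}


def check_pins(pins):
    """Return {package: (expected, found)} for every pin that is not satisfied."""
    problems = {}
    for name, expected in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # the package is not installed
        if found != expected:
            problems[name] = (expected, found)
    return problems


for name, (expected, found) in check_pins(PINNED).items():
    print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

If every pin is satisfied, the loop prints nothing.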

---

## 📚 Dataset Collection

HEAR is pretrained and evaluated on a diverse set of EEG datasets, covering a wide range of paradigms and electrode
layouts:

| Dataset   | Link                                                                                                                                                                             |
|-----------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| BCI-IV-1  | [bbci.de/competition/iv/#datasets](http://bbci.de/competition/iv/#datasets)                                                                                                      |
| BCI-IV-2A | [bbci.de/competition/iv/#datasets](http://bbci.de/competition/iv/#datasets)                                                                                                      |
| BCI-IV-2B | [bbci.de/competition/iv/#datasets](http://bbci.de/competition/iv/#datasets)                                                                                                      |
| EEGMMIDB  | [physionet.org/content/eegmmidb/1.0.0](https://physionet.org/content/eegmmidb/1.0.0/)                                                                                            |
| LargeMI   | [figshare.com/.../3917698](https://figshare.com/collections/A_large_electroencephalographic_motor_imagery_dataset_for_electroencephalographic_brain_computer_interfaces/3917698) |
| SHUDB     | [figshare.com/.../19228725/1](https://figshare.com/articles/software/shu_dataset/19228725/1)                                                                                     |
| HGD       | [github.com/robintibor/high-gamma-dataset](https://github.com/robintibor/high-gamma-dataset)                                                                                     |
