DTCA: Dual-Branch Transformer with Cross-Attention for EEG and Eye Movement Data Fusion

Published: 01 Jan 2024, Last Modified: 16 May 2025 · MICCAI (2) 2024 · CC BY-SA 4.0
Abstract: Integrating Electroencephalography (EEG) and eye movements (EM) provides a comprehensive understanding of brain dynamics. However, effectively capturing the essential information in EEG and EM remains challenging. Previous studies have investigated aligning and identifying correlations between the two modalities, yet they have not fully exploited the deep dynamic relationships and complementary features inherent in EEG and EM data. To address this issue, we propose the Dual-Branch Transformer with Cross-Attention (DTCA) framework. It encodes EEG and EM data into a latent space, where a multimodal fusion module learns the complementary information and dynamic relationships between the two modalities. Using cross-attention followed by pooling, DTCA captures complementary features and aggregates them into a fused representation. Extensive experiments on multiple open datasets show that DTCA outperforms previous state-of-the-art methods, achieving 99.15% accuracy on SEED, 99.65% on SEED-IV, and 86.05% on SEED-V. We also visualize confusion matrices and learned features to illustrate how DTCA works. Our findings demonstrate that (1) EEG and EM effectively distinguish changes in brain states during tasks such as video watching; (2) encoding EEG and EM into a shared latent space for fusion facilitates learning complementary information and the dynamic correlations associated with brain states; and (3) DTCA efficiently fuses EEG and EM data, leveraging their synergistic effects to understand the brain's dynamic processes and classify brain states.
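The abstract describes the pipeline at a high level: dual-branch encoders, a cross-attention fusion module, and pooling-based aggregation feeding a classifier. The sketch below is a minimal PyTorch rendering of that idea under stated assumptions, not the paper's implementation: the layer sizes, depths, feature dimensions (310-dim EEG and 33-dim EM inputs, loosely following SEED-style differential-entropy and eye-movement features), mean pooling, and linear classifier head are all illustrative choices.

```python
# Minimal sketch of a DTCA-style dual-branch model with cross-attention fusion.
# All dimensions and layer choices are assumptions for illustration.
import torch
import torch.nn as nn


class CrossAttentionFusion(nn.Module):
    """Bidirectional cross-attention between two token streams, followed by
    mean pooling (assumed pooling scheme) to aggregate the fused features."""

    def __init__(self, dim: int = 128, num_heads: int = 4):
        super().__init__()
        # EEG tokens query the EM tokens, and vice versa.
        self.eeg_to_em = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.em_to_eeg = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_eeg = nn.LayerNorm(dim)
        self.norm_em = nn.LayerNorm(dim)

    def forward(self, eeg: torch.Tensor, em: torch.Tensor) -> torch.Tensor:
        # eeg: (B, T_eeg, dim); em: (B, T_em, dim) latent tokens.
        eeg_attn, _ = self.eeg_to_em(query=eeg, key=em, value=em)
        em_attn, _ = self.em_to_eeg(query=em, key=eeg, value=eeg)
        eeg = self.norm_eeg(eeg + eeg_attn)  # residual + layer norm
        em = self.norm_em(em + em_attn)
        # Pool over the token dimension and concatenate both branches.
        return torch.cat([eeg.mean(dim=1), em.mean(dim=1)], dim=-1)


class DTCASketch(nn.Module):
    """Dual-branch Transformer encoders + cross-attention fusion + classifier."""

    def __init__(self, eeg_feat: int = 310, em_feat: int = 33,
                 dim: int = 128, num_classes: int = 3):
        super().__init__()
        # Each branch embeds its modality into the shared latent space
        # and runs a small Transformer encoder over it.
        self.eeg_embed = nn.Linear(eeg_feat, dim)
        self.em_embed = nn.Linear(em_feat, dim)
        self.eeg_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
            num_layers=2)
        self.em_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
            num_layers=2)
        self.fusion = CrossAttentionFusion(dim)
        self.classifier = nn.Linear(2 * dim, num_classes)

    def forward(self, eeg: torch.Tensor, em: torch.Tensor) -> torch.Tensor:
        # eeg: (B, T, eeg_feat); em: (B, T, em_feat) feature sequences.
        eeg_tokens = self.eeg_encoder(self.eeg_embed(eeg))
        em_tokens = self.em_encoder(self.em_embed(em))
        return self.classifier(self.fusion(eeg_tokens, em_tokens))


# Example: a batch of 8 trials, 10 time steps each, 3 emotion classes.
model = DTCASketch()
logits = model(torch.randn(8, 10, 310), torch.randn(8, 10, 33))  # (8, 3)
```

The bidirectional cross-attention lets each modality query the other, which is one common way to realize the complementary, mutually reinforcing features the abstract attributes to the fusion module.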