Virtual Reality-Based Stroop Test for Mild Cognitive Impairment Detection via KWS-TA-CNN-PE Network Using Eye-Tracking Signals

Menglan Ruan, Bin Liu, Wenyuan Li, Lei Jin, Leqi Yang, Régine Le Bouquin-Jeannès, Jie Li, Yudong Zhang, Chunfeng Yang, Wentao Xiang

Published: 2025, Last Modified: 26 Mar 2026. Venue: IEEE Internet Things J. 2025. License: CC BY-SA 4.0
Abstract: Early detection of mild cognitive impairment (MCI) is critical, as timely interventions during this transitional phase can slow or even prevent progression to Alzheimer’s disease (AD). Eye-tracking (ET) signals recorded during virtual reality (VR)-based cognitive tasks show strong potential for MCI detection, as they combine the immersive multisensory environment of VR with the rich temporal and behavioral information of ET signals. In this study, we built a VR-based system incorporating four Stroop-inspired tasks, collected a corresponding ET signal dataset, and introduced a lightweight network for MCI detection. Using this system, we constructed a 38-subject dataset comprising 17 individuals with MCI and 21 healthy control (HC) individuals. The inherently nonstationary and redundant nature of ET signals, however, makes accurate classification difficult. To address these issues, the proposed network, Kymatio-based wavelet scattering (KWS)-temporal attention (TA)-convolutional neural network (CNN)-probabilistic ensemble (PE), includes four key components: 1) a KWS transform that computes the wavelet scattering coefficients, generating time-robust features and minimizing memory requirements via a depth-first traversal strategy; 2) TA, which dynamically prioritizes the most informative time steps and suppresses noise in ET signals; 3) a 1-D CNN that extracts localized temporal patterns; and 4) a PE strategy that combines the outputs across the four tasks for final classification. Evaluated with leave-one-subject-out (LOSO) cross validation and a blind-test (BT) protocol, the KWS-TA-CNN-PE network achieves accuracies of 0.8629 and 0.8621, respectively. These results highlight the clinical potential of the proposed system, dataset, and network.
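The abstract does not specify how the PE strategy combines the four task outputs. As a minimal sketch, assuming the simplest form of probabilistic ensembling (soft voting: averaging the per-task MCI probabilities and thresholding the mean), the final step could look like the following. The function name `probabilistic_ensemble`, the 0.5 threshold, and the example probabilities are all hypothetical and are not taken from the paper.

```python
from statistics import fmean


def probabilistic_ensemble(task_probs, threshold=0.5):
    """Soft-voting sketch: average per-task P(MCI) and threshold the mean.

    ASSUMPTION: plain averaging is used here for illustration only; the
    paper's PE strategy may weight or combine the tasks differently.
    """
    if not task_probs:
        raise ValueError("need at least one per-task probability")
    for p in task_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    mean_prob = fmean(task_probs)
    label = "MCI" if mean_prob >= threshold else "HC"
    return label, mean_prob


# Hypothetical P(MCI) outputs of the per-task classifiers for one subject,
# one value per Stroop-inspired task (made-up numbers, not study data).
example_probs = [0.75, 0.5, 0.25, 1.0]
label, score = probabilistic_ensemble(example_probs)
print(label, score)
```

Averaging probabilities rather than hard labels lets a confident prediction on one task outweigh borderline predictions on the others, which is one common motivation for combining task-level outputs probabilistically.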