Abstract: Task-based functional magnetic resonance imaging (fMRI) plays a pivotal role in understanding the functional organization of the human brain. Although various deep learning models, such as DBN, DRAE, and DRNN, have been introduced for modeling task fMRI data, they struggle to capture spatial and temporal features from high-dimensional fMRI data concurrently. Inspired by the success of Transformers in natural language processing and computer vision, we explore their potential for task fMRI data analysis. To this end, we propose a novel framework, the spatiotemporal disentangled Fusion-Transformer (ST-FiT), which employs a dual-branch Transformer structure to extract spatial and temporal features from task fMRI data simultaneously. Validated on seven diverse task fMRI datasets, ST-FiT demonstrates its ability to discern spatial network patterns and reconstruct meaningful temporal features. Furthermore, comprehensive comparisons with state-of-the-art deep learning models highlight the superior performance of our model in task fMRI analysis.