TinnitusLLM: A Multimodal Large Language Model Framework for Tinnitus Diagnosis Through EEG-fMRI Fusion Learning
Abstract: Accurate tinnitus diagnosis is crucial for enabling timely therapeutic intervention and longitudinal treatment monitoring. While non-invasive neuroimaging modalities, particularly electroencephalography (EEG) with millisecond temporal resolution and functional magnetic resonance imaging (fMRI) with millimeter spatial resolution, provide complementary neural features, existing diagnostic approaches remain constrained to unimodal analysis of EEG or fMRI data, which inherently limits diagnostic precision and clinical generalizability. This paper introduces TinnitusLLM, the first multimodal large language model (LLM) framework that synergistically integrates EEG and fMRI features for tinnitus diagnosis. To enable LLM-based interpretation of neural signals, the framework integrates three key components: (1) a neuroinspired positional encoding mechanism that injects neurophysiological priors into the embedding space, enabling neurologically grounded, dynamic positional mapping of EEG and fMRI tokens; (2) multimodal autoregressive pretraining on more than 500 hours of EEG and 250 hours of fMRI data to learn causally informed predictive representations; and (3) fine-tuning with a cross-modal, subject-invariant adversarial learning strategy that enforces subject-independent constraints in the shared cross-modal feature space, thereby substantially improving diagnostic robustness across subjects. We validate TinnitusLLM through comprehensive experiments on a rigorously collected multimodal dataset of 20 participants. Quantitative evaluations demonstrate that TinnitusLLM achieves higher cross-subject diagnostic accuracy than state-of-the-art baseline methods. These results underscore TinnitusLLM's potential as a clinically viable framework for objective tinnitus assessment through multimodal neural decoding.
External IDs: doi:10.1109/jbhi.2026.3670122