TL;DR: First application of multimodal language pretraining to the domain of medical EEG to improve clinical phenotyping.
Abstract: Multimodal language modeling has enabled breakthroughs in representation learning, yet it remains unexplored for functional brain data in clinical phenotyping. This paper pioneers EEG-language models (ELMs) trained on clinical reports and 15,000 EEGs. We propose to combine multimodal alignment in this novel domain with time-series cropping and text segmentation, enabling an extension based on multiple instance learning that alleviates misalignment caused by irrelevant EEG or text segments. Our multimodal models significantly improve over EEG-only models across four clinical evaluations and, for the first time, enable zero-shot classification as well as retrieval of both neural signals and reports. In sum, these results highlight the potential of ELMs, representing significant progress for clinical applications.
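To make the alignment objective concrete, below is a minimal sketch of multiple-instance contrastive alignment between EEG crops and report segments. This is an illustration under stated assumptions, not the authors' implementation: PyTorch is assumed, `eeg_encoder` and `text_encoder` are hypothetical placeholder modules, and max-pooling over crop-segment similarities stands in for whichever MIL aggregation the paper actually uses (see the linked repository for the real code).

```python
# Hedged sketch of multiple-instance contrastive EEG-language alignment.
# Assumptions (not from the paper): PyTorch, placeholder encoders, and
# max-pooling over crop/segment similarities as the MIL aggregation.
import torch
import torch.nn.functional as F

def mil_infonce(eeg_crops, text_segs, eeg_encoder, text_encoder, temperature=0.07):
    """eeg_crops: (B, C, ...) batch of B recordings with C crops each.
    text_segs:  (B, S, ...) batch of B reports with S segments each."""
    B, C = eeg_crops.shape[:2]
    S = text_segs.shape[1]
    # Encode every crop and segment, then L2-normalize the embeddings.
    z_eeg = F.normalize(eeg_encoder(eeg_crops.flatten(0, 1)), dim=-1).view(B, C, -1)
    z_txt = F.normalize(text_encoder(text_segs.flatten(0, 1)), dim=-1).view(B, S, -1)
    # Pairwise crop-segment similarities for every recording-report pair:
    # sim[i, j, c, s] = <crop c of recording i, segment s of report j>.
    sim = torch.einsum("icd,jsd->ijcs", z_eeg, z_txt)
    # MIL aggregation: a recording matches a report if its best
    # crop-segment pair matches, so pool with max over both instance axes.
    logits = sim.amax(dim=(2, 3)) / temperature
    targets = torch.arange(B, device=logits.device)
    # Symmetric InfoNCE over recordings and reports.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```

Max-pooling is only one common MIL aggregation; softer choices such as log-sum-exp or attention pooling would slot into the same line.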
Lay Summary: AI models can detect abnormal brain activity indicative of disease, yet they require extensive data labeling by experts. We present a novel approach that combines brain models with large language models. This enables brain models to teach themselves using existing natural-language clinical reports that describe the patient and the brain recording. We show that such models are significantly better at disease detection, performing well with minimal or even no expert labels.
Link To Code: https://github.com/SamGijsen/ELM
Primary Area: Applications->Neuroscience, Cognitive Science
Keywords: multimodal, neuroscience, eeg, medical, pretraining, representation learning
Submission Number: 12985