Keywords: speech decoding, sEEG, epidural ECoG, self-supervision, neuroscience
TL;DR: We propose BrainStratify, a Coarse-to-Fine framework that disentangles intracranial neural signals to identify fine-grained, task-relevant neural states for speech decoding.
Abstract: Decoding speech directly from neural activity is a central goal in brain-computer interface (BCI) research. In recent years, exciting advances have been made through the growing use of intracranial field potential recordings, such as stereo-ElectroEncephaloGraphy (sEEG) and ElectroCorticoGraphy (ECoG). These neural signals capture rich population-level activity but present two key challenges: (i) task-relevant neural signals are sparsely distributed across sEEG electrodes, and (ii) multiple neural components (e.g., tongue, jaw, and lip control in the ventral sensorimotor cortex, vSMC) are often entangled within the task-relevant functional groups in both sEEG and ECoG recordings. To address these challenges, we introduce BrainStratify, a unified speech decoding framework enhanced by Coarse-to-Fine disentanglement, which (i) identifies functional groups through spatial-context-guided temporal-spatial modeling, and (ii) disentangles neural components within the target functional group using Decoupled Product Quantization (DPQ). We evaluate BrainStratify on six datasets covering sEEG and (epidural) ECoG recordings, spanning tasks such as vocal production and speech perception. Extensive experiments show that BrainStratify, as a unified framework for decoding speech from intracranial neural signals, significantly outperforms previous decoding methods. Overall, by combining data-driven stratification with neuroscience-inspired modularity, BrainStratify offers a robust and interpretable solution for decoding speech from intracranial recordings. Code and datasets will be made publicly available.
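To make the DPQ idea concrete, below is a minimal sketch of a generic product quantizer in PyTorch: the latent vector is split into subspaces, each quantized against its own codebook, with a straight-through estimator for gradients. This illustrates only the standard product-quantization structure; the specific "decoupling" mechanism of the paper's DPQ is not described in the abstract, and all module names, dimensions, and hyperparameters here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class ProductQuantizer(nn.Module):
    """Generic product quantizer: the latent is split into G sub-vectors,
    each matched to its nearest code in an independent codebook.
    Illustrative sketch only; the paper's Decoupled Product Quantization
    (DPQ) presumably adds further structure on top of this."""

    def __init__(self, dim: int, num_groups: int, codebook_size: int):
        super().__init__()
        assert dim % num_groups == 0, "latent dim must split evenly into groups"
        self.num_groups = num_groups
        self.sub_dim = dim // num_groups
        # One independent codebook per subspace (the "product" structure).
        self.codebooks = nn.Parameter(
            torch.randn(num_groups, codebook_size, self.sub_dim)
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, dim) -> (batch, num_groups, sub_dim)
        b = z.shape[0]
        z = z.view(b, self.num_groups, self.sub_dim)
        # Distances from each sub-vector to every code in its group: (G, B, K).
        dists = torch.cdist(z.transpose(0, 1), self.codebooks)
        idx = dists.argmin(dim=-1)  # nearest-code indices, shape (G, B)
        # Gather the selected codes and restore the (batch, dim) layout.
        q = torch.stack(
            [self.codebooks[g][idx[g]] for g in range(self.num_groups)], dim=1
        ).view(b, -1)
        z_flat = z.reshape(b, -1)
        # Straight-through estimator: quantized values forward, identity backward.
        return z_flat + (q - z_flat).detach()


# Usage example with made-up sizes: a 256-d latent split into 4 subspaces.
quantizer = ProductQuantizer(dim=256, num_groups=4, codebook_size=512)
z_q = quantizer(torch.randn(8, 256))  # (8, 256) quantized latents
```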
Primary Area: applications to neuroscience & cognitive science
Submission Number: 560