Keywords: BCI, neuroAI, language decoding, invasive neural recordings
TL;DR: We demonstrate that cross-subject training of speech-decoding models is feasible and that the resulting models generalize to new subjects.
Abstract: Brain-to-text systems have recently achieved impressive performance when trained on single-participant data, but cross-subject generalization remains largely uninvestigated.
We present the first neural-to-phoneme decoder trained jointly on the two largest intracortical speech datasets (Willett et al. 2023; Card et al. 2024), introducing day- and dataset-specific affine transforms to align neural activity into a shared space.
A hierarchical GRU decoder with intermediate CTC supervision and feedback connections further mitigates the conditional-independence assumption of the standard CTC loss. Our model matches or outperforms within-subject baselines while being trained across participants, and adapts to unseen subjects using only a linear transform or brief fine-tuning.
On an independent inner-speech dataset (Kunz et al. 2025), our approach generalizes by training only subject- and day-specific transforms. These results highlight cross-subject pretraining as a practical path toward scalable and clinically deployable speech BCIs.
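The abstract's alignment idea can be made concrete with a short sketch. The following is a minimal illustration, not the authors' code: one learned affine transform (Wx + b) per recording day/dataset maps that session's neural features into a shared latent space consumed by a shared decoder. All names and dimensions (`n_channels`, `latent_dim`, `n_sessions`) are hypothetical placeholders.

```python
import torch
import torch.nn as nn

class SessionAffineAligner(nn.Module):
    """One affine transform (W x + b) per recording session/dataset."""
    def __init__(self, n_channels: int, latent_dim: int, n_sessions: int):
        super().__init__()
        # One Linear layer per session; each learns its own alignment
        # into the shared latent space.
        self.aligners = nn.ModuleList(
            nn.Linear(n_channels, latent_dim) for _ in range(n_sessions)
        )

    def forward(self, x: torch.Tensor, session_id: int) -> torch.Tensor:
        # x: (batch, time, n_channels) neural features from one session.
        return self.aligners[session_id](x)

# Usage: align a batch from session 3, then feed it to the shared decoder.
aligner = SessionAffineAligner(n_channels=256, latent_dim=512, n_sessions=24)
x = torch.randn(8, 100, 256)   # (batch, time, channels)
z = aligner(x, session_id=3)   # (8, 100, 512) in the shared space
```

Under this reading, adapting to an unseen subject or day amounts to freezing the shared decoder and training only a new `Linear` aligner, which matches the abstract's claim that a linear transform suffices for transfer.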
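Likewise, the hierarchical GRU with intermediate CTC supervision and feedback can be sketched as below. This is an assumption-laden illustration in the style of self-conditioned/intermediate CTC: an intermediate layer gets its own CTC head, its posteriors are projected back into the features so later layers are conditioned on earlier predictions, and the intermediate and final CTC losses are averaged. Layer sizes and the phoneme count are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalGRUCTC(nn.Module):
    def __init__(self, latent_dim=512, hidden=256, n_phonemes=41):
        super().__init__()
        self.gru1 = nn.GRU(latent_dim, hidden, batch_first=True, bidirectional=True)
        self.gru2 = nn.GRU(2 * hidden, hidden, batch_first=True, bidirectional=True)
        self.inter_head = nn.Linear(2 * hidden, n_phonemes)  # intermediate CTC head
        self.final_head = nn.Linear(2 * hidden, n_phonemes)  # final CTC head
        self.feedback = nn.Linear(n_phonemes, 2 * hidden)    # condition on predictions

    def forward(self, z):
        h1, _ = self.gru1(z)
        inter_logits = self.inter_head(h1)
        # Feedback: add projected intermediate posteriors to the features, so
        # later layers see earlier predictions; this softens CTC's
        # conditional-independence assumption across time steps.
        h1 = h1 + self.feedback(inter_logits.softmax(dim=-1))
        h2, _ = self.gru2(h1)
        return inter_logits, self.final_head(h2)

def ctc_losses(model, z, targets, input_lens, target_lens):
    inter, final = model(z)
    # F.ctc_loss expects (time, batch, classes) log-probabilities.
    li = F.ctc_loss(inter.transpose(0, 1).log_softmax(-1),
                    targets, input_lens, target_lens)
    lf = F.ctc_loss(final.transpose(0, 1).log_softmax(-1),
                    targets, input_lens, target_lens)
    return 0.5 * (li + lf)  # average intermediate and final CTC losses
```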
Supplementary Material: zip
Primary Area: applications to neuroscience & cognitive science
Submission Number: 9362