Keywords: Speech neuroprosthesis, surface electromyogram, symmetric positive definite matrices.
TL;DR: We present a non-invasive speech neuroprosthesis for translating silently articulated speech into text. Such a wearable device could restore communication for laryngectomy patients.
Abstract: We present a high-bandwidth, egocentric neuromuscular speech interface that translates silently voiced articulations directly into text. We record surface electromyographic (EMG) signals from multiple articulatory sites on the face and neck as participants *silently* articulate speech, enabling direct EMG-to-text translation. Such an interface has the potential to restore communication for individuals who have lost the ability to produce intelligible speech due to laryngectomy, neuromuscular disease, stroke, or trauma-induced damage (e.g., radiotherapy toxicity) to the speech articulators. Prior work has largely focused on mapping EMG collected during *audible* articulation to time-aligned audio targets or transferring these targets to *silent* EMG recordings, which inherently requires audio and limits applicability to patients who can no longer speak. In contrast, we propose an efficient representation of high-dimensional EMG signals and demonstrate direct sequence-to-sequence EMG-to-text conversion at the phonemic level without relying on time-aligned audio. All data, code, and model checkpoints are open-sourced at https://github.com/HarshavardhanaTG/emg2speech.
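As a minimal illustration of the kind of compact EMG representation the abstract alludes to, the sketch below maps windowed multichannel EMG to a sequence of symmetric positive definite (SPD) matrices via regularized spatial covariances. This is an assumption based on the paper's keywords, not the authors' actual pipeline (see the linked repository for that); the channel count, window length, and hop size are illustrative.

```python
import numpy as np

def spd_features(emg, win=200, hop=100, eps=1e-6):
    """Map multichannel EMG (channels x samples) to a sequence of SPD matrices.

    Each window is summarized by its regularized spatial covariance matrix,
    which is symmetric positive definite by construction. Hypothetical helper,
    not taken from the released codebase.
    """
    c, t = emg.shape
    feats = []
    for start in range(0, t - win + 1, hop):
        x = emg[:, start:start + win]
        x = x - x.mean(axis=1, keepdims=True)   # remove per-channel DC offset
        cov = (x @ x.T) / (win - 1)             # spatial covariance (c x c)
        feats.append(cov + eps * np.eye(c))     # shrinkage keeps it strictly PD
    return np.stack(feats)                      # (num_windows, c, c)

# Example: 16-channel EMG, 2 seconds at 1 kHz (synthetic data for illustration)
emg = np.random.randn(16, 2000)
print(spd_features(emg).shape)                  # -> (19, 16, 16)
```

Such SPD-valued sequences could then feed a sequence-to-sequence model that emits phoneme strings directly, which is the setting the abstract describes.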
Submission Number: 26