Keywords: Automatic speech recognition, Clinical speech analysis, Language impairment detection
TL;DR: Whisper, an automatic speech recognition model, can be fine-tuned to transcribe single-word post-stroke speech and support assessment of speech and language impairments, although its generalizability across clinical conditions remains limited.
Abstract: Detailed assessment of language impairment following stroke remains a cognitively
complex and clinician-intensive task, limiting timely and scalable diagnosis.
Automatic Speech Recognition (ASR) foundation models offer a promising pathway
to augment human evaluation, but their effectiveness in the context of speech
and language impairment remains uncertain. In this study, we evaluate whether
Whisper, a state-of-the-art ASR foundation model, can be applied to transcribe and
analyze speech from patients with stroke during a picture-naming task. We assess
both verbatim transcription accuracy and the model’s ability to support downstream
prediction of language function, which has major implications for outcomes after
stroke. Our results show that the baseline Whisper model performs poorly on
single-word speech utterances. Nevertheless, fine-tuning Whisper significantly
improves transcription accuracy (reducing Word Error Rate by 87.72% in healthy
speech and 71.22% in speech from patients). Further, learned representations
from the model enable accurate prediction of speech quality (average macro F1 of
0.74 for healthy speakers, 0.75 for patients). However, evaluation on an unseen dataset
(TORGO) reveals limited generalizability, highlighting the inability of Whisper to
perform zero-shot transcription of single-word utterances on out-of-domain clinical
speech and emphasizing the need to adapt models to specific clinical populations.
While challenges remain in cross-domain generalization, these findings highlight
the potential of foundation models, when appropriately fine-tuned, to advance
automated speech assessment and rehabilitation for stroke-related impairments.
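
As an illustration of the transcription metric mentioned above, the following is a minimal sketch of how Word Error Rate (WER) and a percentage reduction in WER could be computed. It assumes the standard word-level edit-distance definition of WER and assumes the reported reductions are relative to the baseline; the abstract does not state either detail. The function names `word_error_rate` and `relative_reduction` are hypothetical and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: simple lowercasing and whitespace tokenization; the
    paper's actual text normalization is not described in the abstract.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, tuned_wer: float) -> float:
    """Percentage reduction in WER relative to the baseline (assumed definition)."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - tuned_wer) / baseline_wer
```

For single-word references such as picture-naming targets, the denominator is 1, so any substitution yields a WER of 1.0 and extra inserted words can push the value above 1.0. This is one reason an ASR model trained mostly on longer utterances may score poorly on this kind of task.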
Submission Number: 94