Evaluating Speech To Text For Children With Speech Impairment
Track: Tiny paper
Keywords: Speech, Speech defects
Abstract: Speech-to-text (STT) technology has advanced significantly, yet it remains inadequate for children with speech impairments, including dysarthria, dysphonia, stuttering, speech affected by neurological disorders such as cerebral palsy, and articulation disorders induced by hearing loss. This study evaluates three widely used STT models (Whisper, wav2vec 2.0, and LibriSpeech-based systems) on speech from children with these conditions, using standard speech recognition metrics: Word Error Rate (WER), Match Error Rate (MER), Word Information Lost (WIL), and Word Information Preserved (WIP). Results indicate that all models exhibit substantial transcription errors when processing disordered pediatric speech. Whisper showed moderate success but frequently misclassified disordered speech as noise. wav2vec 2.0 demonstrated improved adaptability but struggled with irregular rhythm and prosody. LibriSpeech-based models, trained on fluent adult speech, performed the worst, rendering pediatric speech nearly unintelligible. These findings underscore the systemic exclusion of impaired speech from mainstream STT development. Future models must incorporate diverse pediatric datasets, improve adaptability to disordered speech, and prioritize accessibility in their design.
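The four metrics named in the abstract can all be derived from one word-level alignment between a reference transcript and a model's output. The sketch below is illustrative only: the abstract does not say how the scores were computed, so the function names (`align_counts`, `stt_metrics`), the lowercase and whitespace-split normalization, and the example sentences are assumptions. It uses the commonly cited definitions, with H hits, S substitutions, D deletions, and I insertions: WER = (S+D+I)/(H+S+D), MER = (S+D+I)/(H+S+D+I), WIP = (H/N_ref)(H/N_hyp), and WIL = 1 - WIP.

```python
def align_counts(ref_words, hyp_words):
    """Return (hits, substitutions, deletions, insertions) from a
    minimum-edit-distance alignment of two word lists."""
    n, m = len(ref_words), len(hyp_words)
    # cost[i][j] = minimum edits to turn ref_words[:i] into hyp_words[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            mismatch = ref_words[i - 1] != hyp_words[j - 1]
            cost[i][j] = min(
                cost[i - 1][j - 1] + mismatch,  # hit or substitution
                cost[i - 1][j] + 1,             # deletion
                cost[i][j - 1] + 1,             # insertion
            )

    # Backtrace to count each operation. When several alignments share the
    # minimum cost, ties are broken in the fixed order hit, substitution,
    # deletion, insertion (an arbitrary choice made for this sketch).
    hits = subs = dels = ins = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and ref_words[i - 1] == hyp_words[j - 1]:
            hits += 1
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + 1:
            subs += 1
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return hits, subs, dels, ins


def stt_metrics(reference, hypothesis):
    """Compute WER, MER, WIL, and WIP for one reference/hypothesis pair."""
    # Assumed normalization: lowercase and split on whitespace.
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    hits, subs, dels, ins = align_counts(ref_words, hyp_words)
    errors = subs + dels + ins
    n_ref = hits + subs + dels   # words in the reference
    n_hyp = hits + subs + ins    # words in the hypothesis

    wip = (hits / n_ref) * (hits / n_hyp) if n_hyp else 0.0
    return {
        "wer": errors / n_ref,
        "mer": errors / (hits + errors),
        "wil": 1.0 - wip,
        "wip": wip,
    }


if __name__ == "__main__":
    # Made-up transcripts, not data from the study.
    print(stt_metrics("i want the red ball", "i want a red"))
    print(stt_metrics("go home", "go to my home"))
```

WER is normalized by the reference length only, so it can exceed 1 when a model inserts many spurious words, whereas MER, WIL, and WIP stay within [0, 1]. This is one reason to report them together when transcripts of disordered speech contain many insertions.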
Submission Number: 32