Improving ASR with Synthetic Intra-Sentential Code-Switched Speech Generated Using Linguistically-Constrained LLMs and Multilingual TTS
Keywords: Automatic Speech Recognition, Code-Switching, Low-Resource Languages, Hausa-English, Hausa-Yoruba, Synthetic Speech
TL;DR: We generate linguistically valid code-switched speech using LLMs and multilingual TTS, improving ASR performance for low-resource languages.
Abstract: Code-switching poses significant challenges for automatic speech recognition (ASR), particularly for low-resource language pairs where annotated bilingual speech data is scarce. In this work, we propose a framework for generating synthetic intra-sentential code-switched speech using large language models (LLMs) and multilingual text-to-speech (TTS) synthesis. Code-switched text is generated from parallel corpora using a linguistically guided approach that combines Matrix Language Frame theory with a phrase-level extension of Equivalence Constraint theory. The generated text is converted into speech using MMS multilingual TTS and normalized with OpenVoice voice cloning to ensure consistent speaker identity. Experiments on Hausa–Yoruba and Hausa–English show improved ASR performance and more natural switching patterns.
Track: Track 2: ML Research by Muslim Authors
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Non Archival Confirmation: I understand that submissions to MusIML are non-archival and can be submitted to other venues.
Submission Number: 50