Multilingual Speech Recognition Initiative for African Languages

Kamel Gaanoun, Abdou Mohamed Naira, Anass Allak, Imade Benelallam

Published: 20 Mar 2023, Last Modified: 26 May 2024, OpenReview Archive Direct Upload, Everyone, CC BY 4.0

Abstract: This paper summarizes a speech recognition initiative for African languages. More precisely, we propose innovative approaches that address the low-resource nature of these languages. For both monolingual and multilingual systems, our methods rely on multilingual self-supervised pre-trained models. We tested our methods on seven African languages and dialects: Amharic, Darija, Fongbe, Sudanese, Swahili, Wolof, and Yoruba. We first trained monolingual models as baselines, and then proposed proofs of concept for systems that handle multiple languages. Our multilingual systems were based on three scenarios: (a) we trained a single model on the concatenated multilingual corpora; (b) we compared this first model against another joint model that predicts the spoken language, using language-specific tokens, before the text transcription; and (c) we fed a one-hot encoded vector to the extracted latent features, both before training the single model and at inference. This last scenario requires a language identification model. We also investigated the impact of lexical ambiguity by removing diacritics from the text in some languages.
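As a rough illustration of scenarios (b) and (c) and of the diacritic removal step, the data-side operations can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the token format, how the one-hot vector is combined with the latent features, or how diacritics were removed. The `<language>` token format, the concatenation of the vector to each feature frame, the Unicode-based stripping, and all function names are hypothetical choices made for this example.

```python
import unicodedata

# The seven languages and dialects named in the abstract; the ordering is arbitrary.
LANGUAGES = ["amharic", "darija", "fongbe", "sudanese", "swahili", "wolof", "yoruba"]


def add_language_token(language, transcript):
    """Scenario (b): prefix the target text with a language-specific token.

    The "<language>" format is an assumption made for this sketch.
    """
    if language not in LANGUAGES:
        raise ValueError(f"unknown language: {language}")
    return f"<{language}> {transcript}"


def one_hot(language):
    """Return a list with 1.0 at the language's index and 0.0 elsewhere."""
    vector = [0.0] * len(LANGUAGES)
    vector[LANGUAGES.index(language)] = 1.0
    return vector


def append_language_vector(frames, language):
    """Scenario (c): attach the one-hot vector to every latent feature frame.

    Concatenation is an assumption; the abstract only says the vector is
    "fed" to the latent features.
    """
    lang_vec = one_hot(language)
    return [list(frame) + lang_vec for frame in frames]


def strip_diacritics(text):
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Toy usage with two 3-dimensional feature frames (placeholder values).
frames = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
augmented = append_language_vector(frames, "wolof")
target = add_language_token("swahili", "some transcript")
```

In scenario (c), the language passed to `append_language_vector` at inference would have to come from a separate language identification model, which is why the abstract notes that one is required.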