Refining Multilingual Pronunciation through G2P and ASR Integration

ACL ARR 2024 June Submission5335 Authors

16 Jun 2024 (modified: 02 Jul 2024), ACL ARR 2024 June Submission, CC BY 4.0
Abstract: Pronunciation dictionaries are indispensable for applications in speech synthesis and language learning, providing word pronunciations across diverse languages. Grapheme-to-Phoneme (G2P) models are pivotal in creating these dictionaries. However, variations in pronunciation can arise due to language, context, dialect, and acoustic conditions, potentially introducing inaccuracies. To address this, we introduce an approach to refine G2P model outputs by using an alignment and weighting algorithm to integrate results from an acoustic phone recognizer across several high- and low-resource languages.
Paper Type: Short
Research Area: Phonology, Morphology and Word Segmentation
Research Area Keywords: grapheme-to-phoneme conversion, pronunciation modeling
Contribution Types: NLP engineering experiment
Languages Studied: Occitan, Adyghe, Belarusian and around 120 more languages
Submission Number: 5335