Refining Multilingual Pronunciation through G2P and ASR Integration

ACL ARR 2024 June Submission5335 Authors

16 Jun 2024 (modified: 02 Jul 2024), ACL ARR 2024 June Submission, Readers: Everyone, License: CC BY 4.0

Abstract: Pronunciation dictionaries are indispensable for applications in speech synthesis and language learning, providing word pronunciations across diverse languages. Grapheme-to-Phoneme (G2P) models are pivotal in creating these dictionaries. However, pronunciation varies with language, context, dialect, and acoustic conditions, so G2P output alone can be inaccurate. To address this, we introduce an approach that refines G2P model outputs by using an alignment and weighting algorithm to integrate results from an acoustic phone recognizer, evaluated across several high- and low-resource languages.
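The abstract does not specify the alignment and weighting algorithm, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes a unit-cost edit-distance alignment between the G2P phone sequence and the recognizer's phone sequence, and a simple rule that prefers the recognizer's phone when its confidence exceeds a fixed G2P weight. The names `align`, `merge`, and `g2p_weight`, and the per-phone confidence scores, are hypothetical.

```python
def align(ref, hyp):
    """Align two phone sequences with unit-cost edit distance.

    Returns a list of (i, j) index pairs; i or j is None for a gap.
    """
    n, m = len(ref), len(hyp)
    # cost[i][j] = edit distance between ref[:i] and hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(sub, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right corner to recover the alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((i - 1, None))  # phone only in ref
            i -= 1
        else:
            pairs.append((None, j - 1))  # phone only in hyp
            j -= 1
    pairs.reverse()
    return pairs


def merge(g2p, asr, g2p_weight=0.5):
    """Combine G2P phones with (phone, confidence) pairs from a recognizer.

    Assumed rule: at each aligned position, take the recognizer's phone
    only if its confidence exceeds g2p_weight; otherwise keep the G2P phone.
    """
    asr_phones = [p for p, _ in asr]
    merged = []
    for i, j in align(g2p, asr_phones):
        if j is None:
            merged.append(g2p[i])          # recognizer has no counterpart
        elif i is None:
            if asr[j][1] > g2p_weight:     # confident recognizer insertion
                merged.append(asr[j][0])
        elif asr[j][1] > g2p_weight:
            merged.append(asr[j][0])
        else:
            merged.append(g2p[i])
    return merged


refined = merge(["k", "a", "t"], [("k", 0.9), ("æ", 0.8), ("t", 0.95)])
```

In this sketch a low-confidence recognizer phone never overrides the G2P output, which keeps the dictionary entry stable under noisy acoustic conditions; a real system would likely tune the weight per language or per phone.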

Paper Type: Short

Research Area: Phonology, Morphology and Word Segmentation

Research Area Keywords: grapheme-to-phoneme conversion, pronunciation modeling

Contribution Types: NLP engineering experiment

Languages Studied: Occitan, Adyghe, Belarusian and around 120 more languages

Submission Number: 5335
