Traditional Mongolian-to-Cyrillic Mongolian Conversion Method Based on the Combination of Rules and Transformer

Abstract: Mongolian words are formed from stems and suffixes, which gives the language a very large vocabulary and produces many out-of-vocabulary (OOV) words. A dictionary alone cannot convert Traditional Mongolian OOV words. This paper therefore proposes a Traditional Mongolian-to-Cyrillic Mongolian conversion method that combines rules with a Transformer. The method uses a rule-based approach to convert in-vocabulary words and a Transformer model to convert OOV words. Experimental results show that the Transformer outperforms the previous best OOV word conversion model, and that the combined rule-and-Transformer method achieves a word error rate (WER) of 13.18%.
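The division of labor described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the name `convert_sentence`, the lookup table standing in for the rule-based component, the callable standing in for the trained Transformer, and the placeholder tokens are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


def convert_sentence(
    words: List[str],
    lexicon: Dict[str, str],
    oov_model: Callable[[str], str],
) -> List[str]:
    """Convert each word, preferring the rule/lexicon path.

    `lexicon` is a hypothetical stand-in for the rule-based component
    (in-vocabulary words); `oov_model` is a hypothetical stand-in for
    a trained Transformer that handles out-of-vocabulary words.
    """
    converted = []
    for word in words:
        if word in lexicon:
            # In-vocabulary word: use the rule-based result.
            converted.append(lexicon[word])
        else:
            # OOV word: fall back to the sequence model.
            converted.append(oov_model(word))
    return converted


# Placeholder tokens only; these are not real Mongolian word pairs.
lexicon = {"tm_in_vocab": "cyr_in_vocab"}


def fake_oov_model(word: str) -> str:
    # Dummy model that just marks the word as model-converted.
    return f"<model:{word}>"


print(convert_sentence(["tm_in_vocab", "tm_unseen"], lexicon, fake_oov_model))
# prints ['cyr_in_vocab', '<model:tm_unseen>']
```

Passing the OOV converter in as a callable keeps the dispatch logic independent of any particular model, so the fallback can be swapped without touching the in-vocabulary path.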