Abstract: In this study, we train a bilingual Arabic-Hebrew language model using a transliterated version of Arabic texts, so that both languages are represented in the same script. Given the morphological and structural similarities between Arabic and Hebrew, and their large number of cognates, we evaluate the performance of a language model that uses the same script for both languages on downstream tasks that require cross-lingual knowledge, such as machine translation. We obtain promising results: our model outperforms all other PLMs on machine translation and outperforms other multilingual models on sentiment analysis in both languages.
Paper Type: short
Research Area: Multilinguality and Language Diversity