Published: 03 Mar 2023, Last Modified: 15 Apr 2023 | AfricaNLP 2023 | Readers: Everyone
Keywords: Error Analysis, Machine Translation, Human Evaluation, Tigrinya-English
Abstract: Machine translation (MT) is an important language technology that can democratize access to information. In recent years, there has been some progress in the development and production deployment of MT systems for a handful of African languages. Evaluating the quality of such systems is fundamental to accelerating progress in this area. Tigrinya is spoken by more than 10 million native speakers, mainly in Tigray (Ethiopia) and Eritrea. In this work, we evaluated the current state-of-the-art MT systems that support translation between Tigrinya and English: Google Translate, Microsoft Translator, and Lesan. We systematically collected a dataset for evaluating Tigrinya MT systems across four domains: Arts and Culture, Business and Economics, Politics, and Science and Technology. The dataset contains snippets from 806 articles gathered from diverse sources. We performed an in-depth analysis of the errors current systems make, using the MQM-DQF standard error typology. We found that Mistranslation and Omission are the most frequent translation issues. We believe this work offers a methodology for evaluating other MT systems for low-resource languages, and we provide practical suggestions for improving current Tigrinya-English MT systems.
