Challenges in Urdu Machine Translation

Anonymous

16 Oct 2023 | ACL ARR 2023 October Blind Submission | Readers: Everyone

Abstract: Machine translation systems have advanced significantly across many tasks, which raises the question of how well they perform on low-resource languages, particularly Indo-Aryan languages such as Urdu. This study examines the challenges that machine translation systems face when translating Urdu, a low-resource Indo-Aryan language. We evaluate three models: GPT-3.5, a large language model; opus-mt-en-ur, a publicly available bilingual translation model; and IndicTrans2, a specialized translation model for Indian languages, particularly low-resource ones. Our results show that IndicTrans2 outperforms the other two models, indicating its potential for low-resource language translation. We also identify specific challenges that the models encounter in Urdu translation, offering insights for future improvements in machine translation for low-resource Indo-Aryan languages.

Paper Type: short

Research Area: Machine Translation

Contribution Types: Approaches to low-resource settings

Languages Studied: English, Urdu

Consent To Share Submission Details: On behalf of all authors, we agree to the terms above to share our submission details.
