NLIP_Lab-IITH Low-Resource MT System for WMT24 Indic MT Shared Task

Published: 01 Jan 2024 · Last Modified: 09 Dec 2024 · WMT 2024 · CC BY-SA 4.0
Abstract: In this paper, we describe our system for the WMT24 shared task on Low-Resource Indic Language Translation. We participate in the eng↔{as, kha, lus, mni} language pairs. We explore fine-tuning a model pre-trained on 22 scheduled Indian languages, whose pre-training objective draws embeddings closer through alignment augmentation (Lin et al., 2020). Our primary system is based on language-specific fine-tuning of this pre-trained model. We achieve chrF2 scores of 50.6, 42.3, 54.9, and 66.3 on the official public test set for eng→as, eng→kha, eng→lus, and eng→mni, respectively. We also explore multilingual training with and without language grouping and layer freezing.
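The abstract does not specify the implementation details, but language-specific fine-tuning with layer freezing typically looks like the following minimal sketch. The checkpoint name is a hypothetical stand-in for the actual pre-trained model, and the choice of freezing the lower half of the encoder is an illustrative assumption, not the authors' confirmed configuration.

```python
# A minimal sketch (not the authors' exact setup) of fine-tuning a
# pre-trained seq2seq model with layer freezing via HuggingFace
# Transformers. CHECKPOINT is a placeholder; swap in the real model.
from transformers import AutoModelForSeq2SeqLM

CHECKPOINT = "facebook/mbart-large-50"  # hypothetical stand-in checkpoint

model = AutoModelForSeq2SeqLM.from_pretrained(CHECKPOINT)

# Freeze the lower half of the encoder layers so the multilingual
# representations learned during pre-training are preserved while the
# rest of the model adapts to one language pair (e.g. eng->as).
encoder_layers = model.get_encoder().layers
for layer in encoder_layers[: len(encoder_layers) // 2]:
    for param in layer.parameters():
        param.requires_grad = False

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable params: {trainable}/{total}")
# Fine-tuning then proceeds per language pair with a standard
# Seq2SeqTrainer / optimizer loop over the pair-specific parallel data.
```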
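The reported chrF2 scores follow the standard metric as implemented in sacrebleu, whose CHRF class defaults to beta=2 (i.e. chrF2). The hypothesis and reference strings below are illustrative only; the paper's numbers come from the official public test sets.

```python
# Scoring a system output with sacrebleu's chrF2 metric.
from sacrebleu.metrics import CHRF

chrf2 = CHRF()  # defaults: char_order=6, word_order=0, beta=2 -> chrF2

hypotheses = ["system translation for one test sentence"]
references = [["reference translation for one test sentence"]]

score = chrf2.corpus_score(hypotheses, references)
print(score)  # prints a line like "chrF2 = 50.6"
```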