Pivot Pre-finetuning for Low Resource MT: A Case Study in Kikamba

01 Mar 2023 (modified: 30 May 2023) · Submitted to Tiny Papers @ ICLR 2023
Keywords: machine translation, low resource, African languages
Abstract: Current approaches to performant machine translation often require large amounts of data (Koehn et al., 2022). However, the majority of the world's 7000+ languages have relatively little digitized, organized text available and are considered low-resource. In practical terms, this often means a substantial drop in translation quality between high- and low-resource language pairs. We explore the intersection of rapid NMT adaptation techniques and pre-trained sequence-to-sequence models to better leverage multilingual models, performing a case study on Kikamba.
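The abstract does not spell out the paper's exact recipe, so the following is only a minimal sketch of what pivot pre-finetuning on a pretrained sequence-to-sequence model could look like. Everything beyond the abstract is an assumption: the mBART-50 backbone, the use of Swahili as the related higher-resource pivot language, the Hugging Face `transformers`/`datasets` tooling, the hyperparameters, and the helper names (`to_dataset`, `finetune`) are all hypothetical illustrations, not the authors' implementation.

```python
from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

# Hypothetical backbone choice; the paper's actual model is not stated here.
MODEL_NAME = "facebook/mbart-large-50"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)


def to_dataset(pairs):
    """Tokenize a list of (source, target) sentence pairs into a Dataset."""
    src, tgt = zip(*pairs)
    enc = tokenizer(list(src), text_target=list(tgt), truncation=True)
    return Dataset.from_dict(dict(enc))


def finetune(model, dataset, output_dir):
    """One fine-tuning stage over a parallel dataset (illustrative settings)."""
    args = Seq2SeqTrainingArguments(
        output_dir=output_dir,
        num_train_epochs=3,
        per_device_train_batch_size=8,
        learning_rate=3e-5,
    )
    trainer = Seq2SeqTrainer(
        model=model,
        args=args,
        train_dataset=dataset,
        data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    )
    trainer.train()
    return trainer.model


# Stage 1: pre-finetune on a higher-resource pivot pair. Swahili as the pivot
# and the toy data below are placeholders, not the paper's corpus.
pivot_pairs = [("Good morning.", "Habari za asubuhi.")]  # English-Swahili
model = finetune(model, to_dataset(pivot_pairs), "ckpt/pivot")

# Stage 2: continue fine-tuning on the (much smaller) low-resource pair.
kikamba_pairs = [("Good morning.", "Wakya.")]  # English-Kikamba (placeholder)
model = finetune(model, to_dataset(kikamba_pairs), "ckpt/kikamba")
```

The intended effect of the first stage, under these assumptions, is to warm-start the model on a linguistically related pair so that the second stage needs fewer Kikamba examples to adapt.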