Improving Translation between Spanish and Mapudungun through Transfer Learning

ACL ARR 2024 June Submission4966 Authors

16 Jun 2024 (modified: 22 Jul 2024) · ACL ARR 2024 June Submission · License: CC BY 4.0
Abstract: Neural Machine Translation (NMT) systems for lower-resource languages like Mapudungun face significant challenges due to limited training data and linguistic complexity. This project aims to improve translation between Spanish and Mapudungun through transfer learning, leveraging models pre-trained on the Spanish-English and Spanish-Finnish language pairs. Our contributions include demonstrating the effectiveness of transfer learning in this context and providing a comparative analysis of different parent models. Our main findings show that transfer learning enhances translation performance, with little difference in performance between the Spanish-English and Spanish-Finnish parent models. This suggests that factors beyond morphological similarity, such as data quality or tokenization methods, play a crucial role in transfer learning success. We hope these insights pave the way for future research into optimizing translation tools for low-resource languages and into involving communities in the development process.
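As a rough illustration of the parent-to-child transfer learning setup described in the abstract, the sketch below continues training a Spanish-English parent model on a Spanish-Mapudungun parallel corpus. It assumes the Hugging Face transformers and datasets libraries; the Helsinki-NLP/opus-mt-es-en checkpoint, the es_arn_train.tsv file, and all hyperparameters are illustrative assumptions, not the paper's actual configuration or released artifacts.

```python
# Minimal parent-to-child transfer learning sketch (assumptions noted above):
# load a Spanish-English parent model and fine-tune it on a hypothetical
# Spanish-Mapudungun parallel corpus, reusing the parent's subword vocabulary.
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

PARENT = "Helsinki-NLP/opus-mt-es-en"  # assumed Spanish-English parent checkpoint

tokenizer = AutoTokenizer.from_pretrained(PARENT)
model = AutoModelForSeq2SeqLM.from_pretrained(PARENT)

# Hypothetical tab-separated corpus with "es" (Spanish) and "arn" (Mapudungun) columns.
raw = load_dataset("csv", data_files={"train": "es_arn_train.tsv"}, delimiter="\t")

def preprocess(batch):
    # Segment both sides with the parent's tokenizer, as in vocabulary-sharing
    # transfer learning setups; Mapudungun reuses the parent's subword model.
    model_inputs = tokenizer(batch["es"], truncation=True, max_length=128)
    labels = tokenizer(text_target=batch["arn"], truncation=True, max_length=128)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = raw.map(preprocess, batched=True, remove_columns=raw["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="es-arn-child",
    per_device_train_batch_size=16,
    learning_rate=2e-5,          # illustrative hyperparameters
    num_train_epochs=10,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()  # continue training the parent on the low-resource child pair
```

Keeping the parent's tokenizer and embeddings is one common design choice in this kind of transfer; alternatives (e.g., training a joint subword vocabulary over both pairs) are equally plausible and are not specified by the abstract.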
Paper Type: Long
Research Area: Multilingualism and Cross-Lingual NLP
Research Area Keywords: cross-lingual transfer, less-resourced languages, endangered languages, indigenous languages, resources for less-resourced languages
Contribution Types: NLP engineering experiment, Reproduction study, Approaches to low-resource settings, Publicly available software and/or pre-trained models
Languages Studied: Spanish, Mapudungun
Submission Number: 4966