Leveraging Cross-Lingual Knowledge from Pre-Trained Models for Low-Resource Neural Machine Translation

ACL ARR 2024 June Submission5546 Authors

16 Jun 2024 (modified: 03 Jul 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: The quality of neural machine translation (NMT) depends heavily on large parallel corpora, which makes translation for low-resource languages challenging. This paper introduces a novel approach that leverages cross-lingual alignment knowledge from multilingual pre-trained language models (PLMs) to enhance low-resource NMT. Our method divides the translation model into source-encoding, target-encoding, and alignment modules, each initialized with a different pre-trained BERT model. Experiments on four translation directions across two low-resource language pairs show significant BLEU improvements, validating the efficacy of our approach.
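The sketch below illustrates the kind of modular initialization the abstract describes: a translation model whose source encoder, target encoder, and alignment component are each warm-started from a different pre-trained BERT checkpoint. It is a minimal illustration, not the authors' exact architecture; the checkpoint names, the cross-attention alignment design, and all module/parameter names are assumptions for the example.

```python
# Minimal sketch (assumed design, not the paper's implementation): source encoder,
# target encoder, and alignment module initialized from different BERT checkpoints.
import torch
import torch.nn as nn
from transformers import BertModel


class PlmInitializedNMT(nn.Module):
    def __init__(self,
                 src_ckpt="bert-base-cased",                 # assumed source-language BERT
                 tgt_ckpt="bert-base-german-cased",          # assumed target-language BERT
                 align_ckpt="bert-base-multilingual-cased",  # assumed multilingual BERT for alignment
                 vocab_size=30_000):
        super().__init__()
        # Source and target encoding modules, each from its own pre-trained model.
        self.src_encoder = BertModel.from_pretrained(src_ckpt)
        self.tgt_encoder = BertModel.from_pretrained(tgt_ckpt)

        hidden = self.src_encoder.config.hidden_size
        # Alignment module: cross-attention from target states to source states.
        # In this sketch its weights could be warm-started from a multilingual BERT
        # layer; the actual transfer procedure is omitted for brevity.
        _align_bert = BertModel.from_pretrained(align_ckpt)
        self.cross_attn = nn.MultiheadAttention(hidden, num_heads=8, batch_first=True)
        del _align_bert

        self.output_proj = nn.Linear(hidden, vocab_size)

    def forward(self, src_ids, src_mask, tgt_ids, tgt_mask):
        # Encode source and target sentences with their respective pre-trained encoders.
        src_states = self.src_encoder(input_ids=src_ids,
                                      attention_mask=src_mask).last_hidden_state
        tgt_states = self.tgt_encoder(input_ids=tgt_ids,
                                      attention_mask=tgt_mask).last_hidden_state
        # Align target positions to source positions via cross-attention.
        aligned, _ = self.cross_attn(query=tgt_states, key=src_states, value=src_states,
                                     key_padding_mask=~src_mask.bool())
        # Per-position logits over the target vocabulary.
        return self.output_proj(aligned)
```

Under this (assumed) decomposition, the source and target encoders can reuse monolingual pre-training for each language, while the alignment module carries the cross-lingual knowledge, which is the property the abstract attributes to the multilingual PLM initialization.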
Paper Type: Short
Research Area: Machine Translation
Research Area Keywords: Low-Resource, Pre-training for MT
Contribution Types: Approaches to low-resource settings
Languages Studied: English, Norwegian, German
Submission Number: 5546