Leveraging Cross-Lingual Knowledge from Pre-Trained Models for Low-Resource Neural Machine Translation
Abstract: Neural machine translation (NMT) quality depends heavily on large parallel corpora, which makes translation for low-resource languages challenging. This paper introduces a novel approach that leverages cross-lingual alignment knowledge from multilingual pre-trained language models (PLMs) to enhance low-resource NMT. Our method segments the translation model into source encoding, target encoding, and alignment modules, each initialized with a different pre-trained BERT model. Experiments on four translation directions across two low-resource language pairs demonstrate significant BLEU score improvements, validating the efficacy of our approach.
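The abstract describes a modular architecture with separately initialized source-encoding, target-encoding, and alignment components. The sketch below is a minimal illustration of that idea, assuming HuggingFace Transformers and PyTorch; the checkpoint names, the fusion via concatenation, and the output projection are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch (not the authors' exact implementation): a translation model
# split into source-encoding, target-encoding, and alignment modules, each
# initialized from a separate multilingual pre-trained BERT checkpoint.
# Checkpoint names and the fusion/wiring choices here are illustrative.
import torch
import torch.nn as nn
from transformers import BertModel


class ModularNMT(nn.Module):
    def __init__(self,
                 src_ckpt="bert-base-multilingual-cased",
                 tgt_ckpt="bert-base-multilingual-cased",
                 align_ckpt="bert-base-multilingual-cased",
                 vocab_size=119547):
        super().__init__()
        # Source and target encoders, each initialized from its own BERT.
        self.src_encoder = BertModel.from_pretrained(src_ckpt)
        self.tgt_encoder = BertModel.from_pretrained(tgt_ckpt)
        # Alignment module: a third BERT that relates the two representations
        # (concatenation-based fusion is an assumption for this sketch).
        self.align_encoder = BertModel.from_pretrained(align_ckpt)
        hidden = self.align_encoder.config.hidden_size
        self.output_proj = nn.Linear(hidden, vocab_size)

    def forward(self, src_ids, src_mask, tgt_ids, tgt_mask):
        # Encode source and target sentences independently.
        src_h = self.src_encoder(input_ids=src_ids,
                                 attention_mask=src_mask).last_hidden_state
        tgt_h = self.tgt_encoder(input_ids=tgt_ids,
                                 attention_mask=tgt_mask).last_hidden_state
        # Fuse both streams and let the alignment BERT attend across them.
        fused = torch.cat([src_h, tgt_h], dim=1)
        fused_mask = torch.cat([src_mask, tgt_mask], dim=1)
        align_h = self.align_encoder(inputs_embeds=fused,
                                     attention_mask=fused_mask).last_hidden_state
        # Predict target tokens from the target-side positions only.
        tgt_len = tgt_ids.size(1)
        return self.output_proj(align_h[:, -tgt_len:, :])
```

In this reading, the pre-trained weights supply cross-lingual alignment knowledge, while only the thin output projection is trained from scratch; how much of each BERT is frozen versus fine-tuned is left open here.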
Paper Type: Short
Research Area: Machine Translation
Research Area Keywords: Low-Resource, Pre-training for MT
Contribution Types: Approaches to low-resource settings
Languages Studied: English, Norwegian, German
Submission Number: 5546