Abstract: This paper presents our submissions to WMT 22 shared task in the Unsupervised and Very Low Resource Supervised Machine Translation tasks. The task revolves around translating between German ↔ Upper Sorbian (de ↔ hsb), German ↔ Lower Sorbian (de ↔ dsb) and Upper Sorbian ↔ Lower Sorbian (hsb ↔ dsb) in both unsupervised and supervised manner. For the unsupervised system, we trained an unsupervised phrase-based statistical machine translation (UPBSMT) system on each pair independently. We pretrained a De-Salvic mBART model on the following languages Polish (pl), Czech (cs), German (de), Upper Sorbian (hsb), Lower Sorbian (dsb). We then fine-tuned our mBART on the synthetic parallel data generated by the (UPBSMT) model along with authentic parallel data (de ↔ pl, de ↔ cs). We further fine-tuned our unsupervised system on authentic parallel data (hsb ↔ dsb, de ↔ dsb, de ↔ hsb) to submit our supervised low-resource system.
Loading