GTCOM Neural Machine Translation Systems for WMT20

Published: 01 Jan 2020 · Last Modified: 21 May 2025 · WMT@EMNLP 2020 · License: CC BY-SA 4.0
Abstract: This paper describes Global Tone Communication Co., Ltd.'s submissions to the WMT20 shared news translation task. We participate in four directions: English→Khmer, English→Pashto, Khmer→English, and Pashto→English. Our systems achieve the best BLEU scores among all participants in the English→Pashto, Pashto→English, and Khmer→English directions (13.1, 23.1, and 25.5, respectively). Our submitted systems are unconstrained and focus on mBART (Multilingual Bidirectional and Auto-Regressive Transformers), back-translation, and forward-translation. We also apply rule-based filters, a language model, and a RoBERTa model to filter monolingual, parallel, and synthetic sentences. In addition, we compare vocabularies built from monolingual data with those built from parallel data.
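The rule-based filtering step is only named in the abstract, so as a rough illustration the sketch below shows one common form of such filtering for parallel data: length, length-ratio, and duplicate checks. The thresholds `max_len` and `max_ratio` are hypothetical placeholders, not values from the paper, and the language-model and RoBERTa scoring stages are omitted.

```python
# A minimal sketch of rule-based parallel-corpus filtering.
# The exact rules used by the authors are not given in the abstract;
# the thresholds below are illustrative assumptions only.

from typing import Iterable, Iterator, Tuple


def rule_filter(
    pairs: Iterable[Tuple[str, str]],
    max_len: int = 200,       # hypothetical maximum token count per side
    max_ratio: float = 3.0,   # hypothetical source/target length-ratio cap
) -> Iterator[Tuple[str, str]]:
    """Yield sentence pairs that pass basic length, ratio, and duplicate checks."""
    seen = set()
    for src, tgt in pairs:
        src_toks, tgt_toks = src.split(), tgt.split()
        if not src_toks or not tgt_toks:
            continue  # drop pairs with an empty side
        if len(src_toks) > max_len or len(tgt_toks) > max_len:
            continue  # drop overly long sentences
        ratio = len(src_toks) / len(tgt_toks)
        if ratio > max_ratio or ratio < 1.0 / max_ratio:
            continue  # drop badly length-mismatched pairs
        key = (src, tgt)
        if key in seen:
            continue  # drop exact duplicates
        seen.add(key)
        yield src, tgt


if __name__ == "__main__":
    sample = [
        ("hello world", "bonjour le monde"),
        ("hello world", "bonjour le monde"),  # duplicate, removed
        ("a", "x " * 50),                     # ratio too large, removed
    ]
    print(list(rule_filter(sample)))
```

A language-model or RoBERTa scoring pass, as mentioned in the abstract, would typically be applied after these cheap rules to rank or further prune the surviving pairs.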