An Empirical Study of Document-to-document Neural Machine Translation

Anonymous

17 Sept 2021 (modified: 05 May 2023)
ACL ARR 2021 September Blind Submission
Readers: Everyone

Abstract: This paper does not aim to introduce a novel method for document-level NMT. Instead, we return to the original Transformer model with document-level training and ask the following question: Is the capacity of current models strong enough for document-level NMT? Interestingly, we observe that the original Transformer with appropriate training techniques can achieve strong results for document translation, even for documents as long as 2000 words. We evaluate this model and several recent approaches on nine document-level datasets and two sentence-level datasets across six languages. Experiments show that the original Transformer model outperforms sentence-level models and many previous methods across a comprehensive set of metrics, including BLEU, four lexical indices, three newly proposed assistant linguistic indicators, and human evaluation.
