Deep Character-Level Neural Machine Translation By Learning Morphology

19 Apr 2024 (modified: 21 Jul 2022)
Submitted to ICLR 2017
Readers: Everyone
Abstract: Neural machine translation aims at building a single large neural network that can be trained to maximize translation performance. The encoder-decoder architecture with an attention mechanism achieves translation performance comparable to existing state-of-the-art phrase-based systems. However, the use of a large vocabulary becomes the bottleneck in both training and improving performance. In this paper, we propose a novel architecture that learns morphology using two recurrent networks, together with a hierarchical decoder that translates at the character level. This gives rise to a deep character-level model consisting of six recurrent networks. Such a deep model has two major advantages: it avoids the large-vocabulary issue at its root, and it is more efficient to train than word-based models. Our model obtains a higher BLEU score than the BPE-based model after training for one epoch on En-Fr and En-Cs translation tasks. Further analyses show that our model is able to learn morphology.
TL;DR: We devise a character-level neural machine translation model built on six recurrent networks, and obtain a BLEU score comparable to state-of-the-art NMT on En-Fr and Cs-En translation tasks.
Conflicts: sjtu.edu.cn, pku.edu.cn, zju.edu.cn, ust.hk
Keywords: Natural language processing, Deep learning