Deep Character-Level Neural Machine Translation By Learning Morphology

Shenjian Zhao, Zhihua Zhang

Nov 04, 2016 (modified: Dec 22, 2016) ICLR 2017 conference submission
  • Abstract: Neural machine translation aims at building a single large neural network that can be trained to maximize translation performance. The encoder-decoder architecture with an attention mechanism achieves translation performance comparable to existing state-of-the-art phrase-based systems. However, the use of a large vocabulary becomes a bottleneck in both training and improving performance. In this paper, we propose a novel architecture which learns morphology by using two recurrent networks and a hierarchical decoder which translates at the character level. This gives rise to a deep character-level model consisting of six recurrent networks. Such a deep model has two major advantages: it radically avoids the large-vocabulary issue, and at the same time it is more efficient to train than word-based models. Our model obtains a higher BLEU score than the BPE-based model after training for one epoch on En-Fr and En-Cs translation tasks. Further analyses show that our model is able to learn morphology.
  • TL;DR: We devise a character-level neural machine translation model built on six recurrent networks, and obtain a BLEU score comparable to state-of-the-art NMT on En-Fr and Cs-En translation tasks.
  • Conflicts: sjtu.edu.cn, pku.edu.cn, zju.edu.cn, ust.hk
  • Keywords: Natural language processing, Deep learning
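The hierarchical idea in the abstract, one recurrent network composing characters into word representations (where morphology could be learned), and another running over those word representations, can be illustrated with a minimal pure-Python sketch. All names, dimensions, and random weights below are illustrative assumptions, not the paper's actual six-network model.

```python
import math
import random

def init_matrix(rows, cols, rng):
    # Small random weights; a stand-in for trained parameters.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def rnn_step(x, h, Wx, Wh, b):
    # Plain Elman RNN cell: h' = tanh(Wx x + Wh h + b)
    return [math.tanh(sum(Wx[i][j] * x[j] for j in range(len(x)))
                      + sum(Wh[i][j] * h[j] for j in range(len(h)))
                      + b[i])
            for i in range(len(h))]

class SimpleRNN:
    """Runs an Elman RNN over a sequence of vectors; returns the last hidden state."""
    def __init__(self, input_size, hidden_size, rng):
        self.Wx = init_matrix(hidden_size, input_size, rng)
        self.Wh = init_matrix(hidden_size, hidden_size, rng)
        self.b = [0.0] * hidden_size
        self.hidden_size = hidden_size

    def encode(self, xs):
        h = [0.0] * self.hidden_size
        for x in xs:
            h = rnn_step(x, h, self.Wx, self.Wh, self.b)
        return h

def one_hot(index, size):
    v = [0.0] * size
    v[index] = 1.0
    return v

rng = random.Random(0)
alphabet = "abcdefghijklmnopqrstuvwxyz"
char_dim, word_dim, sent_dim = len(alphabet), 8, 8

# Lower RNN: characters -> word representation (the morphology-learning level).
char_rnn = SimpleRNN(char_dim, word_dim, rng)
# Upper RNN: word representations -> sentence representation.
word_rnn = SimpleRNN(word_dim, sent_dim, rng)

def encode_sentence(sentence):
    word_vecs = [char_rnn.encode([one_hot(alphabet.index(c), char_dim) for c in w])
                 for w in sentence.split()]
    return word_rnn.encode(word_vecs)

vec = encode_sentence("learning morphology")
print(len(vec))  # sentence representation of size sent_dim = 8
```

The point of the hierarchy is that the character-level RNN shares parameters across all words, so inflected forms of the same stem (e.g. "learn", "learning") pass through the same cell and can end up with related representations, without any fixed word vocabulary.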
