Deep Character-Level Neural Machine Translation By Learning Morphology

Shenjian Zhao, Zhihua Zhang

Nov 04, 2016 (modified: Dec 22, 2016) ICLR 2017 conference submission
  • Abstract: Neural machine translation aims at building a single large neural network that can be trained to maximize translation performance. The encoder-decoder architecture with an attention mechanism achieves translation performance comparable to existing state-of-the-art phrase-based systems. However, the use of a large vocabulary becomes a bottleneck in both training and improving performance. In this paper, we propose a novel architecture which learns morphology by using two recurrent networks and a hierarchical decoder which translates at the character level. This gives rise to a deep character-level model consisting of six recurrent networks. Such a deep model has two major advantages: it radically avoids the large-vocabulary issue, and at the same time it is more efficient to train than word-based models. Our model obtains a higher BLEU score than the BPE-based model after training for one epoch on En-Fr and En-Cs translation tasks. Further analyses show that our model is able to learn morphology.
  • TL;DR: We devise a character-level neural machine translation model built on six recurrent networks, and obtain a BLEU score comparable to state-of-the-art NMT on En-Fr and Cs-En translation tasks.
  • Conflicts: sjtu.edu.cn, pku.edu.cn, zju.edu.cn, ust.hk
  • Keywords: Natural language processing, Deep learning
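The hierarchical idea in the abstract, one recurrent network composing characters into word representations (where morphology could be learned), and another running over those word representations, can be illustrated with a minimal pure-Python sketch. All names, dimensions, and random weights below are illustrative assumptions, not the paper's actual six-network model.

```python
import math
import random

def init_matrix(rows, cols, rng):
    # Small random weights; a stand-in for trained parameters.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def rnn_step(x, h, Wx, Wh, b):
    # Plain Elman RNN cell: h' = tanh(Wx x + Wh h + b)
    return [math.tanh(sum(Wx[i][j] * x[j] for j in range(len(x)))
                      + sum(Wh[i][j] * h[j] for j in range(len(h)))
                      + b[i])
            for i in range(len(h))]

class SimpleRNN:
    """Runs an Elman RNN over a sequence of vectors; returns the last hidden state."""
    def __init__(self, input_size, hidden_size, rng):
        self.Wx = init_matrix(hidden_size, input_size, rng)
        self.Wh = init_matrix(hidden_size, hidden_size, rng)
        self.b = [0.0] * hidden_size
        self.hidden_size = hidden_size

    def encode(self, xs):
        h = [0.0] * self.hidden_size
        for x in xs:
            h = rnn_step(x, h, self.Wx, self.Wh, self.b)
        return h

def one_hot(index, size):
    v = [0.0] * size
    v[index] = 1.0
    return v

rng = random.Random(0)
alphabet = "abcdefghijklmnopqrstuvwxyz"
char_dim, word_dim, sent_dim = len(alphabet), 8, 8

# Lower RNN: characters -> word representation (the morphology-learning level).
char_rnn = SimpleRNN(char_dim, word_dim, rng)
# Upper RNN: word representations -> sentence representation.
word_rnn = SimpleRNN(word_dim, sent_dim, rng)

def encode_sentence(sentence):
    word_vecs = [char_rnn.encode([one_hot(alphabet.index(c), char_dim) for c in w])
                 for w in sentence.split()]
    return word_rnn.encode(word_vecs)

vec = encode_sentence("learning morphology")
print(len(vec))  # sentence representation of size sent_dim = 8
```

The point of the hierarchy is that the character-level RNN shares parameters across all words, so inflected forms of the same stem (e.g. "learn", "learning") pass through the same cell and can end up with related representations, without any fixed word vocabulary.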
