Deep Neural Machine Translation Model Based on Simple Recurrent Units

Wen Zhang, Yang Feng, Qun Liu

11 May 2020 (modified: 16 Dec 2021)
Abstract: Attention-based neural machine translation models, which use an encoder-decoder framework to model translation as a sequence-to-sequence problem, have become extremely popular. In this paper, we replace the gated recurrent units (GRUs) in the classical encoder and decoder with simple recurrent units (SRUs), and deepen the encoder and decoder by stacking network layers to improve the performance of the neural machine translation model. We conducted experiments on German-English and Uyghur-Chinese translation tasks. Experimental results show that performance is significantly improved without extra training time, especially when residual connections are used.
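For reference, below is a minimal sketch of the SRU recurrence from Lei et al. (2018), which the paper builds on, together with layer stacking and residual connections as described in the abstract. This is an illustrative sketch, not the authors' implementation; all variable names, shapes, and initializations are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sru_layer(x, W, Wf, bf, Wr, br):
    """One SRU layer over a sequence (Lei et al., 2018 formulation).

    x: (T, d) input sequence; W, Wf, Wr are (d, d), bf, br are (d,).
    The heavy matrix multiplications are computed for all time steps
    at once; only cheap element-wise operations remain in the
    sequential loop, which is why SRUs train faster than GRUs.
    """
    T, d = x.shape
    xt = x @ W                    # candidate values, all steps at once
    f = sigmoid(x @ Wf + bf)      # forget gates
    r = sigmoid(x @ Wr + br)      # reset (highway) gates
    c = np.zeros(d)
    h = np.empty_like(x)
    for t in range(T):            # element-wise recurrence only
        c = f[t] * c + (1.0 - f[t]) * xt[t]
        # highway connection to the raw input x_t
        h[t] = r[t] * np.tanh(c) + (1.0 - r[t]) * x[t]
    return h

# Stacking SRU layers with residual connections, as in the paper's
# deep encoder/decoder (dimensions here are illustrative):
rng = np.random.default_rng(0)
d, T, depth = 8, 5, 4
x = rng.standard_normal((T, d))
for _ in range(depth):
    W, Wf, Wr = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
    bf = br = np.zeros(d)
    x = x + sru_layer(x, W, Wf, bf, Wr, br)   # residual connection
```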