Hybrid-Regressive Paradigm for Accurate and Speed-Robust Neural Machine Translation
Abstract: This work empirically confirms that non-autoregressive translation (NAT) is less robust in decoding batch size and hardware settings than autoregressive translation (AT). To address this issue, we demonstrate that prompting a small number of AT predictions can significantly reduce the performance gap between AT and NAT through synthetic experiments. Following this line, we propose hybrid-regressive translation (HRT), a two-stage translation prototype that combines the strengths of AT and NAT. Specifically, HRT first generates discontinuous sequences via autoregression (eg, make a prediction for every k tokens, k> 1) and then fills in all previously skipped tokens at once in a non-autoregressive manner. Experiments on five translation tasks show that HRT achieves comparable translation quality with AT while having at least 1.5 x faster inference regardless of batch size and device. Additionally, HRT successfully inherits the sound characteristics of AT in the deep-encoder-shallow-decoder architecture, allowing for further speedup without BLEU loss.
0 Replies
Loading