Enhancing Attention with Explicit Phrasal Alignments

25 Sept 2019, 19:18 (modified: 24 Dec 2019, 06:17)
ICLR 2020 Conference Blind Submission
Readers: Everyone
Abstract: The attention mechanism is an indispensable component of any state-of-the-art neural machine translation system. However, existing attention methods are often token-based and ignore the importance of phrasal alignments, which are the backbone of phrase-based statistical machine translation. We propose a novel phrase-based attention method to model n-grams of tokens as the basic attention entities, and design multi-headed phrasal attentions within the Transformer architecture to perform token-to-token and token-to-phrase mappings. Our approach yields improvements in English-German, English-Russian and English-French translation tasks on the standard WMT'14 test set. Furthermore, our phrasal attention method shows improvements on the one-billion-word language modeling benchmark.
Keywords: NMT, Phrasal Attention, Machine Translation, Language Modeling
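
Illustrative sketch: the abstract describes attention over n-grams of tokens rather than single tokens only. The snippet below is a minimal, hedged illustration of that general idea, not the submission's method. It assumes that an n-gram vector is the mean of its token vectors and that a single query token attends jointly over all unigrams and bigrams with scaled dot-product scores. The function names (`ngram_vectors`, `phrasal_attention`) and the mean-pooling composition are hypothetical choices made for this example.

```python
import math


def ngram_vectors(vectors, n):
    """Mean-pool every window of n consecutive vectors.

    Mean pooling is an assumption made for this sketch; the abstract does
    not say how n-gram representations are composed.
    """
    dim = len(vectors[0])
    return [
        [sum(vec[d] for vec in vectors[i:i + n]) / n for d in range(dim)]
        for i in range(len(vectors) - n + 1)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def phrasal_attention(query, source, max_n=2):
    """Attend from one query token over all source n-grams with n <= max_n.

    Returns the attention weights (unigrams first, then bigrams, ...) and
    the resulting context vector.
    """
    dim = len(query)
    # Collect tokens (n=1) and phrases (n>1) as the attention entities.
    entities = []
    for n in range(1, max_n + 1):
        entities.extend(ngram_vectors(source, n))
    # Scaled dot-product score between the query and each entity.
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
        for key in entities
    ]
    weights = softmax(scores)
    # The entities double as values in this simplified sketch.
    context = [
        sum(w * e[d] for w, e in zip(weights, entities)) for d in range(dim)
    ]
    return weights, context


# Toy source sentence of three 2-dimensional token vectors.
source = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = phrasal_attention([1.0, 0.0], source)
```

With three source tokens and `max_n=2`, the query distributes its attention over five entities: three tokens and two bigrams. This corresponds to combining token-to-token and token-to-phrase mappings in one softmax, which is only one of several plausible designs.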