Modeling Context With Linear Attention for Scalable Document-Level Translation

Anonymous

16 Oct 2021 (modified: 05 May 2023) · ACL ARR 2021 October Blind Submission
Abstract: Document-level neural machine translation allows models to leverage dependencies beyond sentence-internal context to produce more coherent and consistent translations. However, these models, predominantly based on transformers, are difficult to scale to long documents due to the quadratic time and space complexity of their self-attention layers. Recent efforts on efficient attention variants improve scalability, but it remains unclear whether and to what extent their inductive biases are suitable for document translation. In this paper, we explore the efficacy of a recent linear attention model by Peng et al. (2021) on document-level translation and augment it with a sentential gating mechanism. We evaluate the model on the IWSLT 2015 and OpenSubtitles 2018 datasets against a strong transformer baseline and achieve up to 40% decoding speedup with similar or improved BLEU scores. We show that the sentential gate further improves translation quality on IWSLT, a dataset with long sequences.
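A minimal sketch of the kernelized (linear) attention idea the abstract contrasts with quadratic self-attention, assuming a generic positive feature map `phi` (a simple ReLU-based stand-in here, not the random-feature map of Peng et al., 2021, and not the submission's code): re-associating the matrix products as φ(Q)(φ(K)ᵀV) avoids materializing the n×n attention matrix, so cost grows linearly with sequence length.

```python
import numpy as np

def softmax_attention(Q, K, V):
    """Standard attention: the n x n score matrix makes it O(n^2) in time and memory."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    """Kernelized attention: with a positive feature map phi, attention is
    re-associated as phi(Q) @ (phi(K).T @ V), never forming the n x n matrix."""
    Qp, Kp = phi(Q), phi(K)
    context = Kp.T @ V                      # (d_k, d_v), independent of sequence length n
    normalizer = Qp @ Kp.sum(axis=0)        # (n,) per-query normalization
    return (Qp @ context) / normalizer[:, None]

# Toy check on random inputs: output has the same shape as with softmax attention.
n, d = 8, 4
Q, K, V = (np.random.randn(n, d) for _ in range(3))
print(linear_attention(Q, K, V).shape)      # (8, 4)
```

The quality gap between the two formulations depends on how well the chosen feature map approximates the softmax kernel, which is part of what the abstract's question about inductive biases for document translation refers to.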
