Context Consistency between Training and Inference in Simultaneous Machine Translation

Anonymous

16 Dec 2023 · ACL ARR 2023 December Blind Submission · Readers: Everyone
Abstract: Simultaneous Machine Translation (SiMT) aims to produce a real-time partial translation from a monotonically growing source-side context. However, there is a counterintuitive phenomenon in context usage between training and inference: e.g., under wait-$k$ inference, a model consistently trained with wait-$k$ is often much worse, in terms of translation quality, than a model inconsistently trained with wait-$k'$ ($k'\neq k$). We first investigate the underlying reasons behind this phenomenon and identify two factors: 1) the limited correlation between translation quality and the training (cross-entropy) loss; 2) exposure bias between training and inference. Motivated by these findings, we propose an effective training approach called context consistency training, which encourages consistent context usage between training and inference by optimizing translation quality and latency as bi-objectives and by exposing the model to its own predictions during training. Experiments on three language pairs confirm our intuition: with the help of context consistency training, our system encouraging context consistency outperforms existing systems that use inconsistent contexts for the first time.
Paper Type: long
Research Area: Machine Translation
Contribution Types: Model analysis & interpretability
Languages Studied: English, German, Vietnamese
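
Below is a minimal, illustrative sketch of two ingredients named in the abstract: a wait-$k$ read/write schedule and a bi-objective training signal that trades translation quality off against latency (approximated here by average lagging). The function names (wait_k_schedule, average_lagging, bi_objective_loss) and the specific latency weighting are assumptions for illustration only, not the authors' implementation.

# Illustrative sketch only -- not the paper's implementation.
# Assumed components: a wait-k schedule, an average-lagging latency proxy,
# and a simple weighted sum of a quality loss and a latency penalty.

def wait_k_schedule(src_len: int, tgt_len: int, k: int) -> list[int]:
    """g(t): number of source tokens read before writing target token t (0-indexed)."""
    return [min(k + t, src_len) for t in range(tgt_len)]

def average_lagging(g: list[int], src_len: int, tgt_len: int) -> float:
    """Average Lagging (AL) latency for a monotonic read/write schedule g."""
    gamma = tgt_len / src_len
    # AL is averaged up to the first step at which the full source has been read.
    tau = next((t for t, read in enumerate(g) if read >= src_len), tgt_len - 1)
    return sum(g[t] - t / gamma for t in range(tau + 1)) / (tau + 1)

def bi_objective_loss(quality_loss: float, latency: float, lam: float = 0.1) -> float:
    """Combine a quality loss (e.g. cross-entropy) with a weighted latency penalty."""
    return quality_loss + lam * latency

# Example: a wait-3 schedule for a 10-token source and 12-token target.
g = wait_k_schedule(src_len=10, tgt_len=12, k=3)
loss = bi_objective_loss(quality_loss=2.4, latency=average_lagging(g, 10, 12))

The weight lam is a stand-in for however the paper balances the two objectives; the point of the sketch is only that quality and latency enter the training signal jointly under a consistent wait-$k$ context.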