Neural Machine Translation with an Awareness of Semantic Similarity

Published: 01 Jan 2023, Last Modified: 08 Nov 2024, Venue: PRICAI (2) 2023, License: CC BY-SA 4.0
Abstract: Machine translation requires that source and target sentences have identical semantics. Previous neural machine translation (NMT) models have met this requirement only implicitly, through the cross-entropy loss. In this paper, we propose a sentence Semantic-aware Machine Translation model (SaMT), which explicitly addresses semantic similarity between source and target sentences in translation. SaMT integrates a Sentence-Transformer into a Transformer-based encoder-decoder to estimate the semantic similarity between source and target sentences. Our model enables translated sentences to preserve the semantics of the source sentences, either by using the Sentence-Transformer alone or by including an additional linear layer in the decoder. To achieve high-quality translation, we employ vertical and horizontal feature fusion methods, which capture rich features from sentences during translation. Experimental results show a BLEU score of 36.41 on the IWSLT2014 German→English dataset, validating the efficacy of incorporating sentence-level semantic knowledge and of using the two orthogonal fusion methods. Our code is available at https://github.com/aaa559/SaMT-master.
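As a rough illustration of what combining cross-entropy with an explicit semantic similarity signal could look like, the sketch below adds a cosine-similarity penalty between source and target sentence embeddings to a token-level cross-entropy term. This is not the SaMT implementation: the function names, the `1 - cosine` form of the penalty, and the `weight` parameter are assumptions made for illustration, and the embeddings are plain Python lists standing in for Sentence-Transformer outputs.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cross_entropy(token_probs):
    """Mean negative log-likelihood of the reference tokens.

    token_probs holds the model probability assigned to each reference token.
    """
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def semantic_aware_loss(token_probs, src_emb, tgt_emb, weight=0.5):
    """Hypothetical combined objective (an assumption, not the paper's formula).

    Adds weight * (1 - cosine similarity) to the cross-entropy term, so the
    penalty vanishes when the source and target embeddings point the same way.
    """
    ce = cross_entropy(token_probs)
    sim_penalty = 1.0 - cosine_similarity(src_emb, tgt_emb)
    return ce + weight * sim_penalty
```

In this sketch, a translation whose sentence embedding drifts away from the source embedding is penalized even when its token-level likelihood is high, which is the kind of explicit semantic constraint the abstract describes.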