Fast and Accurate Transformer-based Translation with Character-Level Encoding and Subword-Level Decoding

Anonymous

16 Nov 2021 (modified: 05 May 2023) · ACL ARR 2021 November Blind Submission
Abstract: The Transformer translation model is fast to train and achieves state-of-the-art results on various translation tasks. However, unknown input words at test time remain a challenge for the Transformer, especially when they are segmented into inappropriate subword sequences that break morpheme boundaries. This paper improves the Transformer model by learning more accurate source representations through character-level encoding. We simply adopt character sequences instead of subword sequences as the input to the standard Transformer encoder and propose a contextualized character embedding (CCEmb) to support character-level encoding. Our CCEmb encodes information about the current character and its context by adding the embeddings of its contextual character $n$-grams. CCEmb incurs little extra computational cost, and we show that our model, with a character-level encoder and a standard subword-level Transformer decoder, can outperform the original purely subword-level Transformer, especially when translating source sentences that contain unknown (or rare) words.
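The CCEmb idea can be illustrated concretely. Below is a minimal PyTorch sketch, assuming the contextual character $n$-grams around each position have already been extracted and mapped to ids in a separate embedding table; the module name `ContextualizedCharEmbedding`, the `ngram_ids` input layout, and the size of the n-gram vocabulary are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class ContextualizedCharEmbedding(nn.Module):
    """Sketch of a CCEmb-style layer: each character's embedding is augmented
    with the summed embeddings of character n-grams from its local context.
    (Hypothetical implementation; interface names are assumptions.)"""

    def __init__(self, char_vocab_size: int, ngram_vocab_size: int, d_model: int):
        super().__init__()
        # Embedding for the current character (id 0 reserved for padding).
        self.char_emb = nn.Embedding(char_vocab_size, d_model, padding_idx=0)
        # Assumed: contextual character n-grams are mapped offline to ids
        # in a separate vocabulary and embedded with their own table.
        self.ngram_emb = nn.Embedding(ngram_vocab_size, d_model, padding_idx=0)

    def forward(self, char_ids: torch.Tensor, ngram_ids: torch.Tensor) -> torch.Tensor:
        # char_ids:  (batch, seq_len)              current-character ids
        # ngram_ids: (batch, seq_len, num_ngrams)  ids of the contextual n-grams
        #            around each character position (0 = padding)
        base = self.char_emb(char_ids)                  # (B, L, D)
        context = self.ngram_emb(ngram_ids).sum(dim=2)  # (B, L, D)
        # CCEmb = current-character embedding + sum of contextual n-gram embeddings
        return base + context
```

In a full model, these embeddings (plus positional encodings) would feed a standard Transformer encoder over characters, while the decoder continues to operate on subwords as described in the abstract.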