What Is in a Translation Unit? Comparing Character and Subword Representations Beyond Translation

Nadir Durrani; Fahim Dalvi; Hassan Sajjad; Yonatan Belinkov; Preslav Nakov

27 Sept 2018 (modified: 05 May 2023) | ICLR 2019 Conference Withdrawn Submission | Readers: Everyone

Abstract: Recent work has shown that contextualized word representations derived from neural machine translation (NMT) are a viable alternative to those derived from simple word prediction tasks. This is because translating from one language to another requires building a much more comprehensive internal understanding. Unfortunately, computational and memory limitations currently prevent NMT models from using large word vocabularies, and thus alternatives such as subword units (BPE and morphological segmentations) and characters have been used. Here we study the impact of using different kinds of units on the quality of the resulting representations when used to model syntax, semantics, and morphology. We found that while representations derived from subwords are slightly better for modeling syntax, character-based representations are superior for modeling morphology and are also more robust to noisy input.

Keywords: subwords, representations, word embeddings, transfer learning, machine translation, natural language processing

TL;DR: We study the impact of using different kinds of subword units on the quality of the resulting representations when used to model syntax, semantics, and morphology.
