Abstract: This paper introduces an architecture for an open-vocabulary neural language model. Word representations are computed on-the-fly by a convolutional network followed by a pooling layer. This allows the model to consider any word, both in the context and in the prediction. The training objective is derived from Noise-Contrastive Estimation to circumvent the lack of a fixed vocabulary. We test the ability of our model to build representations of unknown words on the IWSLT-2016 English-to-Czech machine translation task, in a reranking setting. Experimental results are promising, with a gain of up to 0.7 BLEU point. They also highlight the difficulty and instability of training such models with character-based representations for the predicted words.
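As a rough illustration of the two ideas in the abstract, the sketch below shows an on-the-fly word encoder (character embeddings, a convolution, then max pooling) and an NCE-style binary objective that scores words without a softmax over a closed vocabulary. All module names, hyperparameters, and the exact NCE variant are illustrative assumptions, not the paper's reported configuration.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CharCNNWordEncoder(nn.Module):
        """Builds a word representation from its characters:
        char embeddings -> 1D convolution -> max-over-time pooling.
        Sizes below are assumed for illustration."""

        def __init__(self, n_chars=256, char_dim=16, word_dim=128, kernel=3):
            super().__init__()
            self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
            self.conv = nn.Conv1d(char_dim, word_dim,
                                  kernel_size=kernel, padding=kernel // 2)

        def forward(self, char_ids):
            # char_ids: (batch, max_word_len) integer character ids, 0 = pad
            x = self.char_emb(char_ids)       # (batch, len, char_dim)
            x = self.conv(x.transpose(1, 2))  # (batch, word_dim, len)
            return x.max(dim=2).values        # pool over character positions

    def nce_loss(true_score, noise_scores, log_k_pn_true, log_k_pn_noise):
        """Standard NCE binary objective over unnormalized scores: the model
        discriminates the observed word from k noise samples, so no
        normalization over a full vocabulary is needed. This is a generic
        formulation; the paper's exact derivation may differ.
        true_score: (batch,)   noise_scores: (batch, k)
        log_k_pn_*: log(k * p_noise(w)) for the corresponding words."""
        pos = F.logsigmoid(true_score - log_k_pn_true)
        neg = F.logsigmoid(-(noise_scores - log_k_pn_noise)).sum(dim=-1)
        return -(pos + neg).mean()

    # Usage (shapes only): encode 4 words of up to 12 characters each.
    words = torch.randint(1, 256, (4, 12))
    vecs = CharCNNWordEncoder()(words)  # -> (4, 128)

Because the word scores stay unnormalized, any character string can be encoded and scored, which is what makes the model open-vocabulary at both the input and the prediction side.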
Conflicts: limsi.fr
Keywords: Natural language processing, Deep learning