Opening the vocabulary of neural language models with character-level word representations

Matthieu Labeau, Alexandre Allauzen

Nov 04, 2016 (modified: Dec 12, 2016) ICLR 2017 conference submission readers: everyone
  • Abstract: This paper introduces an architecture for an open-vocabulary neural language model. Word representations are computed on the fly by a convolutional network followed by a pooling layer. This allows the model to handle any word, whether it appears in the context or as the word to be predicted. The training objective is derived from Noise-Contrastive Estimation to circumvent the lack of a fixed vocabulary. We test the ability of our model to build representations of unknown words on the English-to-Czech MT task of IWSLT-2016, in a reranking setting. Experiments show promising results, with a gain of up to 0.7 BLEU points. They also highlight the difficulty and instability of training such models when character-based representations are used for the predicted words.
  • Keywords: Natural language processing, Deep learning
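
The abstract describes word representations built from characters by a convolution followed by pooling. The following is a minimal sketch of that idea, not the authors' implementation: the function name `char_cnn_word_vector`, the boundary markers, the kernel width, the dimensions, and the randomly initialized weights are all assumptions made for illustration, and max-pooling is assumed as the pooling operation.

```python
import random

def char_cnn_word_vector(word, char_emb, filters, width=3):
    """Embed `word` by convolving over its characters, then max-pooling over positions."""
    # Pad with boundary markers so even one-character words yield several windows.
    chars = ["<w>"] * (width - 1) + list(word) + ["</w>"] * (width - 1)
    # Characters missing from the table share the "<unk>" embedding.
    vecs = [char_emb.get(c, char_emb["<unk>"]) for c in chars]
    # Each window is the concatenation of `width` consecutive character vectors.
    windows = [sum(vecs[i:i + width], []) for i in range(len(vecs) - width + 1)]
    # One output dimension per filter: the maximum response over all windows.
    return [max(sum(w * x for w, x in zip(f, win)) for win in windows)
            for f in filters]

# Hypothetical sizes and random weights; a trained model would learn these.
rng = random.Random(0)
CHAR_DIM, N_FILTERS, WIDTH = 4, 5, 3
alphabet = list("abcdefghijklmnopqrstuvwxyz") + ["<w>", "</w>", "<unk>"]
char_emb = {c: [rng.uniform(-1, 1) for _ in range(CHAR_DIM)] for c in alphabet}
filters = [[rng.uniform(-1, 1) for _ in range(WIDTH * CHAR_DIM)]
           for _ in range(N_FILTERS)]

vec = char_cnn_word_vector("reranking", char_emb, filters, WIDTH)
unseen = char_cnn_word_vector("český", char_emb, filters, WIDTH)
```

Because the vector depends only on the character sequence, any string receives a representation of the same size, including words never seen in training, which is what makes an open vocabulary possible in this kind of model.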