Opening the vocabulary of neural language models with character-level word representations
Matthieu Labeau, Alexandre Allauzen
Nov 04, 2016 (modified: Dec 12, 2016) | ICLR 2017 conference submission | readers: everyone
Abstract: This paper introduces an architecture for an open-vocabulary neural language model. Word representations are computed on the fly by a convolutional network followed by a pooling layer. This allows the model to consider any word, whether in the context or as the prediction. The training objective is derived from Noise-Contrastive Estimation to circumvent the lack of a fixed vocabulary. We test the ability of our model to build representations of unknown words on the English-to-Czech MT task of IWSLT-2016, in a reranking setting. Experimental results are promising, with a gain of up to 0.7 BLEU points. They also highlight the difficulty and instability of training such models with character-based representations for the predicted words.
Keywords: Natural language processing, Deep learning
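
As an illustration of the idea stated in the abstract (a word vector computed from characters by a convolution followed by pooling), here is a minimal NumPy sketch. It is not the authors' implementation: the function name `char_cnn_word_vector`, the tanh nonlinearity, the choice of max pooling, the use of id 0 for both padding and unknown characters, and all dimensions are assumptions made for the example.

```python
import numpy as np

def char_cnn_word_vector(word, char_emb, filters, bias, char_to_id):
    """Build a word vector from its characters (illustrative sketch only).

    char_emb: (n_chars_in_alphabet, d_char) character embedding table
    filters:  (n_filters, width, d_char) convolution filters
    bias:     (n_filters,) filter biases
    """
    n_filters, width, _ = filters.shape
    # Hypothetical convention: id 0 serves as both padding and unknown character.
    ids = [char_to_id.get(c, 0) for c in word]
    # Pad short words so that at least one full window exists.
    ids = ids + [0] * max(0, width - len(ids))
    x = char_emb[ids]                                   # (n_chars, d_char)
    # Flatten every window of `width` consecutive character embeddings.
    windows = np.stack([x[i:i + width].reshape(-1)
                        for i in range(len(ids) - width + 1)])
    # Convolution as a matrix product, followed by tanh (an assumed nonlinearity).
    conv = np.tanh(windows @ filters.reshape(n_filters, -1).T + bias)
    # Max pooling over character positions gives a fixed-size vector.
    return conv.max(axis=0)

# Toy setup with random parameters; no training is performed here.
rng = np.random.default_rng(0)
alphabet = "abcdefghijklmnopqrstuvwxyz"
char_to_id = {c: i + 1 for i, c in enumerate(alphabet)}
char_emb = rng.normal(size=(len(alphabet) + 1, 5))
filters = rng.normal(size=(8, 3, 5))
bias = np.zeros(8)

vec = char_cnn_word_vector("unseenword", char_emb, filters, bias, char_to_id)
print(vec.shape)  # (8,)
```

Because the vector depends only on the character sequence, any string receives a representation of the same size, including words never seen in training. That property is what lets such a model handle unknown words in the context or as prediction targets.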