Opening the vocabulary of neural language models with character-level word representations

29 Nov 2024 (modified: 21 Jul 2022) Submitted to ICLR 2017
Abstract: This paper introduces an architecture for an open-vocabulary neural language model. Word representations are computed on the fly by a convolutional network followed by a pooling layer. This allows the model to consider any word, both in the context and as a prediction candidate. The training objective is derived from Noise-Contrastive Estimation to circumvent the lack of a fixed output vocabulary. We test the ability of our model to build representations of unknown words on the English-to-Czech MT task of IWSLT-2016, in a reranking setting. Experimental results are promising, with gains of up to 0.7 BLEU points. They also emphasize the difficulty and instability of training such models with character-based representations for the predicted words.
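To make the described encoder concrete, here is a minimal sketch (not the authors' code) of a character-level word encoder in PyTorch: characters are embedded, passed through 1-D convolutions of several widths, and max-pooled over time, so a vector can be produced for any word, known or unknown. All names and hyperparameters below (CharCNNEncoder, char_dim=16, filter widths, filter counts) are illustrative assumptions, not values taken from the paper.

```python
import torch
import torch.nn as nn

class CharCNNEncoder(nn.Module):
    """Hypothetical char-CNN word encoder: embed chars, convolve, max-pool."""

    def __init__(self, num_chars, char_dim=16, filter_widths=(2, 3, 4), num_filters=64):
        super().__init__()
        # Character embedding table; index 0 is reserved for padding.
        self.char_emb = nn.Embedding(num_chars, char_dim, padding_idx=0)
        # One 1-D convolution per filter width, applied along the
        # character axis of the word.
        self.convs = nn.ModuleList(
            [nn.Conv1d(char_dim, num_filters, kernel_size=w) for w in filter_widths]
        )

    def forward(self, char_ids):
        # char_ids: (batch, max_word_len) integer character indices.
        x = self.char_emb(char_ids)           # (batch, max_word_len, char_dim)
        x = x.transpose(1, 2)                 # (batch, char_dim, max_word_len)
        # Max-over-time pooling collapses the character axis, giving one
        # feature vector per filter width regardless of word length.
        pooled = [conv(x).max(dim=2).values for conv in self.convs]
        return torch.tanh(torch.cat(pooled, dim=1))  # (batch, widths * num_filters)

# Usage: encode a batch of two words, each padded to 6 characters.
encoder = CharCNNEncoder(num_chars=100)
words = torch.randint(1, 100, (2, 6))
reps = encoder(words)                         # shape: (2, 192)
```

Because the representation depends only on a word's characters, the same encoder can, under this reading of the abstract, score out-of-vocabulary words at prediction time; the NCE-derived objective is what makes training feasible without a fixed softmax vocabulary.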
Conflicts: limsi.fr
Keywords: Natural language processing, Deep learning