Simple, Fast Noise-Contrastive Estimation for Large RNN Vocabularies

HLT-NAACL 2016
Abstract: We present a simple algorithm to efficiently train language models with noise-contrastive estimation (NCE) on graphics processing units (GPUs). Our NCE-trained language models achieve significantly lower perplexity on the One Billion Word Benchmark language modeling challenge, with one sixth of the parameters of the best single model in Chelba et al. (2013). When incorporated into a strong Arabic-English machine translation system, they yield a sizable improvement in translation quality. We release a toolkit so that others may also train large-scale, large-vocabulary LSTM language models with NCE, parallelizing computation across multiple GPUs.
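The training objective the abstract refers to is the standard binary NCE classification loss: for each data word the model scores the true word against k words sampled from a noise distribution q (typically the unigram distribution), and the normalizing partition function is fixed to 1 so that no softmax over the full vocabulary is needed. The following is a minimal sketch of that loss, not the released toolkit; the PyTorch framing, function name, and argument shapes are illustrative assumptions.

```python
import math

import torch
import torch.nn.functional as F


def nce_loss(true_scores, noise_scores, true_logq, noise_logq, k):
    """Binary NCE loss with the partition function fixed to 1 (a sketch).

    true_scores:  (batch,)   unnormalized model scores s(w, history) for data words
    noise_scores: (batch, k) scores for the k noise words sampled per position
    true_logq:    (batch,)   log q(w) of each data word under the noise distribution
    noise_logq:   (batch, k) log q(w') of each sampled noise word
    """
    log_k = math.log(k)
    # A word is classified as "data" with probability sigmoid(s - log(k * q(w))),
    # so we maximize log sigmoid(.) for the true words...
    loss_data = F.logsigmoid(true_scores - (log_k + true_logq))
    # ...and log sigmoid(-(.)) for each of the k noise samples.
    loss_noise = F.logsigmoid(-(noise_scores - (log_k + noise_logq))).sum(dim=1)
    return -(loss_data + loss_noise).mean()
```

Because only k + 1 output rows are scored per position (rather than the full vocabulary), the cost of the output layer is independent of vocabulary size; sharing the same noise samples across a minibatch further lets the scoring be done as a dense matrix product, which is what makes the computation GPU-friendly.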