Recurrent Neural Network based Language Modeling with Controllable External Memory

Wei-Jen Ko, Bo-Hsiang Tseng, Hung-yi Lee

2017 (modified: 11 Nov 2021), ICASSP 2017

Abstract: It is crucial for language models to capture long-term dependencies in word sequences, which recurrent neural network (RNN) based language models with long short-term memory (LSTM) units achieve to a considerable extent. To accurately model the sophisticated long-term information in human languages, language models need large memory. However, the size of RNN-based language models cannot be increased arbitrarily: because of the limitations of their structure, the required computational resources and the model complexity increase accordingly. To overcome this problem, inspired by the Neural Turing Machine and Memory Networks, we equip RNN-based language models with controllable external memory. With a learnable memory controller, the size of the external memory is independent of the number of model parameters, so the proposed language model can have larger memory without increasing the number of parameters. In experiments, the proposed model yielded lower perplexities than RNN-based language models with LSTM units on both English and Chinese corpora.
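The claim that memory size is independent of the parameter count can be illustrated with a small sketch. The abstract does not specify the controller, so everything here is an assumption for illustration: the class `MemoryReader`, the helper `make_memory`, and the use of dot-product attention with a softmax to read from memory are hypothetical and are not taken from the paper. The point is only that the learnable weights depend on the hidden size and slot width, not on the number of memory slots.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class MemoryReader:
    """Hypothetical read controller (an assumption, not the paper's design).

    It projects the RNN hidden state to a key and attends over memory slots.
    """

    def __init__(self, hidden_size, slot_size, seed=0):
        rng = random.Random(seed)
        # The only learnable parameters: a slot_size x hidden_size projection.
        # Note that the number of memory slots does not appear here.
        self.w_key = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
                      for _ in range(slot_size)]

    def num_parameters(self):
        return sum(len(row) for row in self.w_key)

    def read(self, hidden, memory):
        # Project the hidden state to a key vector of width slot_size.
        key = [sum(w * h for w, h in zip(row, hidden)) for row in self.w_key]
        # Score each slot by its dot product with the key.
        scores = [sum(k * m for k, m in zip(key, slot)) for slot in memory]
        weights = softmax(scores)
        # The read vector is the attention-weighted sum of the slots.
        read_vec = [sum(wt * slot[j] for wt, slot in zip(weights, memory))
                    for j in range(len(key))]
        return read_vec, weights


def make_memory(num_slots, slot_size, seed=1):
    """Build a random memory of num_slots vectors (stand-in for stored content)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(slot_size)]
            for _ in range(num_slots)]


if __name__ == "__main__":
    reader = MemoryReader(hidden_size=4, slot_size=3)
    hidden = [0.5, -0.2, 0.1, 0.9]
    for num_slots in (8, 128):
        _, weights = reader.read(hidden, make_memory(num_slots, 3))
        # The parameter count stays at 12 while the memory grows.
        print(num_slots, reader.num_parameters(), len(weights))
```

In this sketch the same 12 weights read from an 8-slot or a 128-slot memory; only the cost of the attention step grows with the number of slots.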
