Multiplicative LSTM for sequence modelling
Ben Krause, Iain Murray, Steve Renals, Liang Lu
Feb 16, 2017 (modified: Feb 16, 2017) · ICLR 2017 workshop submission · readers: everyone
Abstract: We introduce multiplicative LSTM (mLSTM), a novel recurrent neural network
architecture for sequence modelling that combines the long short-term memory
(LSTM) and multiplicative recurrent neural network architectures. mLSTM is
characterised by its ability to have different recurrent transition functions for each
possible input, which we argue makes it more expressive for autoregressive density
estimation. We demonstrate empirically that mLSTM outperforms standard LSTM
and its deep variants for a range of character level modelling tasks, and that this
improvement increases with the complexity of the task. This model achieves a
test error of 1.19 bits/character on the last 4 million characters of the Hutter prize
dataset when combined with dynamic evaluation.
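The core idea described above, that the recurrent transition depends on the current input, can be sketched as follows. This is a minimal NumPy illustration of the mLSTM formulation (an LSTM whose gates read from a multiplicative intermediate state m = (Wmx x) ⊙ (Wmh h)), not the authors' reference implementation; all weight names and shapes here are assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mlstm_step(x, h_prev, c_prev, params):
    """One mLSTM step (illustrative sketch).

    The multiplicative state m replaces h_prev as the recurrent input
    to the gates, giving a different effective transition per input.
    """
    Wmx, Wmh, Wx, Wm, b = params        # Wx/Wm/b stack the i, f, o, g gates
    m = (Wmx @ x) * (Wmh @ h_prev)      # input-dependent intermediate state
    z = Wx @ x + Wm @ m + b             # pre-activations for all four gates
    H = h_prev.shape[0]
    i = sigmoid(z[:H])                  # input gate
    f = sigmoid(z[H:2*H])               # forget gate
    o = sigmoid(z[2*H:3*H])             # output gate
    g = np.tanh(z[3*H:])                # candidate cell update
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c
```

For character-level modelling, x would be a one-hot vector over the character vocabulary, and the step would be applied across the sequence while feeding h into a softmax output layer.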
TL;DR: Combines LSTM and multiplicative RNN architectures; achieves 1.19 bits/character on the Hutter prize dataset with dynamic evaluation.
Keywords: Deep learning, Unsupervised Learning, Natural language processing