- Abstract: Recurrent neural nets are widely used for predicting temporal data. Their inherent deep feedforward structure allows learning complex sequential patterns. It is believed that top-down feedback might be an important missing ingredient which in theory could help disambiguate similar patterns depending on broader context. In this paper, we introduce surprisal-driven recurrent networks, which take into account past error information when making new predictions. This is achieved by continuously monitoring the discrepancy between most recent predictions and the actual observations. Furthermore, we show that it outperforms other stochastic and fully deterministic approaches on enwik8 character level prediction task achieving 1.37 BPC.
- TL;DR: In this paper, we add surprisal as additional input to RNN , which take into account past error information when making new predictions. We extend SOTA on character-level language modelling, achieving 1.37 bits/char on wikipedia dataset.
- Conflicts: ibm.com
- Keywords: Unsupervised Learning, Applications, Deep learning