Abstract: We improve previous end-to-end differentiable neural networks (NNs) with fast
weight memories. A gate mechanism updates fast weights at every time step of
a sequence through two separate outer-product-based matrices generated by slow
parts of the net. The system is trained on a complex sequence-to-sequence variation
of the Associative Retrieval Problem with roughly 70 times more temporal
memory (i.e. time-varying variables) than similar-sized standard recurrent NNs
(RNNs). In terms of accuracy and number of parameters, our architecture outperforms
a variety of RNNs, including Long Short-Term Memory, Hypernetworks,
and related fast weight architectures.
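To make the gated outer-product update concrete, here is a minimal NumPy sketch. The specific cell structure, the projection matrices (W_x, W_s, W_ga, W_gb, W_ha, W_hb), and the convex-combination rule F_t = (1 - G_t) * F_{t-1} + G_t * H_t are illustrative assumptions, not the paper's exact equations; the sketch only shows how two outer-product matrices generated by the slow net can gate and rewrite a fast weight matrix holding O(d^2) time-varying variables, far more than the d variables of the hidden state itself.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d = 8, 16  # input size and hidden/fast-weight size (illustrative)

# "Slow" weights, trained by gradient descent in the full model.
W_x = rng.standard_normal((d, d_in)) * 0.1
W_s = rng.standard_normal((d, d)) * 0.1
# Projections whose outer products form the gate G_t and candidate H_t.
W_ga, W_gb, W_ha, W_hb = (rng.standard_normal((d, d)) * 0.1 for _ in range(4))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def step(F, s, x):
    """One time step of a gated fast-weight cell (sketch).

    F is the fast weight matrix: d*d time-varying variables,
    versus only d in the hidden state s. The gate G and the
    candidate H are rank-1 outer products generated by the slow net."""
    s = np.tanh(W_x @ x + W_s @ s + F @ s)     # read through fast weights
    G = sigmoid(np.outer(W_ga @ s, W_gb @ s))  # outer-product gate matrix
    H = np.tanh(np.outer(W_ha @ s, W_hb @ s))  # outer-product update matrix
    F = (1.0 - G) * F + G * H                  # gated fast-weight update
    return F, s

# Run the cell on a toy input sequence.
F, s = np.zeros((d, d)), np.zeros(d)
for x in rng.standard_normal((5, d_in)):
    F, s = step(F, s, x)
print(s.shape, F.shape)  # (16,) (16, 16)
```

With d = 16 the fast weight matrix carries 256 time-varying variables against the hidden state's 16, which is the kind of gap behind the "roughly 70 times more temporal memory" claim; the actual ratio in the paper depends on its layer sizes.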
TL;DR: An improved Fast Weight network that shows better results on a general toy task.
Keywords: fast weights, RNN, associative retrieval, time-varying variables