Training and Generating Neural Networks in Compressed Weight Space

Kazuki Irie; Jürgen Schmidhuber

Published: 01 Apr 2021, Last Modified: 22 Jun 2025

Venue: Neural Compression Workshop @ ICLR 2021

Readers: Everyone

Keywords: discrete cosine transform, recurrent neural networks, fast weights, language modelling

TL;DR: We revisit RNNs with DCT-encoded weight matrices and their fast weight extension for language modelling.

Abstract: The inputs and/or outputs of some neural nets are weight matrices of other neural nets. Indirect encodings or end-to-end compression of weight matrices could help to scale such approaches. Our goal is to open a discussion on this topic, starting with recurrent neural networks for character-level language modelling whose weight matrices are encoded by the discrete cosine transform. Our fast weight version thereof uses a recurrent neural network to parameterise the compressed weights. We present experimental results on the enwik8 dataset.
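
To make the idea of a DCT-encoded weight matrix concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a small block of low-frequency coefficients is zero-padded and expanded into a full weight matrix by an orthonormal inverse 2D DCT; the helper names (`dct_matrix`, `decode_weights`), the block layout, and the numeric values are illustrative assumptions, since the abstract does not specify them.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix: row k is the k-th cosine basis vector."""
    return [
        [math.sqrt((1.0 if k == 0 else 2.0) / n)
         * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
         for i in range(n)]
        for k in range(n)
    ]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def decode_weights(coeffs, rows, cols):
    """Inverse 2D DCT: expand a small coefficient block into a rows x cols matrix."""
    # Assumption: the coefficients occupy the low-frequency (top-left) corner;
    # all higher-frequency coefficients are fixed to zero.
    padded = [[0.0] * cols for _ in range(rows)]
    for u, row in enumerate(coeffs):
        for v, c in enumerate(row):
            padded[u][v] = c
    # W = C_rows^T @ G @ C_cols, with C the orthonormal DCT-II matrix.
    return matmul(matmul(transpose(dct_matrix(rows)), padded), dct_matrix(cols))

# Four trainable numbers (illustrative values) parameterise an 8 x 8 matrix.
coeffs = [[1.0, 0.5], [-0.3, 0.2]]
W = decode_weights(coeffs, 8, 8)
```

In this sketch, training would update only `coeffs`, while the network uses the decoded `W`. In the fast weight version described in the abstract, a recurrent network would produce the compressed coefficients instead of their being learned directly.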

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/training-and-generating-neural-networks-in/code)
