Keywords: ProcGen, self-referential weight matrix, fast weight programmers, linear Transformers
TL;DR: We propose a scalable self-referential weight matrix that uses outer products and the delta update rule to modify itself.
Abstract: The weight matrix (WM) of a neural network (NN) is its program. The programs of many traditional NNs are learned through gradient descent on some error function, then remain fixed. The WM or program of a self-referential NN, however, can keep rapidly modifying all of itself at runtime. In principle, such NNs can meta-learn to learn, and meta-meta-learn to meta-learn to learn, and so on, in the sense of recursive self-improvement. Here we revisit such NNs, building upon recent successes of fast weight programmers (FWPs) and closely related linear Transformers. We propose a scalable self-referential WM (SRWM) that uses self-generated training patterns, outer products and the delta update rule to modify itself. We evaluate our SRWM in a multi-task reinforcement learning setting with procedurally generated ProcGen game environments. Our experiments demonstrate both the practical applicability and the competitive performance of the SRWM. Our code is public.
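The self-modification mechanism summarized in the abstract (a single weight matrix that emits its own key, query, and learning rate, then rewrites itself with an outer-product delta rule) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the slot sizes, the sigmoid on the learning rate, and the omission of the key/query normalization used in the actual model are all simplifications, not the paper's exact formulation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

d_in, d_y = 8, 4               # illustrative dimensions (assumption)
d_out = d_y + d_in + d_in + 1  # output slots: y, key k, query q, beta

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in)) * 0.1  # the self-referential WM

def srwm_step(W, x):
    """One step: W maps input x to an output y plus its own
    self-modification ingredients, then updates itself."""
    out = W @ x
    y = out[:d_y]
    k = out[d_y:d_y + d_in]               # self-generated key
    q = out[d_y + d_in:d_y + 2 * d_in]    # self-generated query
    beta = sigmoid(out[-1])               # self-generated learning rate

    v_bar = W @ k  # value currently associated with key k
    v = W @ q      # self-generated target value ("training pattern")
    # Delta update rule: move the value stored at k toward v,
    # applied as a rank-one outer-product update to W itself.
    W_next = W + beta * np.outer(v - v_bar, k)
    return y, W_next

x = rng.standard_normal(d_in)
y, W_next = srwm_step(W, x)
```

Note the key property: the same matrix `W` both computes the output and generates every ingredient of its own update, so there is no separate slow network programming a fast one.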
Supplementary Material: zip
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:2202.05780/code)