Analyzing Populations of Neural Networks via Dynamical Model Embedding

Published: 28 Jan 2022, Last Modified: 13 Feb 2023, ICLR 2022 Submitted, Readers: Everyone
Keywords: dynamics, RNNs, model averaging, model clustering, CNNs, semi-supervised learning
Abstract: A core challenge in the interpretation of deep neural networks is identifying commonalities between the underlying algorithms implemented by distinct networks trained for the same task. Motivated by this problem, we introduce DYNAMO, an algorithm that constructs low-dimensional manifolds where each point corresponds to a neural network model, and two points are nearby if the corresponding neural networks enact similar high-level computational processes. DYNAMO takes as input a collection of pre-trained neural networks and outputs a meta-model that emulates the dynamics of the hidden states as well as the outputs of any model in the collection. The specific model to be emulated is determined by a model embedding vector that the meta-model takes as input; these model embedding vectors constitute a manifold corresponding to the given population of models. We apply DYNAMO to both RNNs and CNNs, and find that the resulting model embedding manifolds enable novel applications: clustering of neural networks on the basis of their high-level computational processes in a manner that is less sensitive to reparameterization; model averaging of several neural networks trained on the same task to arrive at a new, operable neural network with similar task performance; and semi-supervised learning via optimization on the model embedding manifold. Using a fixed-point analysis of meta-models trained on populations of RNNs, we gain new insights into how similarities of the topology of RNN dynamics correspond to similarities of their high-level computational processes.
One-sentence Summary: We introduce DYNAMO, an algorithm that maps a set of neural network base models to a low-dimensional feature space that captures computational features of the networks.
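
The interface described in the abstract can be illustrated with a toy sketch: one shared function that takes a model embedding vector together with the current hidden state and input, and returns the next hidden state. This is not the authors' architecture or training procedure. The names `make_meta_model`, `meta_step`, `average_embeddings`, and `rollout` are hypothetical, the weights are random rather than trained, and the single tanh layer is an assumption chosen only to keep the example short.

```python
import math
import random


def make_meta_model(embed_dim, hidden_dim, input_dim, seed=0):
    """Return random (untrained) weights for a toy meta-model (hypothetical)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_theta": mat(hidden_dim, embed_dim),  # conditions on the embedding
        "W_h": mat(hidden_dim, hidden_dim),     # recurrent weights
        "W_x": mat(hidden_dim, input_dim),      # input weights
    }


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def meta_step(weights, theta, h, x):
    """One emulated hidden-state update, conditioned on embedding theta."""
    a = matvec(weights["W_theta"], theta)
    b = matvec(weights["W_h"], h)
    c = matvec(weights["W_x"], x)
    return [math.tanh(ai + bi + ci) for ai, bi, ci in zip(a, b, c)]


def average_embeddings(thetas):
    """Model averaging in embedding space: the coordinate-wise mean."""
    n = len(thetas)
    return [sum(col) / n for col in zip(*thetas)]


def rollout(weights, theta, inputs, hidden_dim):
    """Run the emulated dynamics over an input sequence from a zero state."""
    h = [0.0] * hidden_dim
    for x in inputs:
        h = meta_step(weights, theta, h, x)
    return h


weights = make_meta_model(embed_dim=2, hidden_dim=3, input_dim=1)
theta_a, theta_b = [1.0, 0.0], [0.0, 1.0]  # two points on the toy manifold
theta_avg = average_embeddings([theta_a, theta_b])
final_state = rollout(weights, theta_avg, [[0.5], [-0.2], [0.1]], hidden_dim=3)
```

Because the same `meta_step` serves every embedding vector, the averaged vector `theta_avg` defines a model that can be run directly, which is the sense in which the abstract calls an averaged network "operable". Whether such a model performs the task well depends on training the meta-model, which this sketch omits.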