Keywords: Model Zoo, Population, Neural Networks, Model Analysis
TL;DR: To enable the investigation of populations of neural network models, we release a novel dataset of diverse model zoos with this work.
Abstract: In recent years, neural networks (NNs) have evolved from laboratory environments to the state of the art for many real-world problems. It has been shown that NN models (i.e., their weights and biases) evolve on unique trajectories in weight space during training. Consequently, a population of such neural network models (referred to as a model zoo) would form structures in weight space. We think that the geometry, curvature, and smoothness of these structures contain information about the state of training and can reveal latent properties of individual models. With such model zoos, one could investigate novel approaches for (i) model analysis, (ii) discovering unknown learning dynamics, (iii) learning rich representations of such populations, or (iv) exploiting the model zoos for generative modelling of NN weights and biases. Unfortunately, the lack of standardized model zoos and available benchmarks significantly increases the friction for further research on populations of NNs. With this work, we publish a novel dataset of model zoos containing systematically generated and diverse populations of NN models for further research. In total, the proposed model zoo dataset is based on eight image datasets, consists of 27 model zoos trained with varying hyperparameter combinations, and includes 50,360 unique NN models as well as their sparsified twins, resulting in over 3,844,360 collected model states. In addition to the model zoo data, we provide an in-depth analysis of the zoos and benchmarks for multiple downstream tasks. The dataset can be found at www.modelzoos.cc.
Supplementary Material: pdf
Dataset Url: A landing page can be found at www.modelzoos.cc. Raw and preprocessed datasets are linked there. Code to reproduce, alter, or extend the zoos is available, as is code to load and preprocess the raw zoos and to reproduce the benchmark results. The landing page will be continuously extended with further documentation and analysis. The datasets are hosted on Zenodo to ensure long-term availability.
License: The dataset is publicly available and licensed under the Creative Commons Attribution 4.0 International license (CC-BY 4.0).
Author Statement: Yes
Contribution Process Agreement: Yes
In Person Attendance: Yes