Abstract: Machine learning models are usually tuned by nesting optimization of model weights inside the optimization of hyperparameters. We give a method to collapse this nested optimization into joint stochastic optimization of both weights and hyperparameters. Our method trains a neural network to output approximately optimal weights as a function of hyperparameters. We show that our method converges to locally optimal weights and hyperparameters for sufficiently large hypernets. We compare this method to standard hyperparameter optimization strategies and demonstrate its effectiveness for tuning thousands of hyperparameters.
TL;DR: We train a neural network to output approximately optimal weights as a function of hyperparameters.
Keywords: hypernetworks, hyperparameter optimization, metalearning, neural networks, Bayesian optimization, game theory, optimization
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/stochastic-hyperparameter-optimization/code)
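For a quick sense of the idea, here is a minimal sketch (not the authors' code) of joint stochastic optimization of a hypernetwork and a hyperparameter: the hypernetwork maps a hyperparameter to approximately optimal model weights, and training alternates between fitting the hypernetwork on the training loss and updating the hyperparameter on the validation loss through the hypernetwork. The toy linear-regression task, the `HyperNet` class, and all variable names are illustrative assumptions.

```python
# Hedged sketch: jointly optimize a hypernetwork and an L2 hyperparameter (lambda).
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy train / validation data for linear regression (placeholder data).
x_tr, y_tr = torch.randn(100, 5), torch.randn(100, 1)
x_va, y_va = torch.randn(100, 5), torch.randn(100, 1)

class HyperNet(nn.Module):
    """Maps a (log-)hyperparameter to the weights of a linear model."""
    def __init__(self, n_weights=5):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, n_weights))

    def forward(self, log_lam):
        return self.net(log_lam)  # shape: (1, n_weights)

hyper = HyperNet()
log_lam = torch.zeros(1, 1, requires_grad=True)      # hyperparameter, optimized jointly
w_opt = torch.optim.Adam(hyper.parameters(), lr=1e-3)
h_opt = torch.optim.Adam([log_lam], lr=1e-2)

def train_loss(w, lam):
    # Regularized training objective for the weights predicted by the hypernet.
    pred = x_tr @ w.t()
    return ((pred - y_tr) ** 2).mean() + lam * (w ** 2).sum()

for step in range(2000):
    # (1) Hypernet step: sample lambda near its current value and fit
    #     approximately optimal weights for that sampled value.
    noisy = (log_lam + 0.1 * torch.randn_like(log_lam)).detach()
    w = hyper(noisy)
    w_opt.zero_grad()
    train_loss(w, noisy.exp()).backward()
    w_opt.step()

    # (2) Hyperparameter step: lower the validation loss by differentiating
    #     through the hypernetwork's weight prediction.
    w = hyper(log_lam)
    val_loss = ((x_va @ w.t() - y_va) ** 2).mean()
    h_opt.zero_grad()
    val_loss.backward()
    h_opt.step()

print("tuned lambda:", log_lam.exp().item())
```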