Abstract: Machine learning models are usually tuned by nesting optimization of model weights inside the optimization of hyperparameters. We give a method to collapse this nested optimization into joint stochastic optimization of both weights and hyperparameters. Our method trains a neural network to output approximately optimal weights as a function of hyperparameters. We show that our method converges to locally optimal weights and hyperparameters for sufficiently large hypernets. We compare this method to standard hyperparameter optimization strategies and demonstrate its effectiveness for tuning thousands of hyperparameters.
TL;DR: We train a neural network to output approximately optimal weights as a function of hyperparameters.
Keywords: hypernetworks, hyperparameter optimization, metalearning, neural networks, Bayesian optimization, game theory, optimization
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/stochastic-hyperparameter-optimization/code)
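For a quick sense of the idea, here is a minimal sketch (not the authors' code) of joint stochastic optimization of a hypernetwork and a hyperparameter: the hypernetwork maps a hyperparameter to approximately optimal model weights, and training alternates between fitting the hypernetwork on the training loss and updating the hyperparameter on the validation loss through the hypernetwork. The toy linear-regression task, the `HyperNet` class, and all variable names are illustrative assumptions.

```python
# Hedged sketch: jointly optimize a hypernetwork and an L2 hyperparameter (lambda).
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy train / validation data for linear regression (placeholder data).
x_tr, y_tr = torch.randn(100, 5), torch.randn(100, 1)
x_va, y_va = torch.randn(100, 5), torch.randn(100, 1)

class HyperNet(nn.Module):
    """Maps a (log-)hyperparameter to the weights of a linear model."""
    def __init__(self, n_weights=5):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, n_weights))

    def forward(self, log_lam):
        return self.net(log_lam)  # shape: (1, n_weights)

hyper = HyperNet()
log_lam = torch.zeros(1, 1, requires_grad=True)      # hyperparameter, optimized jointly
w_opt = torch.optim.Adam(hyper.parameters(), lr=1e-3)
h_opt = torch.optim.Adam([log_lam], lr=1e-2)

def train_loss(w, lam):
    # Regularized training objective for the weights predicted by the hypernet.
    pred = x_tr @ w.t()
    return ((pred - y_tr) ** 2).mean() + lam * (w ** 2).sum()

for step in range(2000):
    # (1) Hypernet step: sample lambda near its current value and fit
    #     approximately optimal weights for that sampled value.
    noisy = (log_lam + 0.1 * torch.randn_like(log_lam)).detach()
    w = hyper(noisy)
    w_opt.zero_grad()
    train_loss(w, noisy.exp()).backward()
    w_opt.step()

    # (2) Hyperparameter step: lower the validation loss by differentiating
    #     through the hypernetwork's weight prediction.
    w = hyper(log_lam)
    val_loss = ((x_va @ w.t() - y_va) ** 2).mean()
    h_opt.zero_grad()
    val_loss.backward()
    h_opt.step()

print("tuned lambda:", log_lam.exp().item())
```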