Abstract: Hyperparameters of deep neural networks are often optimized by grid search, random search or Bayesian optimization.
As an alternative, we propose to use the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), which is known for its state-of-the-art performance in derivative-free optimization. CMA-ES has useful invariance properties and lends itself naturally to parallel evaluation of candidate solutions. We provide a toy usage example in which CMA-ES tunes the hyperparameters of a convolutional neural network for the MNIST dataset on 30 GPUs in parallel.
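The abstract itself includes no code; as an illustration only, below is a minimal sketch of what such a tuning loop could look like using the `pycma` package (`pip install cma`), whose ask-and-tell interface maps directly onto parallel evaluation of a population. The two-dimensional log-space search domain and the quadratic stand-in objective are illustrative assumptions, not the paper's actual experimental setup.

```python
import math
import cma

def validation_error(hparams):
    """Placeholder objective: in the paper's setting this would train the CNN
    on MNIST with the decoded hyperparameters and return validation error."""
    log_lr, log_batch = hparams
    # Toy quadratic surrogate standing in for a real training run.
    return (log_lr + 3.0) ** 2 + (log_batch - 5.0) ** 2

# Search in log-space so multiplicative hyperparameters (learning rate,
# batch size) look roughly isotropic to CMA-ES.
# popsize=30 would correspond to one candidate network per GPU.
es = cma.CMAEvolutionStrategy([-4.0, 4.0], 1.0, {'popsize': 30})
while not es.stop():
    candidates = es.ask()        # sample a population of hyperparameter vectors
    # Each candidate could be trained on its own GPU; evaluated serially here.
    losses = [validation_error(x) for x in candidates]
    es.tell(candidates, losses)  # adapt mean, step size and covariance

best_log_lr, best_log_batch = es.result.xbest
print('learning rate: %.3g  batch size: %d'
      % (math.exp(best_log_lr), round(math.exp(best_log_batch))))
```

Because `ask` returns the whole population before any `tell`, the per-candidate training runs are independent and can be dispatched to separate GPUs, which is the parallelism the abstract refers to.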
Conflicts: uni-freiburg.de, inria.fr, epfl.ch