Abstract: We propose an efficient online hyperparameter optimization method that uses a joint dynamical system to evaluate the gradient with respect to the hyperparameters. While similar methods are usually limited to hyperparameters with a smooth impact on the model, we show how to apply our method to the dropout probability in neural networks. Finally, we show its effectiveness on two distinct tasks.
TL;DR: An algorithm for optimizing regularization hyper-parameters during training
Keywords: hyper-parameters, optimization
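
To make the idea of a joint dynamical system concrete, here is a minimal sketch of one common way to obtain an online hypergradient: the parameters and their derivative with respect to the hyperparameter are updated together at every training step. This is an illustration under stated assumptions, not the authors' algorithm. It uses a single L2 penalty `lam` on a small synthetic linear regression as a smooth stand-in and does not cover the dropout case. All names (`make_data`, `joint_step`, `run`, `hyper_lr`) are hypothetical.

```python
import numpy as np

def make_data(seed=0, n=40, d=5, noise=0.5):
    # Synthetic regression problem (an assumption made for illustration only).
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d)
    X_tr, X_val = rng.normal(size=(n, d)), rng.normal(size=(n, d))
    y_tr = X_tr @ w_true + noise * rng.normal(size=n)
    y_val = X_val @ w_true + noise * rng.normal(size=n)
    return X_tr, y_tr, X_val, y_val

def val_loss(theta, X_val, y_val):
    # Validation loss: half the mean squared error, with no penalty term.
    return 0.5 * np.mean((X_val @ theta - y_val) ** 2)

def val_grad(theta, X_val, y_val):
    # Gradient of val_loss with respect to theta.
    return X_val.T @ (X_val @ theta - y_val) / len(y_val)

def joint_step(theta, z, lam, X, y, eta):
    # One gradient step on the penalized training loss
    #   0.5 * mean((X theta - y)^2) + 0.5 * lam * ||theta||^2,
    # together with the matching update of z = d(theta)/d(lam).
    n = len(y)
    grad = X.T @ (X @ theta - y) / n + lam * theta
    hess_z = X.T @ (X @ z) / n  # Hessian-vector product of the data term
    # Differentiating theta - eta * grad with respect to lam gives:
    z_new = z - eta * (hess_z + lam * z + theta)
    theta_new = theta - eta * grad
    return theta_new, z_new

def run(lam, data, steps=200, eta=0.05, hyper_lr=0.0):
    # With hyper_lr == 0, lam stays fixed and z is the exact derivative of
    # the final theta. With hyper_lr > 0, lam changes during training, so z
    # is only an approximation of that derivative (the online setting).
    X_tr, y_tr, X_val, y_val = data
    theta = np.zeros(X_tr.shape[1])
    z = np.zeros_like(theta)
    for _ in range(steps):
        theta, z = joint_step(theta, z, lam, X_tr, y_tr, eta)
        hypergrad = val_grad(theta, X_val, y_val) @ z  # chain rule
        lam = max(0.0, lam - hyper_lr * hypergrad)     # keep lam non-negative
    return theta, z, lam

if __name__ == "__main__":
    data = make_data()
    theta, z, lam = run(0.1, data, hyper_lr=0.01)
    print("final lam:", lam, "validation loss:", val_loss(theta, data[2], data[3]))
```

With `hyper_lr=0.0` the recursion can be checked against a finite-difference estimate of the validation loss as a function of `lam`, which is a useful sanity test before turning on the online update.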