Keywords: optimizer benchmarking
TL;DR: We propose a new benchmarking framework to evaluate various optimizers.
Abstract: Many optimizers have been proposed for training deep neural networks, and they often have multiple hyperparameters, which makes benchmarking their performance tricky. In this work, we propose a new benchmarking protocol that evaluates both end-to-end efficiency (training a model from scratch without knowing the best hyperparameter configuration) and data-addition training efficiency (periodically re-training the model on newly collected data using the previously selected hyperparameters). For end-to-end efficiency, unlike previous work that assumes random hyperparameter search, which may over-emphasize tuning time, we propose to evaluate with a bandit hyperparameter tuning strategy.
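The abstract does not spell out which bandit tuning strategy is used; a common instance of this family is successive halving, which spends a small budget on every configuration, then repeatedly doubles the budget while keeping only the best-performing fraction. The sketch below is an illustration of that idea under these assumptions, not the paper's exact protocol; the `evaluate` function and the candidate configurations are hypothetical stand-ins.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Bandit-style hyperparameter tuning via successive halving.

    configs:   list of candidate hyperparameter configurations (any hashable objects)
    evaluate:  callable(config, budget) -> validation loss (lower is better);
               a larger budget means more training epochs/steps
    min_budget: budget given to every configuration in the first round
    eta:        elimination factor; each round keeps the top 1/eta survivors
    """
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Score every surviving configuration at the current budget.
        scored = sorted((evaluate(c, budget), c) for c in survivors)
        # Keep the best 1/eta fraction (at least one) and raise the budget.
        survivors = [c for _, c in scored[: max(1, len(survivors) // eta)]]
        budget *= eta
    return survivors[0]


# Toy usage: pretend the "loss" is the distance of a scalar config from 3,
# independent of budget, so the tuner should select config 3.
best = successive_halving([0, 1, 2, 3, 4, 5], lambda c, b: abs(c - 3))
```

Compared with random search, this allocation stops clearly bad configurations early, which is why the abstract argues that assuming random tuning can over-emphasize tuning time.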
Supplementary Material: zip