Recycling sub-optimal Hyperparameter Optimization models to generate efficient Ensemble Deep Learning

Sep 28, 2020 (edited Aug 02, 2021) · ICLR 2021 Conference Blind Submission · Readers: Everyone
  • Reviewed Version (pdf): https://openreview.net/references/pdf?id=UrWRbEMWck
  • Keywords: Deep Learning, hyperparameter optimization, ensemble deep learning, multi-GPU
  • Abstract: Ensemble Deep Learning improves accuracy over a single model by combining predictions from multiple models. It has established itself as the core strategy for tackling the most difficult problems, such as winning Kaggle challenges. Because there is no consensus on how to design a successful deep learning ensemble, we introduce Hyperband-Dijkstra, a new workflow that automatically explores neural network designs with Hyperband and efficiently combines them with Dijkstra's algorithm. This workflow has the same training cost as a standard Hyperband run, except that sub-optimal solutions are stored and become candidates in the ensemble selection step (recycling). Next, to predict on new data, the user gives Dijkstra's algorithm the maximum number of models wanted in the ensemble, which controls the tradeoff between accuracy and inference time. Hyperband is a very efficient algorithm that allocates exponentially more resources to the most promising configurations. It is also capable of proposing diverse models due to its pure-exploration nature, which allows Dijkstra's algorithm, through a smart combination of diverse models, to achieve a strong reduction in variance and bias. The exploding number of possible combinations of models generated by Hyperband increases the probability that Dijkstra's algorithm finds an accurate combination that fits the dataset and generalizes to new data. Two experiments, on CIFAR100 and on our unbalanced microfossils dataset, show that our new workflow generates an ensemble far more accurate than any other ensemble of ResNet models from ResNet18 to ResNet152.
  • One-sentence Summary: We propose a new workflow that automatically explores neural network designs and efficiently combines them to generate deep learning ensembles far more accurate than any other ensemble of ResNet models from ResNet18 to ResNet152.
  • Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
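
The recycling idea in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: it runs a single successive-halving bracket (Hyperband itself runs several brackets with different budget trade-offs), `train_eval` and `toy_train_eval` are hypothetical stand-ins for real training, and the ensemble is built by greedy forward selection capped at `max_models`. The greedy step is only a stand-in for the Dijkstra-based selection, whose graph construction the abstract does not describe.

```python
import math
import random


def log_loss(probs, labels):
    """Mean negative log-likelihood of the true class."""
    eps = 1e-12
    return -sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels)) / len(labels)


def average(prob_sets):
    """Average the per-sample class probabilities of several models."""
    n = len(prob_sets)
    return [[sum(ps[i][c] for ps in prob_sets) / n
             for c in range(len(prob_sets[0][i]))]
            for i in range(len(prob_sets[0]))]


def successive_halving(configs, train_eval, min_budget=1, eta=3):
    """Run one bracket and keep every trained model, not only the winner.

    train_eval(config, budget) -> (val_loss, val_probs) is a hypothetical callback.
    """
    library = []                      # recycled pool, sub-optimal models included
    survivors = list(configs)
    budget = min_budget
    while survivors:
        scored = []
        for cfg in survivors:
            loss, probs = train_eval(cfg, budget)
            scored.append((loss, cfg))
            library.append((cfg, budget, loss, probs))
        if len(survivors) == 1:
            break
        scored.sort(key=lambda t: t[0])
        keep = max(1, len(scored) // eta)
        survivors = [cfg for _, cfg in scored[:keep]]
        budget *= eta                 # survivors get exponentially more resources
    return library


def greedy_select(library, labels, max_models):
    """Greedy forward selection (assumed stand-in for the Dijkstra step)."""
    chosen, best = [], float("inf")
    while len(chosen) < max_models:
        candidates = [
            (log_loss(average([library[j][3] for j in chosen + [i]]), labels), i)
            for i in range(len(library)) if i not in chosen
        ]
        if not candidates:
            break
        loss, idx = min(candidates)
        if loss >= best:              # stop when no model improves validation loss
            break
        chosen.append(idx)
        best = loss
    return chosen, best


labels = [0, 1, 0, 1]


def toy_train_eval(cfg, budget):
    # Hypothetical stand-in for training: predictions sharpen with budget.
    rng = random.Random(cfg * 1000 + budget)
    conf = min(0.95, 0.5 + 0.05 * budget + 0.02 * cfg)
    probs = []
    for y in labels:
        p = min(0.99, max(0.01, conf + rng.uniform(-0.2, 0.2)))
        probs.append([1 - p, p] if y == 1 else [p, 1 - p])
    return log_loss(probs, labels), probs


library = successive_halving(range(9), toy_train_eval, min_budget=1, eta=3)
chosen, ensemble_loss = greedy_select(library, labels, max_models=3)
```

In this toy run the pool holds every model trained at every budget level, so the selection step can pick models that the bracket discarded. The `max_models` cap plays the role of the user-supplied ensemble size that trades accuracy against inference time.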
5 Replies
