Hyperion: Fused Multi-Trial and Gradient Descent for Joint Hyperparameter and Neural Architecture Optimization

22 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: optimization
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: AutoML, Hyperparameter Optimization, Neural Architecture Search
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We consider the fusion of multi-trial optimizers and gradient-descent-based one-shot algorithms to jointly optimize neural network hyperparameters and architectures.
Abstract: We consider the fusion of multi-trial optimizers and gradient-descent-based one-shot algorithms to jointly optimize neural network hyperparameters and architectures. To combine the strengths of optimizers from both categories, we propose Hyperion, which distributes the search parameters across the involved optimizers, efficiently samples sub-search-spaces to reduce the exploration cost of one-shot algorithms, and orchestrates the co-optimization of both hyperparameters and network architectures. We demonstrate on open and industrial datasets that Hyperion outperforms non-fused optimization algorithms on the optimized metrics, while significantly reducing the GPU resources required by one-shot algorithms.
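To make the fused scheme described in the abstract concrete, the following is a minimal, hypothetical sketch (not the authors' code): an outer multi-trial loop samples hyperparameters together with a reduced architecture sub-search-space, and an inner gradient-based one-shot step is restricted to that subspace. All names (sample_trial, one_shot_search, the example search space) are illustrative assumptions.

```python
# Hypothetical sketch of a fused multi-trial / one-shot loop; assumptions only.
import random

SEARCH_SPACE = {
    "lr": [1e-4, 1e-3, 1e-2],                           # hyperparameters -> multi-trial optimizer
    "ops": ["conv3x3", "conv5x5", "skip", "maxpool"],    # architecture choices -> one-shot search
}

def sample_trial(space):
    """Multi-trial step: pick hyperparameters and a smaller architecture subspace."""
    hparams = {"lr": random.choice(space["lr"])}
    subspace = random.sample(space["ops"], k=2)          # a smaller subspace reduces one-shot cost
    return hparams, subspace

def one_shot_search(hparams, subspace):
    """Placeholder for a gradient-based one-shot search restricted to `subspace`."""
    # A real implementation would train a weight-sharing supernet with these hyperparameters.
    arch = random.choice(subspace)                       # stand-in for the derived architecture
    score = random.random()                              # stand-in for validation accuracy
    return arch, score

best = None
for trial in range(8):                                   # outer multi-trial loop
    hparams, subspace = sample_trial(SEARCH_SPACE)
    arch, score = one_shot_search(hparams, subspace)
    if best is None or score > best[0]:
        best = (score, hparams, arch)

print("best (score, hyperparameters, architecture):", best)
```

In this sketch the multi-trial optimizer owns the hyperparameters and the subspace selection, while the one-shot step only ever sees a fraction of the architecture space, which is the cost-reduction mechanism the abstract alludes to.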
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5869