Hyperion: Fused Multi-Trial and Gradient Descent for Joint Hyperparameter and Neural Architecture Optimization

22 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: optimization
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: AutoML, Hyperparameter Optimization, Neural Architecture Search
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We consider the fusion of multi-trial optimizers and gradient-descent-based one-shot algorithms to jointly optimize neural network hyperparameters and architectures.
Abstract: We consider the fusion of multi-trial optimizers and gradient-descent-based one-shot algorithms to jointly optimize neural network hyperparameters and architectures. To combine the strengths of optimizers from both categories, we propose Hyperion, which distributes the search parameters across the involved optimizers, efficiently samples sub-search-spaces to reduce the exploration cost of one-shot algorithms, and orchestrates the co-optimization of both hyperparameters and network architectures. We demonstrate on open and industrial datasets that Hyperion outperforms non-fused optimization algorithms on the optimized metrics, while significantly reducing the GPU resources required by one-shot algorithms.
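To make the fused scheme described in the abstract concrete, the following is a minimal, hypothetical sketch (not the authors' code): an outer multi-trial loop samples hyperparameters together with a reduced architecture sub-search-space, and an inner gradient-based one-shot step is restricted to that subspace. All names (sample_trial, one_shot_search, the example search space) are illustrative assumptions.

```python
# Hypothetical sketch of a fused multi-trial / one-shot loop; assumptions only.
import random

SEARCH_SPACE = {
    "lr": [1e-4, 1e-3, 1e-2],                           # hyperparameters -> multi-trial optimizer
    "ops": ["conv3x3", "conv5x5", "skip", "maxpool"],    # architecture choices -> one-shot search
}

def sample_trial(space):
    """Multi-trial step: pick hyperparameters and a smaller architecture subspace."""
    hparams = {"lr": random.choice(space["lr"])}
    subspace = random.sample(space["ops"], k=2)          # a smaller subspace reduces one-shot cost
    return hparams, subspace

def one_shot_search(hparams, subspace):
    """Placeholder for a gradient-based one-shot search restricted to `subspace`."""
    # A real implementation would train a weight-sharing supernet with these hyperparameters.
    arch = random.choice(subspace)                       # stand-in for the derived architecture
    score = random.random()                              # stand-in for validation accuracy
    return arch, score

best = None
for trial in range(8):                                   # outer multi-trial loop
    hparams, subspace = sample_trial(SEARCH_SPACE)
    arch, score = one_shot_search(hparams, subspace)
    if best is None or score > best[0]:
        best = (score, hparams, arch)

print("best (score, hyperparameters, architecture):", best)
```

In this sketch the multi-trial optimizer owns the hyperparameters and the subspace selection, while the one-shot step only ever sees a fraction of the architecture space, which is the cost-reduction mechanism the abstract alludes to.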
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5869