A Gradient-based Architecture HyperParameter Optimization Approach

Zechun Liu; Xiangyu Zhang; Zhe Li; Yichen Wei; Kwang-Ting Cheng; Jian Sun

25 Sept 2019 (modified: 05 May 2023) | ICLR 2020 Conference Withdrawn Submission | Readers: Everyone

Abstract: Network hyperparameters, such as network depth, layer-wise channel numbers, and input image resolution, are crucial for designing high-performance neural network architectures in resource-limited scenarios. Previous solutions either optimize these hyperparameters with customized algorithms or enumerate them over a confined set of choices. With these methods, obtaining a good solution is laborious and cumbersome. In this work, we propose a gradient-based approach that optimizes these hyperparameters in an efficient and unified manner, based on the observation that they take consecutive values and that network performance changes continuously with them. Specifically, natural evolution strategies (NES) are used to approximate the gradient with respect to the non-differentiable architecture hyperparameters, and we incorporate this estimate into the gradient descent framework to jointly optimize the weights and the architecture hyperparameters. Compared to the state-of-the-art method, ChamNet, our method achieves higher accuracy at a much lower optimization time cost. Our method easily surpasses state-of-the-art methods and improves accuracy by up to 9.1%/6.1% over the compact models MobileNet v1/v2.

Keywords: gradient-based, neural architecture search, architecture hyperparameter optimization
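
The NES idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_loss`, `nes_gradient`, `optimize`, the target values, and all default settings are hypothetical names and numbers chosen for illustration. The sketch assumes a smooth toy loss standing in for validation loss, and it updates only the hyperparameter vector, whereas the paper jointly optimizes network weights as well. The gradient is estimated from loss evaluations alone, using antithetic Gaussian perturbations, and then plugged into plain gradient descent.

```python
import numpy as np


def toy_loss(theta):
    """Hypothetical smooth stand-in for validation loss as a function of
    continuous architecture hyperparameters (e.g. depth, width, resolution)."""
    target = np.array([3.0, 1.5, 2.0])  # made-up optimum, illustration only
    return float(np.sum((theta - target) ** 2))


def nes_gradient(loss_fn, theta, sigma=0.1, num_pairs=25, rng=None):
    """Estimate d(loss)/d(theta) using only loss evaluations.

    Each iteration draws a Gaussian direction eps and evaluates the loss at
    theta + sigma*eps and theta - sigma*eps (antithetic sampling); the loss
    difference weights that direction in the averaged estimate.
    """
    rng = np.random.default_rng() if rng is None else rng
    grad = np.zeros_like(theta, dtype=float)
    for _ in range(num_pairs):
        eps = rng.standard_normal(theta.shape)
        diff = loss_fn(theta + sigma * eps) - loss_fn(theta - sigma * eps)
        grad += diff * eps
    return grad / (2.0 * sigma * num_pairs)


def optimize(loss_fn, theta0, lr=0.05, steps=200, seed=0):
    """Gradient descent on theta driven by the NES gradient estimate."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    for _ in range(steps):
        theta -= lr * nes_gradient(loss_fn, theta, rng=rng)
    return theta


if __name__ == "__main__":
    # Start away from the made-up optimum and print the optimized vector.
    print(optimize(toy_loss, [1.0, 1.0, 1.0]))
```

In a realistic setting, `loss_fn` would presumably build and evaluate a network from the perturbed hyperparameters, which is far more expensive than this toy function; the estimator itself only requires that the loss can be evaluated at perturbed points.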
