A Gradient-based Architecture HyperParameter Optimization Approach

25 Sep 2019 (modified: 24 Dec 2019), ICLR 2020 Conference Withdrawn Submission, Readers: Everyone
  • Abstract: Network hyperparameters, such as network depth, layer-wise channel numbers, and input image resolution, are crucial for designing high-performance neural network architectures in resource-limited scenarios. Previous solutions either optimize these hyperparameters with customized algorithms or enumerate them over a confined set of choices. With those methods, obtaining a good solution is laborious and cumbersome. In this work, we propose a gradient-based approach that optimizes these hyperparameters in an efficient and unified manner, based on the observation that they take consecutive (ordered) values and that network performance changes continuously with them. Specifically, natural evolution strategies (NES) are used to approximate the gradient with respect to the non-differentiable architecture hyperparameters, and we incorporate this estimate into the gradient descent framework to jointly optimize the weights and the architecture hyperparameters. Compared to the state-of-the-art method, ChamNet, our method achieves higher accuracy at a much lower optimization time cost. Our method easily surpasses state-of-the-art methods and achieves accuracy improvements of up to 9.1%/6.1% over the compact models MobileNet v1/v2.
  • Keywords: gradient-based, neural architecture search, architecture hyperparameter optimization
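
The NES gradient approximation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy objective `toy_loss`, the function names, the antithetic sampling scheme, and all constants (`sigma`, `num_pairs`, `lr`) are assumptions chosen for illustration, and the hyperparameters are treated as continuous values without the rounding or weight training a real system would need.

```python
import random


def nes_gradient(f, theta, sigma=0.1, num_pairs=50, rng=None):
    """Estimate the gradient of f at theta from function values only.

    Uses antithetic Gaussian perturbations, a common NES variant
    (an assumption here, not necessarily what the paper uses).
    """
    rng = rng or random.Random(0)
    grad = [0.0] * len(theta)
    for _ in range(num_pairs):
        eps = [rng.gauss(0.0, 1.0) for _ in theta]
        plus = f([t + sigma * e for t, e in zip(theta, eps)])
        minus = f([t - sigma * e for t, e in zip(theta, eps)])
        # Finite difference along the sampled direction, averaged over pairs.
        scale = (plus - minus) / (2.0 * sigma * num_pairs)
        for i, e in enumerate(eps):
            grad[i] += scale * e
    return grad


def toy_loss(h):
    """Hypothetical stand-in for validation loss as a function of
    (depth, width multiplier); its minimum is at (12.0, 1.5)."""
    depth, width = h
    return (depth - 12.0) ** 2 + 4.0 * (width - 1.5) ** 2


def optimize(theta, steps=200, lr=0.05):
    """Plain gradient descent driven by the NES gradient estimate."""
    rng = random.Random(0)
    theta = list(theta)
    for _ in range(steps):
        g = nes_gradient(toy_loss, theta, rng=rng)
        theta = [t - lr * gi for t, gi in zip(theta, g)]
    return theta


if __name__ == "__main__":
    print(optimize([8.0, 1.0]))
```

The estimator only queries the objective, so it applies even when the objective is not differentiable in the hyperparameters. In the setting the abstract describes, each query would presumably involve evaluating a network, with the weight updates interleaved in the same descent loop.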