Response Modeling of Hyper-Parameters for Deep Convolutional Neural Networks

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Blind Submission
Keywords: Hyper-Parameter Optimization, Response Surface Modeling, Convolutional Neural Network, Low-Rank Factorization
Abstract: Hyper-parameter optimization (HPO) is critical to training high-performing Deep Neural Networks (DNNs). Current methodologies fail to define an analytical response surface and remain a training bottleneck due to their reliance on additional internal hyper-parameters and lengthy evaluation cycles. We demonstrate that the low-rank factorization of the convolution weights of a CNN's intermediate layers defines an analytical response surface. We quantify how this surface acts as an auxiliary to optimizing training metrics. We introduce a dynamic tracking algorithm -- autoHyper -- that performs HPO on the order of hours for various datasets, including ImageNet, and requires no manual tuning. Our method -- using a single RTX 2080 Ti -- selects a learning rate within 59 hours for AdaM on ResNet34 applied to ImageNet and improves test accuracy by 4.93% over the default learning rate. In contrast to previous methods, we empirically demonstrate that our algorithm and response surface generalize well across model, optimizer, and dataset selection, removing the need for extensive domain knowledge to achieve high levels of performance.
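To make the core idea concrete, here is a minimal sketch of reading a low-rank signal off a convolution layer's weights. This is a hypothetical illustration assuming a truncated-SVD measure on the unfolded 4-D kernel; the function names, the `energy` threshold, and the rank-ratio metric are illustrative assumptions, not the paper's exact factorization or response surface.

```python
import numpy as np

def unfold_conv_weight(w: np.ndarray) -> np.ndarray:
    """Unfold a 4-D conv kernel (out_ch, in_ch, kH, kW) into a 2-D matrix."""
    return w.reshape(w.shape[0], -1)

def low_rank_ratio(w: np.ndarray, energy: float = 0.9) -> float:
    """Fraction of the full rank needed to retain `energy` of the
    squared-singular-value spectrum of the unfolded kernel.
    (Assumed proxy metric; the paper's factorization may differ.)"""
    m = unfold_conv_weight(w)
    s = np.linalg.svd(m, compute_uv=False)        # singular values, descending
    cum = np.cumsum(s**2) / np.sum(s**2)          # cumulative spectral energy
    rank = int(np.searchsorted(cum, energy)) + 1  # smallest rank reaching `energy`
    return rank / min(m.shape)

# Example: probe a randomly initialized 64x32x3x3 conv layer.
w = np.random.randn(64, 32, 3, 3)
print(low_rank_ratio(w))  # near 1.0 for random weights, which carry little low-rank structure
```

A per-layer metric of this kind can be tracked over training epochs as an analytical response signal, avoiding the lengthy full-evaluation cycles the abstract identifies as the bottleneck in existing HPO methods.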
One-sentence Summary: A new response surface model is proposed to dynamically track the optimal hyper-parameters for training Convolutional Neural Networks.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Supplementary Material: zip
Reviewed Version (pdf): https://openreview.net/references/pdf?id=4l0OM5Og1n