Simple and efficient architecture search for Convolutional Neural NetworksDownload PDF

15 Feb 2018 (modified: 07 Apr 2024)ICLR 2018 Conference Blind SubmissionReaders: Everyone
Abstract: Neural networks have recently had a lot of success for many tasks. However, neural network architectures that perform well are still typically designed manually by experts in a cumbersome trial-and-error process. We propose a new method to automatically search for well-performing CNN architectures based on a simple hill climbing procedure whose operators apply network morphisms, followed by short optimization runs by cosine annealing. Surprisingly, this simple method yields competitive results, despite only requiring resources in the same order of magnitude as training a single network. E.g., on CIFAR-10, our method designs and trains networks with an error rate below 6% in only 12 hours on a single GPU; training for one day reduces this error further, to almost 5%.
TL;DR: We propose a simple and efficent method for architecture search for convolutional neural networks.
Keywords: Deep Learning, Hyperparameter Optimization, Architecture Search, Convolutional Neural Networks, Network Morphism, Network Transformation, SGDR, Cosine annealing, hill climbing
Code: [![Papers with Code](/images/pwc_icon.svg) 3 community implementations](https://paperswithcode.com/paper/?openreview=SySaJ0xCZ)
Data: [CIFAR-10](https://paperswithcode.com/dataset/cifar-10), [CIFAR-100](https://paperswithcode.com/dataset/cifar-100)
Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 2 code implementations](https://www.catalyzex.com/paper/arxiv:1711.04528/code)
8 Replies

Loading