Faster Discovery of Neural Architectures by Searching for Paths in a Large Model


Nov 03, 2017 (modified: Nov 03, 2017) ICLR 2018 Conference Blind Submission readers: everyone
  • Abstract: We propose an approach to automatic model design that is significantly faster and less expensive than previous methods. In our method, which we name Efficient Neural Architecture Search (ENAS), a controller learns to discover neural architectures by searching for an optimal path within a larger predetermined model. The parameters of the predetermined model are trained to minimize a canonical loss function, such as the cross entropy, on the training dataset. The controller learns the path with policy gradient to maximize the expected reward on the validation set. In our experiments, ENAS achieves comparable test accuracy while being 10x faster and requiring 100x fewer resources than NAS. On the CIFAR-10 dataset, ENAS can design novel architectures that achieve a test error of 3.86%, compared to 3.41% by standard NAS (Zoph et al., 2017). On the Penn Treebank dataset, ENAS also discovers a novel architecture, which achieves a test perplexity of 64.6, compared to 62.4 by standard NAS.
  • Keywords: neural architecture search
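
The abstract describes a controller that picks a path through a larger model and is updated with policy gradient. The following is a minimal sketch of that idea only, not the authors' implementation. It assumes a toy setting in which each of `NUM_LAYERS` layers offers `NUM_CHOICES` candidate operations, the controller is a set of independent softmax policies (one per layer), and the validation reward is replaced by a hypothetical stand-in, `toy_reward`. Training of the shared model parameters is omitted entirely. All names and constants here are illustrative assumptions.

```python
import math
import random

NUM_LAYERS = 3           # assumption: depth of the toy "large model"
NUM_CHOICES = 4          # assumption: candidate operations per layer
TARGET_PATH = [2, 0, 3]  # assumption: best path, known only to the toy reward


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_path(logits, rng):
    """Sample one operation index per layer from the controller's policy."""
    path = []
    for layer_logits in logits:
        probs = softmax(layer_logits)
        path.append(rng.choices(range(len(probs)), weights=probs)[0])
    return path


def toy_reward(path):
    """Stand-in for validation accuracy: fraction of layers matching TARGET_PATH."""
    return sum(p == t for p, t in zip(path, TARGET_PATH)) / len(TARGET_PATH)


def train_controller(steps=2000, lr=0.1, seed=0):
    """REINFORCE with a moving-average baseline over per-layer logits."""
    rng = random.Random(seed)
    logits = [[0.0] * NUM_CHOICES for _ in range(NUM_LAYERS)]
    baseline = 0.0
    for _ in range(steps):
        path = sample_path(logits, rng)
        reward = toy_reward(path)
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        for layer, choice in enumerate(path):
            probs = softmax(logits[layer])
            for k in range(NUM_CHOICES):
                # Gradient of log pi(choice) w.r.t. logit k: onehot(choice)[k] - probs[k]
                grad = (1.0 if k == choice else 0.0) - probs[k]
                logits[layer][k] += lr * advantage * grad
    return logits


def best_path(logits):
    """Greedy path: the highest-logit operation at each layer."""
    return [max(range(NUM_CHOICES), key=lambda k: row[k]) for row in logits]


if __name__ == "__main__":
    learned = train_controller()
    print("greedy path:", best_path(learned))
```

In this toy setup the greedy path would be expected to drift toward `TARGET_PATH` as training proceeds. In the method the abstract describes, the reward would instead come from evaluating the sampled path, with the shared parameters, on the validation set.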