Abstract: Architecture search methods for convolutional neural networks (CNNs) have shown promising results. However, these methods require significant computational resources, because they train a neural network many times to evaluate candidate architectures during the search. Developing a computationally efficient architecture search method is therefore an important research topic. In this paper, we assume that the structure parameters of CNNs, such as the types and connectivities of layers, are categorical variables, and we regard them as learnable parameters. Introducing a multivariate categorical distribution as the underlying distribution for the structure parameters, we formulate a differentiable loss for the training task, in which the training of the weights and the optimization of the distribution parameters are coupled. Both the weights and the distribution parameters are trained using stochastic gradient descent, so the structure parameters are optimized within a single training run. We apply the proposed method to architecture search for two computer vision tasks: image classification and inpainting. The experimental results show that the proposed architecture search method is fast and achieves performance comparable to existing methods.
Keywords: architecture search, stochastic natural gradient, convolutional neural networks
TL;DR: We present an efficient neural network architecture search method based on the stochastic natural gradient method via probabilistic modeling.
Data: [CelebA](https://paperswithcode.com/dataset/celeba), [SVHN](https://paperswithcode.com/dataset/svhn)
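
The core idea in the abstract, treating categorical structure parameters as samples from a learnable distribution and updating that distribution with a stochastic natural gradient, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`sample_one_hot`, `natural_gradient_step`, `toy_loss`), the mean-loss baseline, the learning rate, and the probability floor `eps` are all assumptions made for illustration. A fixed toy loss also stands in for the network training loss, so the coupled weight update described in the abstract is omitted.

```python
import random


def sample_one_hot(theta, rng):
    """Sample one category per structure variable; return one-hot rows."""
    sample = []
    for probs in theta:
        choice = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        sample.append([1.0 if j == choice else 0.0 for j in range(len(probs))])
    return sample


def natural_gradient_step(theta, samples, losses, lr=0.1, eps=1e-3):
    """One Monte Carlo natural-gradient update of the categorical parameters.

    For a categorical distribution in its probability (expectation)
    parameterization, the natural gradient of the expected utility is
    E[u(c) * (c - theta)]; it is estimated here from the given samples.
    """
    # Assumed utility: negative loss with the mean loss as a baseline.
    mean_loss = sum(losses) / len(losses)
    utilities = [mean_loss - loss for loss in losses]
    new_theta = []
    for i, probs in enumerate(theta):
        row = []
        for j, p in enumerate(probs):
            grad = sum(u * (c[i][j] - p) for u, c in zip(utilities, samples))
            grad /= len(samples)
            # Assumed safeguard: keep every probability above a small floor.
            row.append(max(p + lr * grad, eps))
        total = sum(row)
        new_theta.append([p / total for p in row])
    return new_theta


def toy_loss(sample, target):
    """Hypothetical stand-in for a training loss: count of wrong choices."""
    return sum(1.0 for row, t in zip(sample, target) if row[t] != 1.0)


rng = random.Random(0)
target = [2, 0, 1]  # hypothetical "best" category for each variable
theta = [[1.0 / 3.0] * 3 for _ in target]  # start from uniform distributions
for _ in range(200):
    samples = [sample_one_hot(theta, rng) for _ in range(2)]
    losses = [toy_loss(s, target) for s in samples]
    theta = natural_gradient_step(theta, samples, losses)
```

In the setting the abstract describes, the loss of each sampled architecture would come from a mini-batch evaluated with the shared network weights, and a weight gradient step would be taken in the same iteration as the update of the distribution parameters.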