PA-NAS: Partial operation activation for memory-efficient architecture search

Appl. Intell. 2022 (modified: 02 Nov 2022)
Abstract: Stochastic Neural Architecture Search (SNAS) provides an efficient solution for searching neural architectures, but it suffers from large computation and memory burdens when training an over-parameterized network. In this paper, we propose a novel and effective approach, Partially-Activated NAS (PA-NAS), to alleviate these memory and computation overheads. Instead of relaxing the discrete operation selection with softened one-hot variables, we use a hardened one-hot relaxation to prompt partial operation activation, which not only reduces the overheads significantly but also makes them insensitive to the number of candidate operations. In particular, this strategy keeps the performance of the converged parent network consistent with that of the derived child network. In addition, we introduce a feedback mechanism as a regularizer in NAS to reduce the tendency to fall into local optima. With the alleviated memory and computation overheads, PA-NAS can be trained with a larger batch size and consequently enjoys both higher training stability and faster running speed. Experimental results on CIFAR10, CIFAR100, and ImageNet demonstrate the effectiveness of the proposed method. Compared with SNAS, our method reduces the search cost by a factor of 8 without any performance degradation, and the reduction grows as more candidate operations are used. Specifically, we achieve error rates of 3.15% with 0.444M parameters and 2.96% with 1.647M parameters on CIFAR10 in only 0.2 GPU days. On CIFAR100, the error rates are 23.31% with only 0.49M parameters and 20.14% with 6.938M parameters. Furthermore, error rates of 46.6% with only 0.584M parameters and 42.75% with 8.077M parameters are achieved on ImageNet-160-120.
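
To make the core idea concrete, the sketch below illustrates what a hardened one-hot relaxation with partial operation activation could look like on a single mixed edge. It is a minimal PyTorch example under my own assumptions (the candidate operation set, the use of a straight-through Gumbel-softmax sample, and all names such as `PartiallyActivatedMixedOp` are hypothetical illustrations), not the authors' implementation: only the sampled operation is executed, so memory and compute on the edge stay roughly constant regardless of how many candidate operations exist, in contrast to a soft relaxation that evaluates and stores activations for every candidate.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PartiallyActivatedMixedOp(nn.Module):
    """Illustrative mixed edge: only the sampled candidate op is executed.

    Hypothetical sketch of "partial operation activation"; the candidate
    ops and the straight-through trick are assumptions for illustration,
    not the paper's released code.
    """

    def __init__(self, channels):
        super().__init__()
        # Hypothetical candidate operations on this edge.
        self.ops = nn.ModuleList([
            nn.Identity(),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.Conv2d(channels, channels, 5, padding=2, bias=False),
            nn.MaxPool2d(3, stride=1, padding=1),
        ])
        # Architecture logits, one per candidate operation.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x, temperature=1.0):
        # Hard (one-hot) Gumbel-softmax sample: discrete in the forward
        # pass, differentiable w.r.t. alpha via the straight-through
        # estimator.
        weights = F.gumbel_softmax(self.alpha, tau=temperature, hard=True)
        idx = int(weights.argmax())
        # Only the selected op is run, so memory/compute do not grow with
        # the number of candidates, unlike a soft relaxation that would
        # evaluate every op and sum the weighted outputs.
        return weights[idx] * self.ops[idx](x)


if __name__ == "__main__":
    edge = PartiallyActivatedMixedOp(channels=16)
    out = edge(torch.randn(2, 16, 32, 32))
    # Gradients flow to alpha (and to the selected op's weights, if any).
    out.sum().backward()
    print(out.shape)
```

In this sketch the memory saving comes purely from skipping the non-selected operations in the forward pass; how PA-NAS combines this with its feedback regularizer is described in the paper itself.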