OTOv3: Towards Automatic Sub-Network Search Within General Super Deep Neural Networks

09 May 2023 (modified: 12 Dec 2023) · Submitted to NeurIPS 2023
Keywords: Neural architecture search, AutoML, structured sparsity, sparse optimization
Abstract: Existing neural architecture search (NAS) methods typically rely on pre-specified super deep neural networks (super-networks) with handcrafted search spaces. Such requirements make it challenging to extend these methods to general scenarios without significant human expertise and manual intervention. To overcome these limitations, we propose the third generation of Only-Train-Once (OTOv3). OTOv3 is perhaps the first automated system that trains general super-networks and produces high-performing sub-networks in a one-shot manner, without pretraining or fine-tuning. Technically, OTOv3 makes three notable contributions to minimize human effort: (i) automatic search space construction for general super-networks; (ii) a Hierarchical Half-Space Projected Gradient (H2SPG) method that leverages the dependency graph to ensure network validity during optimization and reliably produces a solution with both high performance and hierarchical group sparsity; and (iii) automatic sub-network construction based on the super-network and the H2SPG solution. Numerically, we demonstrate the effectiveness of OTOv3 on a variety of super-networks, including StackedUnets, SuperResNet, and DARTS, over benchmark datasets such as CIFAR10, Fashion-MNIST, ImageNet, STL-10, and SVHN. The sub-networks computed by OTOv3 achieve competitive or even superior performance compared to the super-networks and other state-of-the-art methods.
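To make the H2SPG idea in contribution (ii) concrete, the sketch below shows one plausible optimizer step per removable parameter group: a gradient step with a group-sparsity-inducing term, followed by a half-space projection that zeroes a group only when a dependency-graph check confirms the remaining sub-network stays valid. This is a minimal illustrative sketch under stated assumptions, not the paper's implementation; all names here (`h2spg_step`, `dep_graph.is_removable`, the flat-parameter grouping) are hypothetical.

```python
import torch

def h2spg_step(params, grads, groups, dep_graph, lr=0.1, lam=1e-3, eps=0.0):
    """One hedged sketch of a Hierarchical Half-Space Projected Gradient step.

    params, grads : 1-D tensors holding all trainable weights and gradients.
    groups        : dict mapping a group id to the index tensor of one
                    removable structure (e.g., a channel or an operator).
    dep_graph     : object with a hypothetical is_removable(gid) check that
                    zeroing this group keeps the network valid (input still
                    reaches output). Names/signatures are illustrative only.
    """
    for gid, idx in groups.items():
        x, g = params[idx], grads[idx]
        norm = x.norm()
        # Add a group-sparsity-inducing (group-lasso-style) subgradient term.
        if norm > 0:
            g = g + lam * x / norm
        x_trial = x - lr * g
        # Half-space projection: if the trial point leaves the half-space
        # {y : <y, x> >= eps * ||x||^2}, the update has effectively flipped
        # the group's direction, so project the whole group to zero -- but
        # only if the dependency graph says removing it keeps a valid net.
        if torch.dot(x_trial, x) < eps * norm ** 2 and dep_graph.is_removable(gid):
            x_trial = torch.zeros_like(x)
        params[idx] = x_trial
    return params
```

The point of the sketch is the interplay the abstract describes: sparsity is introduced group-wise, and a group is zeroed during optimization only where the dependency graph permits it, so the final H2SPG solution can be cropped into a functioning sub-network directly.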
Supplementary Material: pdf
Submission Number: 5388