Abstract: Neural architecture search (NAS) automatically designs the structure of a deep neural network, a task conventionally performed by human experts, in an exploratory manner. Weight-sharing (WS)-based NAS is a time-efficient NAS approach because it learns the network structure and the weight parameters simultaneously in a single training session. However, WS-based NAS has been observed to suffer from a problem: depending on the design of the search space, i.e., the set of possible combinations of operations, the search may converge to an architecture with significantly low final performance. Designing an appropriate search space is task-dependent and requires care from the user, which hinders the application of WS-based NAS. We conducted a simple case study on a synthetic regression task to analytically investigate how the statistics between operations in the search space affect the optimal weight values. Based on this case study, we hypothesize that a strong negative covariance between operation outputs may lead to suboptimal weight values in the following layers, resulting in the selection of a suboptimal architecture. Our numerical experiments show that a simple modification to the aggregation operation in the search space mitigates this issue and improves the robustness of WS-based NAS.
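To make the setting concrete, the following is a minimal sketch of a weight-sharing mixed operation in the DARTS-style continuous-relaxation sense, where candidate operation outputs are aggregated by a softmax-weighted sum so that architecture parameters and operation weights are trained in one session. The candidate operations, channel sizes, and class names below are illustrative assumptions, not the paper's actual search space or proposed modification.

```python
# Minimal weight-sharing NAS sketch (PyTorch; illustrative assumptions, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """One weight-sharing edge: all candidate operations are evaluated and their
    outputs are aggregated, so operation weights and architecture parameters
    can be optimized jointly in a single training session."""

    def __init__(self, channels: int):
        super().__init__()
        # Candidate operations in the search space (illustrative choices).
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.Conv2d(channels, channels, kernel_size=5, padding=2),
            nn.Identity(),
        ])
        # One architecture parameter per candidate operation.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Aggregation over the search space: a softmax-weighted sum of the
        # candidate outputs. Statistics (e.g., covariance) between these
        # outputs influence the optimal weights of the following layers.
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

# Usage example: one forward pass through a mixed operation.
layer = MixedOp(channels=8)
y = layer(torch.randn(2, 8, 16, 16))
```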