Abstract: Monte-Carlo Tree Search (MCTS) is a powerful tool for many non-differentiable search problems, such as adversarial games. However, the performance of such an approach depends heavily on the order in which nodes are considered at each branching of the tree. If the first branches cannot distinguish between promising and deceiving configurations for the final task, the efficiency of the search is significantly reduced. In Neural Architecture Search (NAS), as only the final architecture matters, the visiting order of the branching can be optimized to improve learning. In this paper, we study the application of MCTS to NAS for image classification. We analyze several sampling methods and branching alternatives for MCTS and propose to learn the branching by hierarchical clustering of architectures based on their similarity. The similarity is measured by the pairwise distance between the output vectors of the architectures. Extensive experiments on two challenging benchmarks on CIFAR10 and ImageNet show that MCTS, if provided with a good branching hierarchy, often yields better solutions more efficiently than other approaches for NAS problems.
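To make the branching idea concrete, the following is a minimal sketch of how a binary branching hierarchy could be built by agglomerative clustering of architectures from the pairwise distances between their output vectors. It is an illustration under stated assumptions, not the authors' implementation: the Euclidean distance, the average linkage, and the names `euclidean` and `build_branching_tree` are hypothetical choices made here. The paper specifies its actual procedural choices in appendix B.1 and ablates the distance metric and clustering linkage in appendix C.2.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def build_branching_tree(outputs):
    """Build a binary tree over architectures by agglomerative clustering.

    outputs: dict mapping an architecture name to its output vector
             (for example, predictions on a fixed validation batch).
    Returns a nested tuple whose leaves are architecture names. Each
    internal node is one branching decision a tree search could follow.
    """
    # Each cluster is a pair (subtree, list of member names).
    clusters = [(name, [name]) for name in sorted(outputs)]

    def average_linkage(a, b):
        # Mean pairwise distance between members (assumed linkage).
        total = sum(
            euclidean(outputs[x], outputs[y]) for x in a[1] for y in b[1]
        )
        return total / (len(a[1]) * len(b[1]))

    while len(clusters) > 1:
        # Merge the two closest clusters into a new internal node.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]]),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    return clusters[0][0]
```

In this sketch, architectures with similar outputs end up under the same subtree, so the first branching decisions separate groups of architectures that behave differently, which is the property the abstract argues an efficient search needs.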
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: The revised portions are shown in blue in the manuscript. The changes and additions are summarized as follows:
* Addition of “Generalizable and training-free NAS” to related work
* Notation change for eq. 4
* Addition of more methods for comparison to table 4
* Revision of “Limitations” to discuss specific bottlenecks
* Addition of “Broader Impact Statement”
* Addition of more details in appendix B.1 to specify the procedural choices such as batch size, distance metric and others
* Several additional ablations in appendix C.2 as requested by reviewers including:
  * Ablation on distance metric
  * Smoothing factor
  * Warm-up iterations
  * Exploration parameter
  * Clustering linkage
  * Validation batch
  * Temperature scheduling
* Proofreading and correcting typos and grammatical errors
Assigned Action Editor: ~Vincent_Tan1
Submission Number: 6121