Abstract: This paper introduces a novel optimization method for differentiable neural architecture search, based on the theory of prediction with expert advice. Its optimization criterion is well suited to architecture selection, i.e., it minimizes the regret incurred by a sub-optimal selection of operations. Unlike previous search relaxations that require hard pruning of architectures, our method is designed to dynamically wipe out inferior architectures and enhance superior ones. It achieves an optimal worst-case regret bound and suggests the use of multiple learning rates, based on the amount of information carried by the backward gradients. Experiments show that our algorithm achieves strong performance on several image classification datasets. Specifically, it attains an error rate of 1.60% on CIFAR-10 and achieves state-of-the-art results on three additional datasets.
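To illustrate the prediction-with-expert-advice framing, the sketch below shows a generic exponentiated-weights update over candidate operations, where each operation acts as an "expert" and weak experts decay toward zero rather than being hard-pruned. This is a minimal illustration, not the authors' exact XNAS update; the learning rate, gradients, and wipe-out threshold here are assumed for demonstration.

```python
import numpy as np

def exponentiated_weights_step(weights, grads, lr, wipeout_eps=1e-3):
    """One exponentiated-gradient step over candidate operations.

    Each operation is an "expert": its weight is scaled down
    multiplicatively by the loss gradient it induces, so inferior
    operations are dynamically wiped out instead of hard-pruned.
    (Illustrative sketch; not the paper's exact update rule.)
    """
    weights = weights * np.exp(-lr * grads)   # multiplicative (Hedge-style) update
    weights[weights < wipeout_eps] = 0.0      # wipe out experts that fall below threshold
    return weights / weights.sum()            # renormalize to a distribution

# Toy usage: 4 candidate ops; op 2 consistently incurs the largest gradient.
w = np.full(4, 0.25)
for _ in range(50):
    g = np.array([0.1, 0.2, 1.0, 0.3])        # stand-in backward gradients
    w = exponentiated_weights_step(w, g, lr=0.5)
print(w)  # mass concentrates on the low-gradient (superior) operations
```

A multiplicative update of this kind is what yields the worst-case regret guarantees from the expert-advice literature, in contrast to the additive softmax updates used by earlier differentiable search relaxations.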
Code Link: https://github.com/NivNayman/XNAS