Keywords: learnable neural architectures, differentiable neural architectures, continuous parameterizations, continuous relaxations, differentiable architecture search, differentiable masking
TL;DR: We propose DNArch, a method that jointly learns the weights and the entire architecture of a CNN by backpropagation, i.e., its width, its depth, its kernel sizes, and the position and value of its downsampling layers.
Abstract: We present *Differentiable Neural Architectures* (DNArch), a method that learns the weights and the architecture of CNNs jointly by backpropagation. DNArch enables learning (*i*) the size of convolutional kernels, (*ii*) the width of all layers, (*iii*) the position and value of downsampling layers, and (*iv*) the depth of the network. DNArch treats neural architectures as continuous entities and uses learnable differentiable masks to control their size. Unlike existing methods, DNArch is not limited to a (small) predefined set of possible components, but can instead discover CNN architectures across all feasible combinations of kernel sizes, widths, depths, and downsampling. Empirically, DNArch finds effective architectures for classification and dense prediction tasks on sequential and image data. By adding a loss term that controls the network complexity, DNArch constrains its search to architectures that respect a predefined computational budget during training.
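As a rough illustration of the masking idea described in the abstract, the following minimal PyTorch sketch shows how a single learnable, differentiable mask could control the effective size of a convolutional kernel. The class name `MaskedKernelConv1d` and the Gaussian parameterization of the mask are hypothetical simplifications for illustration, not DNArch's actual formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedKernelConv1d(nn.Module):
    """Sketch: a 1D convolution whose effective kernel size is controlled by a
    learnable, differentiable Gaussian mask (hypothetical simplification)."""

    def __init__(self, in_channels, out_channels, max_kernel_size=31):
        super().__init__()
        self.weight = nn.Parameter(
            torch.randn(out_channels, in_channels, max_kernel_size) * 0.02
        )
        # Learnable log-width of the mask; trained jointly with the weights.
        self.log_sigma = nn.Parameter(torch.tensor(0.0))
        # Kernel positions centered at zero, e.g. [-15, ..., 15].
        positions = torch.arange(max_kernel_size) - max_kernel_size // 2
        self.register_buffer("positions", positions.float())

    def forward(self, x):
        sigma = self.log_sigma.exp()
        # Gaussian mask: weights far from the center are smoothly suppressed,
        # so gradients can shrink or grow the effective kernel size.
        mask = torch.exp(-0.5 * (self.positions / sigma) ** 2)
        masked_weight = self.weight * mask  # broadcast over kernel dimension
        return F.conv1d(x, masked_weight, padding=self.weight.shape[-1] // 2)

# Usage: the mask width (and hence the effective kernel size) receives
# gradients from the task loss, just like the convolutional weights.
layer = MaskedKernelConv1d(8, 16)
y = layer(torch.randn(2, 8, 100))
y.mean().backward()
print(layer.log_sigma.grad)  # non-zero: the architecture parameter is learned
```

The same principle extends to masks over channels (width), layers (depth), and resolutions (downsampling), which is what allows the entire architecture to be optimized with standard gradient descent under a complexity penalty.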
Submission Number: 5