- Abstract: Neural architecture search (NAS) has made rapid progress incomputervision,wherebynewstate-of-the-artresultshave beenachievedinaseriesoftaskswithautomaticallysearched neural network (NN) architectures. In contrast, NAS has not made comparable advances in natural language understanding (NLU). Corresponding to encoder-aggregator meta architecture of typical neural networks models for NLU tasks (Gong et al. 2018), we re-deﬁne the search space, by splittingitinto twoparts:encodersearchspace,andaggregator search space. Encoder search space contains basic operations such as convolutions, RNNs, multi-head attention and its sparse variants, star-transformers. Dynamic routing is included in the aggregator search space, along with max (avg) pooling and self-attention pooling. Our search algorithm is then fulﬁlled via DARTS, a differentiable neural architecture search framework. We progressively reduce the search space every few epochs, which further reduces the search time and resource costs. Experiments on ﬁve benchmark data-sets show that, the new neural networks we generate can achieve performances comparable to the state-of-the-art models that does not involve language model pre-training.
- Keywords: NAS, natural language understanding
- TL;DR: Neural Architecture Search for a series of Natural Language Understanding tasks. Design the search space for NLU tasks. And Apply differentiable architecture search to discover new models