EXPLORING NEURAL ARCHITECTURE SEARCH FOR LANGUAGE TASKS

Minh-Thang Luong; David Dohan; Adams Wei Yu; Quoc V. Le; Barret Zoph; Vijay Vasudevan

15 Feb 2018 (modified: 10 Feb 2022) | ICLR 2018 Conference Blind Submission | Readers: Everyone

Abstract: Neural architecture search (NAS), the task of finding neural architectures automatically, has recently emerged as a promising approach for discovering models that outperform human-designed ones. However, most success stories are for vision tasks; results for text have been quite limited, apart from a small language modeling setup. In this paper, we explore NAS for text sequences at scale, first focusing on the task of language translation and later extending to reading comprehension. Starting from a standard sequence-to-sequence model for translation, we conduct extensive searches over the recurrent cells and attention similarity functions across two translation tasks, IWSLT English-Vietnamese and WMT German-English. We report challenges in performing cell searches and demonstrate initial success on attention searches, with translation improvements over strong baselines. In addition, we show that results on attention searches are transferable to reading comprehension on the SQuAD dataset.

TL;DR: We explore neural architecture search for language tasks. Recurrent cell search is challenging for NMT, but attention mechanism search works. The result of attention search on translation is transferable to reading comprehension.

Keywords: Neural architecture search, language tasks, neural machine translation, reading comprehension, SQuAD

Data: [SQuAD](https://paperswithcode.com/dataset/squad)
