Open Peer Review. Open Publishing. Open Access. Open Discussion. Open Directory. Open Recommendations. Open API. Open Source.
EXPLORING NEURAL ARCHITECTURE SEARCH FOR LANGUAGE TASKS
Minh-Thang Luong, David Dohan, Adams Wei Yu, Quoc V. Le, Barret Zoph, Vijay Vasudevan
Feb 15, 2018 (modified: Feb 15, 2018)ICLR 2018 Conference Blind Submissionreaders: everyoneShow Bibtex
Abstract:Neural architecture search (NAS), the task of finding neural architectures automatically, has recently emerged as a promising approach for unveiling better models over human-designed ones. However, most success stories are for vision tasks and have been quite limited for text, except for a small language modeling setup. In this paper, we explore NAS for text sequences at scale, by first focusing on the task of language translation and later extending to reading comprehension. From a standard sequence-to-sequence models for translation, we conduct extensive searches over the recurrent cells and attention similarity functions across two translation tasks, IWSLT English-Vietnamese and WMT German-English. We report challenges in performing cell searches as well as demonstrate initial success on attention searches with translation improvements over strong baselines. In addition, we show that results on attention searches are transferable to reading comprehension on the SQuAD dataset.
TL;DR:We explore neural architecture search for language tasks. Recurrent cell search is challenging for NMT, but attention mechanism search works. The result of attention search on translation is transferable to reading comprehension.