More systematic than claimed: Insights on the SCAN tasks
Markus Kliegl, Wei Xu
Feb 08, 2018 (modified: Feb 08, 2018), ICLR 2018 Workshop Submission
Abstract: We show that some standard attention-based architectures widely used in Neural Machine
Translation as well as a pointer-based variant achieve results on some of the compositional
SCAN tasks that are far superior to those reported in Lake & Baroni (2018). We next show that
there is high variance in the test accuracy across both random initialization and training
duration. We show that ensembling can be used to take advantage of this variance and improve
results but that, for many tasks, a large gap remains between ensemble performance and the
performance of an oracularly selected single best model. Based on these insights, we suggest
some possible directions for future research, emphasizing selection and regularization over the
need for more compositional architectures.
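The ensembling idea mentioned above can be sketched as simple majority voting over the output sequences of models trained from different random seeds: when per-model accuracy is high-variance, the mode of the predictions is often more reliable than any single run. The function and example sequences below are illustrative, not taken from the paper.

```python
from collections import Counter

def ensemble_vote(predictions):
    """Majority-vote ensembling over output sequences from several models.

    predictions: list of per-model outputs, each a tuple of tokens
    (e.g. a predicted SCAN action sequence).
    Returns the sequence predicted by the largest number of models.
    """
    counts = Counter(predictions)
    best, _ = counts.most_common(1)[0]
    return best

# Hypothetical example: three models trained from different seeds disagree;
# the ensemble returns the sequence that two of the three agree on.
preds = [
    ("I_JUMP", "I_WALK"),
    ("I_JUMP", "I_WALK"),
    ("I_JUMP", "I_RUN"),
]
print(ensemble_vote(preds))
```

Note that majority voting only exploits the variance across runs; it cannot close the remaining gap to an oracularly selected best model, which motivates the selection and regularization directions suggested above.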
TL;DR: We show NMT models do better than claimed on the SCAN tasks, but the high variance will require new techniques.