Fast and Accurate Reading Comprehension Without Recurrent Networks


Nov 03, 2017 (modified: Nov 03, 2017) ICLR 2018 Conference Blind Submission readers: everyone
  • Abstract: Current end-to-end machine reading and question answering (Q&A) models are primarily based on recurrent neural networks (RNNs) with attention. Despite their success, these models are often slow in both training and inference because of the sequential nature of RNNs. We propose a novel Q&A model that does not require recurrent networks yet achieves equivalent or better performance than existing models. Our model is simple in that it consists exclusively of attention and convolutions. We also propose a novel data augmentation technique based on paraphrasing. It not only enhances the training examples but also diversifies the phrasing of the sentences, which results in immediate accuracy improvements. This technique is of independent interest because it can be readily applied to other natural language processing tasks. On the SQuAD dataset, our model is 3x to 13x faster in training and 4x to 9x faster in inference. Our single model achieves an F1 score of 82.2 on the development set, which is on par with the best documented result of 81.8.
  • TL;DR: A simple architecture consisting of convolutions and attention achieves results on par with the best documented recurrent models.
  • Keywords: SQuAD, Stanford Question Answering Dataset, reading comprehension, attention, text convolutions, question answering