Multi-step Retriever-Reader Interaction for Scalable Open-domain Question Answering

Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Andrew McCallum

Sep 27, 2018 ICLR 2019 Conference Blind Submission
  • Abstract: This paper introduces a new framework for open-domain question answering in which the retriever and the reader iteratively interact with each other. The framework is agnostic to the architecture of the machine reading model, provided it has access to the token-level hidden representations of the reader. The retriever uses fast nearest neighbor search algorithms that allow it to scale to corpora containing millions of paragraphs. A gated recurrent unit updates the query at each step conditioned on the "state" of the reader, and the "reformulated" query is used by the retriever to re-rank the paragraphs. We show the efficacy of our architecture by achieving state-of-the-art results (9.5% relative increase) on TriviaQA-unfiltered, and we achieve competitive performance on other large open-domain datasets such as Quasar-T, SearchQA, and SQuAD-Open. We conduct analysis and show that iterative interaction helps in retrieving useful paragraphs from the corpus. Finally, we show that our multi-step-reasoning framework brings uniform improvements when applied to two widely used reader architectures, Dr.QA and BiDAF. Code and pretrained models are available at …
  • Keywords: Open domain Question Answering, Reinforcement Learning, Query reformulation
  • TL;DR: A paragraph retriever and a machine reader interact with each other via reinforcement learning to yield large improvements on open-domain datasets
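
The retrieve, read, and reformulate loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the GRU weights are random and untrained, brute-force inner-product scoring stands in for the fast nearest neighbor search, and the "reader state" is a hypothetical placeholder (the mean of the retrieved paragraph vectors) rather than real reader hidden representations. All names (`GRUCell`, `retrieve`, `multi_step_retrieve`) are made up for this sketch, and no reinforcement learning is shown.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GRUCell:
    """Minimal GRU cell with random, untrained weights (illustrative only)."""

    def __init__(self, dim, rng):
        scale = 1.0 / np.sqrt(dim)
        # One (input, hidden) weight pair per gate: update, reset, candidate.
        self.W = rng.normal(0.0, scale, (3, dim, dim))
        self.U = rng.normal(0.0, scale, (3, dim, dim))

    def step(self, x, h):
        """Update hidden vector h (the query) given input x (the reader state)."""
        z = sigmoid(self.W[0] @ x + self.U[0] @ h)          # update gate
        r = sigmoid(self.W[1] @ x + self.U[1] @ h)          # reset gate
        h_tilde = np.tanh(self.W[2] @ x + self.U[2] @ (r * h))
        return (1.0 - z) * h + z * h_tilde


def retrieve(query, paragraphs, k):
    """Return indices and scores of the k paragraphs with the largest inner product.

    Assumption: exhaustive scoring replaces an approximate nearest neighbor index.
    """
    scores = paragraphs @ query
    top = np.argsort(-scores)[:k]
    return top, scores[top]


def multi_step_retrieve(query, paragraphs, gru, steps=3, k=2):
    """Alternate between retrieval and query reformulation for a fixed number of steps."""
    history = []
    for _ in range(steps):
        top, _ = retrieve(query, paragraphs, k)
        history.append(top.tolist())
        # Hypothetical stand-in for the reader's state after reading the paragraphs.
        reader_state = paragraphs[top].mean(axis=0)
        # The GRU reformulates the query conditioned on that state.
        query = gru.step(reader_state, query)
    return history, query


rng = np.random.default_rng(0)
paragraphs = rng.normal(size=(100, 16))   # toy paragraph vectors
query0 = rng.normal(size=16)              # toy initial query vector
history, final_query = multi_step_retrieve(query0, paragraphs, GRUCell(16, rng))
print(history)
```

Each entry of `history` holds the paragraph indices retrieved at one step, so comparing entries shows how the reformulated query changes the ranking. Because paragraph vectors are fixed, they could be indexed once and reused across steps; only the query changes.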