Multi-step Retriever-Reader Interaction for Scalable Open-domain Question Answering

Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Andrew McCallum

Sep 27, 2018 ICLR 2019 Conference Blind Submission readers: everyone Show Bibtex
  • Abstract: This paper introduces a new framework for open-domain question answering in which the retriever and the reader \emph{iteratively interact} with each other. The framework is agnostic to the architecture of the machine reading model provided it has \emph{access} to the token-level hidden representations of the reader. The retriever uses fast nearest neighbor search that allows it to scale to corpora containing millions of paragraphs. A gated recurrent unit updates the query at each step conditioned on the \emph{state} of the reader and the \emph{reformulated} query is used to re-rank the paragraphs by the retriever. We conduct analysis and show that iterative interaction helps in retrieving informative paragraphs from the corpus. Finally, we show that our multi-step-reasoning framework brings consistent improvement when applied to two widely used reader architectures (\drqa and \bidaf) on various large open-domain datasets ---\tqau, \quasart, \searchqa, and \squado\footnote{Code and pretrained models are available at \url{https://github.com/rajarshd/Multi-Step-Reasoning}}.
  • Keywords: Open domain Question Answering, Reinforcement Learning, Query reformulation
  • TL;DR: Paragraph retriever and machine reader interacts with each other via reinforcement learning to yield large improvements on open domain datasets
0 Replies

Loading