Transformer-XH: Multi-hop question answering with eXtra Hop attention

Sep 25, 2019 Blind Submission readers: everyone
  • Keywords: Transformer-XH, multi-hop QA, extra hop attention, structured modeling
  • TL;DR: We present Transformer-XH, which extends the Transformer with eXtra Hop attention to intrinsically model structured texts in a data-driven way. The result is a simpler yet state-of-the-art multi-hop QA system.
  • Abstract: Transformers have achieved significant success in modeling natural language as a sequence of text tokens. However, in many real-world scenarios, textual data inherently exhibits structure beyond a linear sequence, such as trees and graphs. An important case is multi-hop question answering, where the evidence required to answer a question is scattered across multiple related documents. This paper presents Transformer-XH, which uses eXtra Hop attention to enable the intrinsic modeling of structured texts in a fully data-driven way. Its new attention mechanism naturally “hops” across connected text sequences in addition to attending over tokens within each sequence. Thus, Transformer-XH better answers multi-hop questions by propagating information between multiple documents, constructing global contextualized representations, and jointly reasoning over multiple pieces of evidence. This leads to a simpler multi-hop QA system that outperforms the previous state of the art on the HotpotQA FullWiki setting by large margins.
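
The abstract describes attention that works within each text sequence and also "hops" across connected sequences. The following is a minimal sketch of that idea, not the authors' implementation. It assumes that each sequence is summarized by its first token, that hop attention runs only between those first tokens along the given edges, and that the hop result is simply added back to the first token. It uses no learned projections. The names `attend` and `extra_hop_layer` and the `edges` format are hypothetical.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(query, keys, values):
    # Scaled dot-product attention for a single query vector.
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def extra_hop_layer(sequences, edges):
    """One illustrative layer (assumed design, see lead-in).

    sequences: dict mapping a sequence name to its list of token vectors.
    edges: dict mapping a sequence name to the names it is connected to.
    """
    # Step 1: ordinary attention over tokens within each sequence.
    local = {
        name: [attend(tok, toks, toks) for tok in toks]
        for name, toks in sequences.items()
    }
    # Step 2: hop attention. The first token of each sequence attends over
    # the first tokens of itself and of its connected sequences.
    out = {}
    for name, toks in local.items():
        hubs = [local[n][0] for n in [name] + edges.get(name, [])]
        hop = attend(toks[0], hubs, hubs)
        # Assumption: combine by adding the hop summary to the first token.
        first = [a + b for a, b in zip(toks[0], hop)]
        out[name] = [first] + toks[1:]
    return out
```

For example, calling `extra_hop_layer({"doc_a": [[1.0, 0.0], [0.0, 1.0]], "doc_b": [[0.0, 1.0], [1.0, 1.0]]}, {"doc_a": ["doc_b"]})` lets the first token of `doc_a` take in information from `doc_b`, while the remaining tokens of `doc_a` depend only on `doc_a` itself. Stacking several such layers would let information spread further along the graph.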