Transformer-XH: Multi-Evidence Reasoning with eXtra Hop Attention

Chen Zhao; Chenyan Xiong; Corby Rosset; Xia Song; Paul Bennett; Saurabh Tiwary

Transformer-XH: Multi-Evidence Reasoning with eXtra Hop Attention

Chen Zhao, Chenyan Xiong, Corby Rosset, Xia Song, Paul Bennett, Saurabh Tiwary

Published: 20 Dec 2019, Last Modified: 05 May 2023ICLR 2020 Conference Blind SubmissionReaders: Everyone

Keywords: Transformer-XH, multi-hop QA, fact verification, extra hop attention, structured modeling

TL;DR: We present Transformer-XH, which upgrades Transformer with eXtra Hop attentions to intrinsically model structured texts in a data driven way.

Abstract: Transformers have achieved new heights modeling natural language as a sequence of text tokens. However, in many real world scenarios, textual data inherently exhibits structures beyond a linear sequence such as trees and graphs; many tasks require reasoning with evidence scattered across multiple pieces of texts. This paper presents Transformer-XH, which uses eXtra Hop attention to enable intrinsic modeling of structured texts in a fully data-driven way. Its new attention mechanism naturally “hops” across the connected text sequences in addition to attending over tokens within each sequence. Thus, Transformer-XH better conducts joint multi-evidence reasoning by propagating information between documents and constructing global contextualized representations. On multi-hop question answering, Transformer-XH leads to a simpler multi-hop QA system which outperforms previous state-of-the-art on the HotpotQA FullWiki setting. On FEVER fact verification, applying Transformer-XH provides state-of-the-art accuracy and excels on claims whose verification requires multiple evidence.

Code: https://drive.google.com/file/d/1-CwjDwSvGzLKHMXNapzTin8Vw6SVYD9b/view?usp=sharing

Original Pdf: pdf

12 Replies

Loading