Recurrent Transformers for Long Document Understanding

Published: 01 Jan 2023, Last Modified: 19 Feb 2025, NLPCC (1) 2023, CC BY-SA 4.0
Abstract: Pre-trained models have proved effective for natural language understanding. For long document understanding, the key challenges are long-range dependency and inference efficiency. Existing approaches, however, (i) usually cannot fully model the context structure and global semantics within a long document, and (ii) lack consistent evaluation on common downstream tasks. To address these issues, we propose Recurrent Transformers (RTrans), a novel model for long document understanding that can not only learn long contextual structure and relationships but also be extended to diverse downstream tasks. Specifically, our model introduces a recurrent transformer block that conveys token-level contextual information across segments and captures long-range dependencies. A ranking strategy is used to aggregate local and global information for the final prediction. Experiments on diverse tasks that require understanding long documents demonstrate the superior and robust performance of RTrans, and our approach achieves a better balance between effectiveness and efficiency.
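To make the abstract's description concrete, below is a minimal PyTorch sketch of the general idea only: a long document is split into segments, a small recurrent memory is carried from one segment to the next so contextual information can propagate across segment boundaries, and segment-level summaries are scored and aggregated into a document representation (a stand-in for the ranking-based aggregation). All class names, hyperparameters, and the specific memory and scoring mechanics here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class RecurrentSegmentEncoder(nn.Module):
    """Sketch of a segment-recurrent transformer encoder for long documents.

    Each segment is encoded together with a memory carried over from the
    previous segment, so token-level context flows across segments.
    """

    def __init__(self, vocab_size=30522, d_model=256, n_heads=4, n_layers=2,
                 seg_len=128, mem_len=32, num_labels=2):
        super().__init__()
        self.seg_len = seg_len
        self.mem_len = mem_len
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        # Learned scorer used to weight segment summaries before aggregation
        # (an assumed stand-in for the paper's ranking strategy).
        self.scorer = nn.Linear(d_model, 1)
        self.classifier = nn.Linear(d_model, num_labels)

    def forward(self, input_ids):
        batch, total_len = input_ids.shape
        d_model = self.embed.embedding_dim
        # Recurrent memory passed from segment to segment.
        memory = torch.zeros(batch, self.mem_len, d_model, device=input_ids.device)
        segment_reprs = []

        for start in range(0, total_len, self.seg_len):
            seg_emb = self.embed(input_ids[:, start:start + self.seg_len])
            # Prepend the memory so this segment attends to earlier context.
            hidden = self.encoder(torch.cat([memory, seg_emb], dim=1))
            memory = hidden[:, -self.mem_len:].detach()           # carry forward
            segment_reprs.append(hidden[:, self.mem_len:].mean(dim=1))  # local summary

        # Score segments and aggregate local summaries into a global
        # document representation for the final prediction.
        reprs = torch.stack(segment_reprs, dim=1)                 # (batch, n_seg, d_model)
        weights = torch.softmax(self.scorer(reprs).squeeze(-1), dim=1)
        doc_repr = (weights.unsqueeze(-1) * reprs).sum(dim=1)
        return self.classifier(doc_repr)


if __name__ == "__main__":
    model = RecurrentSegmentEncoder()
    tokens = torch.randint(0, 30522, (2, 512))   # two toy documents of 512 tokens
    print(model(tokens).shape)                   # torch.Size([2, 2])
```

Processing one segment at a time keeps attention cost linear in document length while the recurrent memory preserves long-range dependencies, which is the effectiveness/efficiency trade-off the abstract refers to.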