ADAPTING PRETRAINED LANGUAGE MODELS FOR LONG DOCUMENT CLASSIFICATION

Matthew Lyle Olson; Lisa Zhang; Chun-Nam Yu

25 Sept 2019 (modified: 05 May 2023) | ICLR 2020 Conference Blind Submission | Readers: Everyone

TL;DR: We achieve state-of-the-art results on long document classification by combining pretrained language model representations with attention.

Abstract: Pretrained language models (LMs) have shown excellent results, achieving human-like performance on many language tasks. However, the most powerful LMs have one significant drawback: a fixed-size input. Under this constraint, these LMs cannot use the full text of long documents. In this paper, we introduce a new framework to handle documents of arbitrary length. We investigate adding a recurrent mechanism to extend the input size and using attention to identify the most discriminating segment of the input. We perform extensive validation experiments on patent and arXiv datasets, both of which contain long texts. We demonstrate that our method significantly outperforms state-of-the-art results reported in recent literature.
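The two ideas named in the abstract, splitting a long input into fixed-size pieces and using attention to weight those pieces, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked code for that). It assumes each segment has already been mapped to a vector by some pretrained LM encoder, which is not shown, and it omits the recurrent mechanism entirely. The names `split_into_segments`, `attention_pool`, and `query` are hypothetical and chosen only for illustration.

```python
import math


def split_into_segments(tokens, segment_len):
    """Split a token list into consecutive windows of at most segment_len tokens."""
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    return [tokens[i:i + segment_len] for i in range(0, len(tokens), segment_len)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(segment_vectors, query):
    """Combine segment vectors into one document vector.

    Each segment is scored by its dot product with `query` (an assumed
    learned vector); softmax turns the scores into weights, and the
    result is the weighted sum of the segment vectors.
    """
    scores = [sum(v * q for v, q in zip(vec, query)) for vec in segment_vectors]
    weights = softmax(scores)
    dim = len(query)
    pooled = [
        sum(w * vec[d] for w, vec in zip(weights, segment_vectors))
        for d in range(dim)
    ]
    return pooled, weights
```

In this sketch the attention weights show which segment contributes most to the pooled document vector, which would then be passed to a classifier.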

Code: https://github.com/cf-anonymous/long_doc

Keywords: NLP, Deep Learning, Language Models, Long Document
