Finding the Needle in a Haystack: Unsupervised Rationale Extraction from Long Document Classifiers

Anonymous

17 Apr 2023, ACL ARR 2023 April Blind Submission
Abstract: Long-sequence models are designed to represent longer texts more effectively and to improve performance on document-level tasks. As these models advance from individual sentences to much longer contexts, extracting rationales from them becomes increasingly important for analysing model behaviour and providing finer-grained, more informative predictions. This paper investigates methods of unsupervised rationale extraction for long-sequence models in the context of document classification. We find that previously proposed methods for sentence classification do not perform well when applied to long documents, because only a very limited number of tokens receive updates during training. We alleviate this issue by introducing a Ranked Soft Attention architecture that ensures more tokens receive appropriate weak supervision. We also investigate a Compositional Soft Attention architecture that applies RoBERTa sentence-wise to extract plausible rationales at the token level. The proposed methods significantly outperform Longformer-driven baselines on sentiment classification datasets, while also exhibiting significantly lower runtimes.
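The abstract does not specify the architectures in detail, but the general idea of applying RoBERTa sentence-wise with a soft-attention layer whose weights serve as token-level rationale scores can be sketched. The sketch below is purely illustrative, not the paper's actual method: the class name, layer names, and the document-level softmax pooling are all assumptions.

```python
# Illustrative sketch only -- NOT the paper's Compositional Soft Attention
# implementation. Shows one plausible way to apply RoBERTa per sentence and
# score tokens with soft attention; all names here are hypothetical.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class SentenceSoftAttention(nn.Module):
    """Encode each sentence independently with RoBERTa, then weight tokens
    with a learned soft-attention layer; the attention weights double as
    token-level rationale scores."""

    def __init__(self, model_name="roberta-base", num_labels=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        self.score_layer = nn.Linear(hidden, 1)      # one scalar score per token
        self.classifier = nn.Linear(hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        # input_ids: (num_sentences, seq_len) -- one row per sentence
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        states = out.last_hidden_state                  # (S, T, H)
        scores = self.score_layer(states).squeeze(-1)   # (S, T)
        scores = scores.masked_fill(attention_mask == 0, -1e9)  # ignore padding
        # Normalise over all tokens in the document, not per sentence,
        # so every token competes for rationale mass (an assumption).
        weights = torch.softmax(scores.view(-1), dim=-1).view_as(scores)
        pooled = (weights.unsqueeze(-1) * states).sum(dim=(0, 1))  # (H,)
        return self.classifier(pooled), weights


tokenizer = AutoTokenizer.from_pretrained("roberta-base")
sentences = ["The plot was gripping.", "The acting felt flat at times."]
batch = tokenizer(sentences, padding=True, return_tensors="pt")
model = SentenceSoftAttention()
logits, rationale_weights = model(batch["input_ids"], batch["attention_mask"])
```

One design point worth noting: because each sentence is encoded independently, the encoder's quadratic attention cost is bounded by sentence length rather than document length, which is consistent with the abstract's claim of lower runtimes than Longformer-driven baselines.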
Paper Type: short
Research Area: Machine Learning for NLP