Abstract: Detecting out-of-distribution (OOD) data is crucial in machine learning applications to mitigate the risk of model overconfidence, thereby enhancing the reliability and safety of deployed systems. Most existing OOD detection methods address uni-modal inputs, such as images or text. For multi-modal documents, however, the performance of these methods, which were developed primarily for computer vision tasks, remains largely unexplored. We propose a novel methodology, termed attention head masking (AHM), for multi-modal OOD tasks in document classification systems. Our empirical results demonstrate that the proposed AHM method outperforms all state-of-the-art approaches, reducing the false positive rate (FPR) by up to 7.5% compared to existing solutions. The methodology generalizes well to multi-modal data, such as documents, where visual and textual information are modeled under the same Transformer architecture. To address the scarcity of high-quality publicly available document datasets and to encourage further research on OOD detection for documents, we introduce FinanceDocs, a new document AI dataset. Our code and dataset are publicly available.
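The abstract does not spell out the masking mechanism, but the general idea of suppressing a subset of attention heads in a Transformer layer can be sketched as follows. This is an illustrative NumPy sketch, not the paper's implementation; the function and parameter names (`masked_multi_head_attention`, `head_mask`) are hypothetical.

```python
# Illustrative sketch of attention head masking in one self-attention layer.
# Not the paper's AHM implementation; all names and shapes are assumptions.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def masked_multi_head_attention(x, Wq, Wk, Wv, Wo, num_heads, head_mask):
    """
    x:         (seq_len, d_model) token embeddings (e.g., text + visual tokens).
    Wq/Wk/Wv:  (d_model, d_model) query/key/value projection weights.
    Wo:        (d_model, d_model) output projection.
    head_mask: (num_heads,) binary vector; a 0 zeroes out that head's output.
    """
    seq_len, d_model = x.shape
    d_head = d_model // num_heads

    def split_heads(t):  # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split_heads(x @ Wq), split_heads(x @ Wk), split_heads(x @ Wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)   # (H, S, S)
    context = softmax(scores) @ v                          # (H, S, d_head)

    # Attention head masking: suppress selected heads before recombining them.
    context = context * head_mask[:, None, None]

    out = context.transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ Wo

# Usage: mask heads 1 and 3 of a 4-head layer on random inputs.
rng = np.random.default_rng(0)
S, D, H = 6, 16, 4
x = rng.standard_normal((S, D))
Wq, Wk, Wv, Wo = (rng.standard_normal((D, D)) * 0.1 for _ in range(4))
mask = np.array([1.0, 0.0, 1.0, 0.0])
print(masked_multi_head_attention(x, Wq, Wk, Wv, Wo, H, mask).shape)  # (6, 16)
```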
Paper Type: Short
Research Area: Resources and Evaluation
Research Area Keywords: Out of Distribution Detection, Multi-modal Document Classification, Supervised Learning, Transformer models, Attention Methods
Contribution Types: NLP engineering experiment, Data resources
Languages Studied: English
Submission Number: 419