OT-Class: Optimal Transport-Enhanced Multi-label Text Classification

ACL ARR 2024 June Submission 3508 Authors

16 Jun 2024 (modified: 02 Jul 2024), ACL ARR 2024 June Submission, CC BY 4.0
Abstract: Multi-label text classification (MLTC) aims to assign at least one label from a vast label space to a document. This task is challenging due to the large number of labels, which can range from hundreds to thousands, and the potential interdependence of labels. While previous efforts have achieved success in fully supervised settings, their performance is limited in more practical weakly supervised settings. Despite its potential benefits, the auxiliary task of word-to-label alignment, which aligns words in the input text to the large label space, has been largely overlooked in existing work. Word-to-label alignment is significant because it provides valuable insights into how words contribute to the overall classification of a document. However, existing MLTC datasets lack the ground-truth word-to-label alignments needed for supervised training. To address this limitation, we propose a novel framework called OT-Class, which incorporates unsupervised word-to-label alignment into MLTC using optimal transport (OT). Our framework tackles MLTC in a multi-task setting, comprising a primary task that classifies documents using a standard text classification algorithm and an auxiliary task that identifies corresponding labels for all input document words via optimal transport. Our experiments demonstrate that OT-Class outperforms baselines that do not utilize word-to-label alignment, highlighting its effectiveness. A detailed analysis reveals that the advantage of OT-Class is amplified in fine-grained label spaces and that word-to-label alignment appropriately influences its predictions.
Paper Type: Short
Research Area: Information Retrieval and Text Mining
Research Area Keywords: text classification, optimal transport, alignment, multi-task
Contribution Types: Model analysis & interpretability, Publicly available software and/or pre-trained models, Data analysis
Languages Studied: English
Submission Number: 3508
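
The abstract describes an auxiliary task that aligns document words to labels via optimal transport without alignment supervision. The sketch below illustrates that general idea only; it is not the authors' implementation. It assumes a cosine-distance cost between word and label embeddings, uniform mass on words and labels, and entropic regularization solved with Sinkhorn iterations. None of these choices is stated in the abstract, and the names `sinkhorn_plan` and `align_words_to_labels` are hypothetical.

```python
import numpy as np


def sinkhorn_plan(cost, a, b, eps=0.1, n_iters=200):
    """Entropic-regularized OT plan between marginals a and b (Sinkhorn iterations)."""
    K = np.exp(-cost / eps)          # Gibbs kernel derived from the cost matrix
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (K @ v)              # rescale rows toward marginal a
        v = b / (K.T @ u)            # rescale columns toward marginal b
    return u[:, None] * K * v[None, :]


def align_words_to_labels(word_emb, label_emb, eps=0.1):
    """Soft word-to-label alignment; entry (i, j) is the mass moved from word i to label j.

    Assumptions (not from the paper): cosine-distance cost, uniform marginals.
    """
    w = word_emb / np.linalg.norm(word_emb, axis=1, keepdims=True)
    l = label_emb / np.linalg.norm(label_emb, axis=1, keepdims=True)
    cost = 1.0 - w @ l.T                          # cosine distance, shape (n_words, n_labels)
    a = np.full(w.shape[0], 1.0 / w.shape[0])     # uniform mass over words
    b = np.full(l.shape[0], 1.0 / l.shape[0])     # uniform mass over labels
    return sinkhorn_plan(cost, a, b, eps=eps)


# Toy usage: three "word" vectors and two "label" vectors in 2-D.
words = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
labels = np.array([[1.0, 0.0], [0.0, 1.0]])
plan = align_words_to_labels(words, labels)
best_label_per_word = plan.argmax(axis=1)         # hard alignment read off the soft plan
```

In a multi-task setup such as the one the abstract outlines, a transport plan like `plan` could feed an auxiliary loss alongside the classification loss; how OT-Class actually combines the two is not specified here.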