On Evaluation and Improvement of Tail Label Performance for Multi-label Text Classification


16 Jun 2021 (modified: 05 May 2023), ACL ARR 2021 Jun Blind Submission, Readers: Everyone
Abstract: Extreme multi-label text classification (XMTC) is the task of tagging each document with the most relevant subset of labels from an extremely large label set. The main challenge for machine learning methods is the skewed label distribution, in which the majority of labels (the tail labels) have very few training instances. Benchmark evaluations so far have focused on micro-averaged metrics, where performance on tail labels is easily overshadowed by that on high-frequency labels (the head labels); such metrics are therefore insufficient for evaluating the true success of XMTC methods. This paper presents a re-evaluation of state-of-the-art (SOTA) methods based on binned macro-averaged F1 instead, which reveals new insights into the strengths and weaknesses of representative methods. Based on this evaluation, we conduct in-depth analysis and experiments on Transformer models with various depths and attention mechanisms to improve tail label performance. We show that a shallow Transformer model with word-label attention can effectively leverage word-level features and outperforms previous Transformers on tail labels.
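The contrast between micro-averaged and binned macro-averaged F1 can be illustrated with a minimal sketch. The toy data, the helper names, and the single bin edge at 10 training instances are all assumptions made for illustration; the abstract does not specify the binning scheme the paper uses, so this should not be read as the authors' exact metric.

```python
from bisect import bisect_right
from collections import defaultdict


def f1(tp, fp, fn):
    # F1 from raw counts; defined as 0.0 when the label never occurs
    # in either the gold or the predicted sets (a common convention).
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def label_counts(y_true, y_pred, num_labels):
    # Per-label [tp, fp, fn]; each document is a set of label ids.
    counts = [[0, 0, 0] for _ in range(num_labels)]
    for gold, pred in zip(y_true, y_pred):
        for label in gold & pred:
            counts[label][0] += 1
        for label in pred - gold:
            counts[label][1] += 1
        for label in gold - pred:
            counts[label][2] += 1
    return counts


def micro_f1(counts):
    # Pool counts over all labels first, so frequent labels dominate.
    tp, fp, fn = (sum(col) for col in zip(*counts))
    return f1(tp, fp, fn)


def macro_f1(counts):
    # Average per-label F1, so every label has equal weight.
    return sum(f1(*c) for c in counts) / len(counts)


def binned_macro_f1(counts, train_freq, edges):
    # Group labels by training frequency, then macro-average within each bin.
    # Bin i holds labels with edges[i-1] <= freq < edges[i] (assumed scheme).
    bins = defaultdict(list)
    for c, freq in zip(counts, train_freq):
        bins[bisect_right(edges, freq)].append(f1(*c))
    return {b: sum(s) / len(s) for b, s in sorted(bins.items())}


# Toy data (hypothetical): label 0 is a head label, labels 1 and 2 are tails.
y_true = [{0} for _ in range(8)] + [{1}, {2}]
y_pred = [{0} for _ in range(8)] + [set(), set()]  # tails are never predicted
train_freq = [1000, 3, 2]

counts = label_counts(y_true, y_pred, num_labels=3)
micro = micro_f1(counts)
macro = macro_f1(counts)
binned = binned_macro_f1(counts, train_freq, edges=[10])

print(f"micro-F1 = {micro:.3f}")   # high despite missing every tail label
print(f"macro-F1 = {macro:.3f}")
print(f"binned macro-F1 = {binned}")  # bin 0 = tail labels, bin 1 = head labels
```

In this toy setting the classifier misses every tail label, yet micro-F1 stays near 0.89, while the tail bin of the binned macro-F1 is 0.0, which makes the failure visible.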
Software: zip
