Variational Continuous Label Distribution Learning for Multi-Label Text Classification

Published: 01 Jan 2024 · Last Modified: 21 Jul 2025 · IEEE Trans. Knowl. Data Eng. 2024 · CC BY-SA 4.0
Abstract: Multi-label text classification (MLTC) refers to the problem of tagging a given document with the most relevant subset of labels. One of the biggest challenges in MLTC is class imbalance, from which most advanced MLTC models suffer, limiting their performance. In this paper, we propose a model-agnostic framework named variational continuous label distribution learning (VCLDL) to address this problem. VCLDL theoretically builds a correspondence between the feature space and the label space to mine the information hidden in the observable logical labels. Specifically, VCLDL regards the label distribution as a continuous density function in a latent space and forms a flexible variational approach that approximates the density function of the labels with the collaboration of the feature space. Combined with VCLDL, MLTC models can attend to the distribution of the whole label set rather than only the specific labels with maximum response values, so the class imbalance problem is largely overcome. Experimental results on multiple benchmark datasets demonstrate that VCLDL brings significant performance improvements over existing MLTC models.
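The abstract describes a variational scheme in which observed binary ("logical") labels are treated as samples from a continuous label distribution whose density is approximated in a latent space conditioned on document features. The paper's actual architecture is not given here, so the sketch below is only a minimal illustration of that general idea, in the style of a variational autoencoder: all function and weight names (`encode`, `decode`, `W_mu`, etc.) are hypothetical, and a plain NumPy forward pass stands in for whatever networks and training procedure VCLDL actually uses.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def encode(features, W_mu, W_logvar):
    # Map document features to the parameters of a Gaussian
    # over the latent (continuous) label-distribution space.
    return features @ W_mu, features @ W_logvar

def reparameterize(mu, logvar, rng):
    # Reparameterization trick: sample z = mu + sigma * eps
    # so the sampling step stays differentiable in a real model.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def decode(z, W_dec):
    # Recover a continuous label distribution over all labels
    # from a latent sample.
    return sigmoid(z @ W_dec)

def elbo_loss(features, labels, W_mu, W_logvar, W_dec, rng):
    mu, logvar = encode(features, W_mu, W_logvar)
    z = reparameterize(mu, logvar, rng)
    probs = decode(z, W_dec)
    # Reconstruction term: binary cross-entropy between the decoded
    # continuous distribution and the observed logical labels.
    tiny = 1e-9
    recon = -np.mean(labels * np.log(probs + tiny)
                     + (1 - labels) * np.log(1 - probs + tiny))
    # Closed-form KL divergence from the approximate posterior
    # to a standard normal prior.
    kl = -0.5 * np.mean(1 + logvar - mu**2 - np.exp(logvar))
    return recon + kl

# Toy forward pass with random data and weights.
rng = np.random.default_rng(0)
n_docs, feat_dim, latent_dim, n_labels = 4, 8, 3, 5
features = rng.standard_normal((n_docs, feat_dim))
labels = rng.integers(0, 2, size=(n_docs, n_labels)).astype(float)
W_mu = 0.1 * rng.standard_normal((feat_dim, latent_dim))
W_logvar = 0.1 * rng.standard_normal((feat_dim, latent_dim))
W_dec = 0.1 * rng.standard_normal((latent_dim, n_labels))
loss = elbo_loss(features, labels, W_mu, W_logvar, W_dec, rng)
```

Because the objective covers every label's probability (not just the argmax label), gradients flow to rare labels on every example, which is the intuition behind the claimed robustness to class imbalance.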