Abstract: Multi-label classification addresses problems in which each instance is associated with multiple labels. To capture the differences between labels, each label can be modeled with its own feature subset derived from the original feature space. Among these label-specific methods, the mainstream approach generates new features by analyzing the distances between data points and the clusters into which they are grouped. However, it is difficult to determine how many clusters are required, and clustering algorithms are often unstable. In this paper, we use entropy to measure clustering quality and establish a novel model that quantitatively determines the number of clusters. In addition, we propose a novel notion of entropy similarity that measures pairwise label correlation and enables a clustering ensemble, which improves model robustness. Experiments on 12 benchmark datasets validate the effectiveness of the proposed method.