Abstract: The distribution of the labeled data can greatly affect the performance of a semisupervised learning (SSL) model. Most existing SSL models select the labeled data randomly and allocate the labeling quota equally among the classes, which leads to considerable instability and degraded performance. This study constructs, in an unsupervised manner, a leading forest that forms another metric space, in which it is convenient to define a fuzzy membership function that characterizes central and divergent samples; both types are then selected with fuzzy Xor logic. The labeling quota can thus be allocated adaptively among the classes. The proposed determinate labeling strategy generally improves the performance of most SSL models. In particular, when combined with kernelized large margin component analysis, it yields a novel semisupervised classification model. In addition, the multimodal issue in SSL is effectively addressed by the multigranular structure of the leading forest, which readily facilitates learning multiple local metrics. Extensive experimental results demonstrate that the proposed method achieves competitive efficiency and encouraging accuracy compared with state-of-the-art methods.