Abstract: In partial label learning, the most challenging issue is that the ground-truth labels of training samples are concealed within their respective candidate label sets. Generally, this problem is addressed by disambiguating the candidate label set through manipulation of either the feature space or the label space. In this paper, we propose a novel partial label learning algorithm named PLOD that casts the disambiguation process as a semi-supervised clustering problem, enabling the natural utilization of both the structural information in the feature space and the weakly supervised labeling information. Specifically, PLOD aims to infer the ground-truth label for each training sample using a tailor-made constrained k-means algorithm. PLOD initializes the cluster center of each class by averaging the instances whose candidate label set includes that class. Each instance is then assigned to the nearest class within its candidate label set, after which the cluster centers are updated by averaging the instances assigned to each class. These two steps are performed iteratively until convergence. After that, a one-versus-one decomposition is employed to solve the resulting problem, treating the assigned class of each training sample as its ground-truth label. Finally, PLOD aggregates the related predictive outputs of the one-versus-one decomposition for each class to derive the final prediction. Experimental results on both artificial and real-world partial label data sets demonstrate the superior performance of our proposed PLOD algorithm. The code is publicly available at https://github.com/liujunying-ai/PLOD.
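The following is a minimal sketch of the constrained k-means disambiguation step described above, assuming a feature matrix X (n_samples x n_features) and a binary candidate-label matrix S (n_samples x n_classes) with S[i, c] = 1 iff class c is in sample i's candidate set. Function and variable names are illustrative and not taken from the authors' released implementation.

```python
import numpy as np


def disambiguate(X, S, max_iter=100):
    """Assign each partially labeled sample one class from its candidate set."""
    n, q = S.shape
    # Initialize each cluster center as the mean of the instances whose
    # candidate label set includes that class.
    centers = np.stack([X[S[:, c] == 1].mean(axis=0) for c in range(q)])
    y = np.zeros(n, dtype=int)
    for _ in range(max_iter):
        # Assign each instance to the nearest center among its candidate labels.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dists[S == 0] = np.inf          # restrict assignment to the candidate set
        y_new = dists.argmin(axis=1)
        if np.array_equal(y_new, y):    # converged: assignments unchanged
            break
        y = y_new
        # Update each center as the mean of its assigned instances; keep the
        # previous center if no instance is currently assigned to the class.
        for c in range(q):
            if np.any(y == c):
                centers[c] = X[y == c].mean(axis=0)
    return y  # disambiguated labels, used as ground truth for the downstream
              # one-versus-one decomposition
```

The returned labels would then be treated as ground truth when training the one-versus-one binary classifiers whose outputs are aggregated per class for the final prediction.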