Keywords: positive and unlabeled learning, generalization error bound
Abstract: Positive and Unlabeled (PU) learning is a special case of binary classification with weak supervision, where only positive labeled and unlabeled data are available. Previous studies have proposed several risk estimators for PU learning, such as non-negative PU (nnPU), that are unbiased and consistent with the expected risk of supervised binary classification. In nnPU, the negative-class empirical risk is estimated from positive labeled and unlabeled data under a non-negativity constraint. However, its negative-class empirical risk estimator approaches 0 during training, so the negative class is over-emphasized, resulting in imbalanced error rates between the positive and negative classes. To address this problem, we assume that the expected risks of the positive and negative classes should be close. Accordingly, we constrain the negative-class empirical risk estimator to be lower bounded by the positive-class empirical risk instead of 0, and further incorporate an explicit equality constraint between them. Based on these constraints, we propose a risk estimator for PU learning that balances the positive and negative classification error rates, named $\mathrm{D{\small C-PU}}$, and develop an efficient training method for $\mathrm{D{\small C-PU}}$ based on the augmented Lagrange multiplier framework. We theoretically analyze the estimation error of $\mathrm{D{\small C-PU}}$ and empirically validate that $\mathrm{D{\small C-PU}}$ achieves higher accuracy and converges more stably than other risk estimators for PU learning. In addition, $\mathrm{D{\small C-PU}}$ achieves accuracy competitive with practical PU learning methods.
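For context, a minimal sketch of the estimators being contrasted, with notation assumed here rather than taken from the paper: let $\pi_p$ be the positive class prior, and let $\hat{R}_p^+(g)$, $\hat{R}_p^-(g)$, and $\hat{R}_u^-(g)$ denote the empirical risks of positive data treated as positive, positive data treated as negative, and unlabeled data treated as negative, respectively. The nnPU estimator clips its negative-class term at 0,
$$\hat{R}_{\mathrm{nnPU}}(g) = \pi_p \hat{R}_p^+(g) + \max\bigl\{0,\ \hat{R}_u^-(g) - \pi_p \hat{R}_p^-(g)\bigr\},$$
whereas the constraint described in the abstract would instead lower bound that term by the positive-class empirical risk and couple the two terms with an equality constraint, e.g.
$$\hat{R}(g) = \pi_p \hat{R}_p^+(g) + \max\bigl\{\pi_p \hat{R}_p^+(g),\ \hat{R}_u^-(g) - \pi_p \hat{R}_p^-(g)\bigr\}, \quad \text{s.t.}\ \ \pi_p \hat{R}_p^+(g) = \hat{R}_u^-(g) - \pi_p \hat{R}_p^-(g),$$
with the equality constraint handled through the augmented Lagrange multiplier framework during training.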
Supplementary Material: zip
Primary Area: General machine learning (supervised, unsupervised, online, active, etc.)
Submission Number: 7876