Positive-Unlabeled Learning in Implicit Feedback from Data Missing-Not-At-Random Perspective

Sichao Wang, Tianyu Xia, Lingxiao Yang

Published: 01 Dec 2025 · Last Modified: 23 Jan 2026 · Entropy · CC BY-SA 4.0
Abstract: The lack of explicit negative labels is a prevalent challenge across numerous domains, including computer vision (CV), natural language processing (NLP), and Recommender Systems (RSs). To address it, many negative-sample completion methods have been proposed, such as optimizing the sample distribution through pseudo-negative sampling and confidence screening in CV, constructing reliable negative examples by leveraging textual semantics in NLP, and supplementing negative samples via sparsity analysis of user interaction behaviors and preference inference for handling implicit feedback in RSs. However, most existing methods fail to adequately address the Missing-Not-At-Random (MNAR) nature of the data and the potential presence of unmeasured confounders, both of which compromise model robustness in practice. In this paper, we first formulate the prediction task in RSs with implicit feedback as a positive-unlabeled (PU) learning problem. We then propose a two-phase debiasing framework consisting of exposure-status imputation followed by debiasing through a proposed doubly robust estimator. Moreover, our theoretical analysis shows that existing propensity-based approaches are biased in the presence of unmeasured confounders. To overcome this, we incorporate a robust deconfounding method into the debiasing phase to effectively mitigate the impact of unmeasured confounders. We conduct extensive experiments on three widely used real-world datasets to demonstrate the effectiveness and potential of the proposed methods.
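For context, the sketch below illustrates the standard doubly robust estimator from the debiased-recommendation literature, which combines error imputation with inverse-propensity correction; it is not the paper's exact estimator, which additionally imputes exposure status and deconfounds. The function name `dr_loss` and its arguments are illustrative assumptions.

```python
import numpy as np

def dr_loss(pred_error, imputed_error, observed, propensity, eps=1e-6):
    """Doubly robust estimate of the average prediction error over all
    user-item pairs. Unbiased if, for each pair, either the imputed
    error or the propensity estimate is accurate.

    pred_error    : observed prediction errors e_{u,i} (zeros where unobserved)
    imputed_error : model-imputed errors e_hat_{u,i} for every pair
    observed      : 0/1 indicator o_{u,i} of whether the pair was observed
    propensity    : estimated observation probabilities p_hat_{u,i}
    """
    # Inverse-propensity correction, applied only on observed pairs;
    # clipping the propensities avoids division by near-zero values.
    correction = observed * (pred_error - imputed_error) / np.clip(propensity, eps, 1.0)
    return float(np.mean(imputed_error + correction))

# Toy usage with synthetic data (values are arbitrary, for demonstration only).
rng = np.random.default_rng(0)
n = 1000
observed = rng.binomial(1, 0.3, size=n)
propensity = np.full(n, 0.3)
pred_error = observed * rng.random(n)   # errors are only measurable when observed
imputed_error = 0.5 * rng.random(n)
print(dr_loss(pred_error, imputed_error, observed, propensity))
```

A key property motivating this form is double robustness: the imputation term covers pairs the propensity model handles poorly, and the propensity-weighted correction repairs imputation errors on observed pairs, so only one of the two models needs to be well specified.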