Learning from Aggregate-Masked Labels

ICLR 2026 Conference Submission 16202 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Weakly supervised learning; Privacy labels; Aggregate labels
Abstract: With growing concern over data privacy, researchers are increasingly focused on protecting sensitive labels through aggregate observations, such as similarity labels and label proportions. Unfortunately, these methods also weaken the supervisory information of non-sensitive labels, thereby reducing the performance of existing classifiers. To address this issue, we propose a novel setting called Aggregate-Masked Labels, whose primary advantage is introducing augmented supervision that retains full supervision on part of the data while still protecting sensitive labels. Specifically, for aggregate observations that contain sensitive labels, we treat these sensitive labels as aggregate-masked labels. In contrast, for aggregate observations without sensitive labels, we assign each instance its ground-truth label, as shown in Figure 1. Moreover, we introduce a risk-consistent estimator that effectively leverages aggregate-masked labels to train a multi-class classifier. We further introduce stochastic label combinations to alleviate the high computational cost, effectively accelerating training. Experimental results on both real-world and benchmark datasets demonstrate that our method achieves state-of-the-art classification performance.
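To make the supervision scheme described in the abstract concrete, below is a minimal sketch (not the authors' code) of how aggregate-masked supervision might be constructed, assuming the aggregate observation for a bag is its label proportions and that "sensitive" classes are given as a set; the function and variable names are hypothetical.

```python
# Hypothetical sketch of aggregate-masked label construction.
# Assumption: aggregate observations are bag-level label proportions,
# and a bag is "masked" whenever it contains any sensitive class.
import numpy as np

def make_aggregate_masked_labels(bags_y, num_classes, sensitive_classes):
    """For each bag of instance labels:
       - if the bag contains a sensitive class, keep only an aggregate
         observation (here: label proportions), masking instance labels;
       - otherwise, keep the full instance-level ground-truth labels."""
    supervision = []
    for y in bags_y:  # y: 1-D array of instance labels in one bag
        if np.isin(y, list(sensitive_classes)).any():
            proportions = np.bincount(y, minlength=num_classes) / len(y)
            supervision.append(("aggregate", proportions))
        else:
            supervision.append(("instance", y.copy()))
    return supervision

# Toy usage: class 2 is sensitive, so the first bag is reduced to proportions
# while the second bag keeps its instance-level labels.
bags = [np.array([0, 2, 1, 2]), np.array([1, 0, 0, 1])]
print(make_aggregate_masked_labels(bags, num_classes=3, sensitive_classes={2}))
```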
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 16202