Retraining with Predicted Hard Labels Provably Increases Model Accuracy

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 Poster · CC BY 4.0
TL;DR: We theoretically show that retraining a model with its predicted hard labels can improve its accuracy when the given labels are noisy, and empirically demonstrate that retraining significantly improves label DP training at no extra privacy cost.
Abstract: The performance of a model trained with noisy labels is often improved by simply *retraining* the model with its *own predicted hard labels* (i.e., $1$/$0$ labels). Yet, a detailed theoretical characterization of this phenomenon is lacking. In this paper, we theoretically analyze retraining in a linearly separable binary classification setting where the given labels are randomly corrupted, and prove that retraining can improve the population accuracy obtained by initially training with the given (noisy) labels. To the best of our knowledge, this is the first such theoretical result. Retraining finds application in improving training with local label differential privacy (DP), which involves training with noisy labels. We empirically show that retraining selectively on the samples for which the predicted label matches the given label significantly improves label DP training at no extra privacy cost; we call this consensus-based retraining. For example, when training ResNet-18 on CIFAR-100 with $\epsilon=3$ label DP, we obtain more than $6$% improvement in accuracy with consensus-based retraining.
Lay Summary: Training machine learning (ML) models with incorrect or noisy supervision (i.e., labels) is a common challenge in the real world. Surprisingly, simply retraining a model using its own predicted labels often improves its performance -- even though those predictions come from the same model initially trained on bad data. Despite the practical success of this trick, a solid mathematical understanding of how/when/why it works has been missing. We theoretically analyze model retraining for a binary (two-class) classification problem where the given labels are corrupted, and characterize the conditions under which retraining can improve the model's performance. We also explore how this idea helps in label differential privacy (DP), a private machine learning technique wherein the privacy of the training labels is protected by deliberately adding label noise. We propose consensus-based retraining, a method that only uses those examples for which the model's prediction matches the given label. We empirically show that consensus-based retraining leads to significant performance gains. Ultimately, our paper offers theoretical insight and practical value for building better ML models under noisy supervision with the simple idea of retraining.
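To make the procedure described above concrete, here is a minimal Python sketch (not the authors' implementation): the logistic-regression model, the synthetic data, and the 20% label-flip rate are all assumptions chosen for illustration, whereas the paper's experiments use deep networks (e.g., ResNet-18) trained under label DP.

```python
# Minimal sketch of retraining on predicted hard labels and of
# consensus-based retraining. Model, data, and flip rate are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic binary classification data with clean labels y_true.
X, y_true = make_classification(n_samples=2000, n_features=20, random_state=0)

# Corrupt the labels: flip each one independently with probability 0.2
# (mimicking random label noise, e.g. from a label-DP randomizer).
flip = rng.random(len(y_true)) < 0.2
y_noisy = np.where(flip, 1 - y_true, y_true)

# Step 1: initial training on the given (noisy) labels.
base = LogisticRegression(max_iter=1000).fit(X, y_noisy)
y_pred = base.predict(X)  # the model's own predicted hard (0/1) labels

# Step 2a: plain retraining on the predicted hard labels.
retrained = LogisticRegression(max_iter=1000).fit(X, y_pred)

# Step 2b: consensus-based retraining -- keep only the samples whose
# predicted label agrees with the given label, then retrain on those.
consensus = y_pred == y_noisy
consensus_retrained = LogisticRegression(max_iter=1000).fit(
    X[consensus], y_noisy[consensus]
)

for name, model in [("initial", base),
                    ("retrained", retrained),
                    ("consensus", consensus_retrained)]:
    print(name, "accuracy vs. clean labels:", model.score(X, y_true))
```

In a careful experiment the accuracies would be measured on held-out clean data; this sketch only illustrates the mechanics of reusing the model's own hard predictions, and of filtering to the consensus samples before retraining.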
Primary Area: Theory->Learning Theory
Keywords: Retraining, Predicted Labels, Hard Labels, Label Noise, Label DP
Submission Number: 1980