SoftMatch: Addressing the Quantity-Quality Tradeoff in Semi-supervised Learning

Hao Chen; Ran Tao; Yue Fan; Yidong Wang; Jindong Wang; Bernt Schiele; Xing Xie; Bhiksha Raj; Marios Savvides

SoftMatch: Addressing the Quantity-Quality Tradeoff in Semi-supervised Learning

Hao Chen, Ran Tao, Yue Fan, Yidong Wang, Jindong Wang, Bernt Schiele, Xing Xie, Bhiksha Raj, Marios Savvides

Published: 01 Feb 2023, Last Modified: 04 Aug 2025ICLR 2023 posterReaders: Everyone

Keywords: Semi-Supervised Learning, Semi-Supervised Classification

TL;DR: This paper revisit the quantity-quality tradeoff with a unified sample weighting function of pseudo-labeling/consistency loss. From the analysis, we propose SoftMatch, which better utilizes unlabeled data while reducing the enrolled error rate.

Abstract: The critical challenge of Semi-Supervised Learning (SSL) is how to effectively leverage the limited labeled data and massive unlabeled data to improve the model's generalization performance. In this paper, we first revisit the popular pseudo-labeling methods via a unified sample weighting formulation and demonstrate the inherent quantity-quality trade-off problem of pseudo-labeling with thresholding, which may prohibit learning. To this end, we propose SoftMatch to overcome the trade-off by maintaining both high quantity and high quality of pseudo-labels during training, effectively exploiting the unlabeled data. We derive a truncated Gaussian function to weight samples based on their confidence, which can be viewed as a soft version of the confidence threshold. We further enhance the utilization of weakly-learned classes by proposing a uniform alignment approach. In experiments, SoftMatch shows substantial improvements across a wide variety of benchmarks, including image, text, and imbalanced classification.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Submission Guidelines: Yes

Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning

Supplementary Material: zip

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/softmatch-addressing-the-quantity-quality/code)

13 Replies

Loading