Learning Label Refinement and Thresholds for Imbalanced Semi-Supervised Learning

18 Sept 2023 (modified: 11 Feb 2024) Submitted to ICLR 2024
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: semi-supervised learning, class imbalance
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: Using a dataset subset to learn logit adjustment and thresholding can enhance imbalanced semi-supervised algorithms.
Abstract: Semi-supervised learning (SSL) has proven to be effective in enhancing generalization when working with limited labeled training data. Existing SSL algorithms based on pseudo-labels rely on heuristic strategies or uncalibrated model confidence and are unreliable when imbalanced class distributions bias pseudo-labels. In this paper, we introduce SEmi-supervised learning with pseudo-label optimization based on VALidation data (SEVAL) to reduce the class bias and enhance the quality of pseudo-labeling for imbalanced SSL. First, we develop a curriculum for adjusting logits, improving the accuracy of the pseudo-labels generated by biased models. Second, we establish a curriculum for class-specific thresholds, ensuring the correctness of pseudo-labels on a per-class basis. SEVAL adapts to specific tasks by learning refinement and thresholding parameters from a partition of the training dataset in a class-balanced way. Our experimental findings show that SEVAL surpasses current methods based on pseudo-label refinement and threshold adjustment, delivering more accurate and effective pseudo-labels in various imbalanced SSL situations. Owing to its simplicity and flexibility, SEVAL can readily be incorporated to boost the efficacy of numerous other SSL techniques.
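The two ingredients named in the abstract, per-class logit adjustment and per-class confidence thresholds, can be illustrated with a minimal NumPy sketch. This is not the SEVAL implementation: the function names (`refine_pseudo_labels`, `fit_class_thresholds`), the subtractive form of the offsets, and the precision-based rule for choosing thresholds on held-out data are all assumptions made here for illustration only.

```python
import numpy as np


def softmax(logits):
    # Numerically stable row-wise softmax.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def refine_pseudo_labels(logits, offsets, thresholds):
    """Assign pseudo-labels from adjusted logits and keep the confident ones.

    offsets:    hypothetical per-class values subtracted from the logits
                (larger for classes the model over-predicts).
    thresholds: hypothetical per-class confidence cut-offs.
    """
    probs = softmax(logits - offsets)
    labels = probs.argmax(axis=1)
    confidence = probs.max(axis=1)
    # Each sample is compared against the threshold of its predicted class.
    mask = confidence >= thresholds[labels]
    return labels, mask


def fit_class_thresholds(val_probs, val_labels, target_precision=0.9):
    """Assumed rule: for each class, pick the lowest confidence at which the
    held-out predictions of that class still reach the target precision."""
    n_classes = val_probs.shape[1]
    preds = val_probs.argmax(axis=1)
    confidence = val_probs.max(axis=1)
    thresholds = np.ones(n_classes)  # default: accept nothing below 1.0
    for c in range(n_classes):
        idx = preds == c
        if not idx.any():
            continue
        order = np.argsort(-confidence[idx])  # most confident first
        sorted_conf = confidence[idx][order]
        correct = val_labels[idx][order] == c
        precision = np.cumsum(correct) / np.arange(1, correct.size + 1)
        ok = np.nonzero(precision >= target_precision)[0]
        if ok.size:
            thresholds[c] = sorted_conf[ok[-1]]
    return thresholds


# Usage on toy data: class 0 plays the role of the over-predicted majority class.
logits = np.array([[2.0, 1.0], [0.5, 0.4], [0.0, 3.0]])
labels, mask = refine_pseudo_labels(
    logits, offsets=np.array([0.0, 0.0]), thresholds=np.array([0.7, 0.9])
)
shifted_labels, _ = refine_pseudo_labels(
    logits, offsets=np.array([1.5, 0.0]), thresholds=np.array([0.7, 0.9])
)

val_probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.3, 0.7]])
val_labels = np.array([0, 0, 1, 1])
learned_thresholds = fit_class_thresholds(val_probs, val_labels)
```

In this toy run, a positive offset on class 0 moves all three samples to class 1, and the fitted threshold for class 0 ends up higher than for class 1 because one confident class-0 prediction on the held-out data is wrong. How SEVAL actually parameterizes and optimizes these quantities is described in the paper itself, not here.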
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1062