Combining Self-labeling with Selective Sampling

Published: 01 Jan 2023 · Last Modified: 08 Mar 2025 · ICDM (Workshops) 2023 · CC BY-SA 4.0
Abstract: Access to labeled data is generally expensive, so semi-supervised methods remain popular: they enable building large training sets without requiring many expert labels. This work combines self-labeling techniques with active learning in a selective sampling scenario by proposing how to build a classifier ensemble. While training the base classifiers, the decision whether to request a new label or to use self-labeling is made by evaluating the decision inconsistency of the base classifiers. Additionally, a technique inspired by online bagging is used to ensure the ensemble's diversity, whereby individual learning examples are presented to the base classifiers with different intensities. Preliminary studies showed that naïve application of self-labeling can harm performance by introducing a bias towards selected classes and, consequently, a skewed class distribution; hence, we propose a way to reduce this phenomenon. Experimental evaluation confirmed that the proposed method performs well compared with known selective sampling methods.
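The query-or-self-label loop described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the nearest-centroid base learner, the disagreement threshold of 0.4, and the rule that forces oracle queries until at least two classes have been observed (a crude stand-in for the paper's bias-reduction step) are all illustrative assumptions. Only the overall structure — ensemble inconsistency deciding between an expert label and a self-label, plus Poisson(1) replication in the spirit of online bagging — follows the abstract.

```python
import random
from collections import Counter, defaultdict

class OnlineCentroid:
    """Toy online base learner: keeps running per-class feature means
    and predicts the nearest class centroid (illustrative assumption)."""
    def __init__(self):
        self.means = {}                 # class -> running mean vector
        self.counts = defaultdict(int)  # class -> number of updates
    def fit_one(self, x, y):
        if y not in self.means:
            self.means[y] = list(x)
        n = self.counts[y] = self.counts[y] + 1
        self.means[y] = [m + (xi - m) / n for m, xi in zip(self.means[y], x)]
    def predict(self, x):
        if not self.means:
            return None
        return min(self.means,
                   key=lambda c: sum((xi - m) ** 2
                                     for xi, m in zip(x, self.means[c])))

def poisson1(rng):
    """Sample Poisson(lambda=1) via Knuth's method (online-bagging weight)."""
    limit, k, p = 2.718281828459045 ** -1, 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def process(stream, oracle, n_base=5, threshold=0.4, seed=0):
    """Selective sampling with self-labeling driven by ensemble inconsistency."""
    rng = random.Random(seed)
    ensemble = [OnlineCentroid() for _ in range(n_base)]
    classes_seen, queries = set(), 0
    for x in stream:
        votes = [v for v in (c.predict(x) for c in ensemble) if v is not None]
        # Force a query until both classes are seen, to curb the class bias
        # the abstract warns about (a simplistic stand-in for the paper's fix).
        need_query = len(classes_seen) < 2 or not votes
        if not need_query:
            majority, top = Counter(votes).most_common(1)[0]
            inconsistency = 1 - top / len(votes)
            need_query = inconsistency > threshold
        if need_query:
            y, queries = oracle(x), queries + 1   # request an expert label
        else:
            y = majority                          # self-label via majority vote
        classes_seen.add(y)
        # Online bagging: each learner sees the example Poisson(1) times.
        for clf in ensemble:
            for _ in range(poisson1(rng)):
                clf.fit_one(x, y)
    return ensemble, queries
```

Because the inconsistency check only triggers a query when the base learners disagree, the number of expert labels requested stays well below the stream length once the ensemble stabilizes, which is the labeling-cost saving the abstract targets.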