Active Label Correction

Published: 2012, Last Modified: 01 Dec 2025ICDM 2012EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: Active Label Correction (ALC) is an interactive method that cleans an established training set of mislabeled examples in conjunction with a domain expert. ALC presumes that the expert who conducts this review is either more accurate than the original annotator or has access to additional resources that ensure a high quality label. A high-cost re-review is possible because ALC proceeds iteratively, scoring the full training set but selecting only small batches of examples that are likely mislabeled. The expert reviews each batch and corrects any mislabeled examples, after which the classifier is retrained and the process repeats until the expert terminates it. We compare several instantiations of ALC to fully-automated methods that attempt to discard or correct label noise in a single pass. Our empirical results show that ALC outperforms single-pass methods in terms of selection efficiency and classifier accuracy. We evaluate the best ALC instantiation on our motivating task of detecting mislabeled and poorly formulated sites within a land cover classification training set from the geography domain.
Loading