Improving the detection of noisy labels in image datasets using modified Confidence Learning

Adam Popowicz, Krystian Radlak, Slawomir Lasota, Karolina Szczepankiewicz, Michal Szczepankiewicz

2022 (modified: 03 Nov 2022), MMAR 2022, Readers: Everyone

Abstract: The effectiveness of machine learning algorithms, including deep neural networks (DNNs) for classifying image data, depends on proper preparation of the training dataset. Erroneously labeled images in the training data degrade algorithmic efficiency and cause unpredictable model behavior, thereby reducing the model's safety. Verifying labels in the many available databases remains a complicated and laborious task. In this article, we present a MultiNET approach that allows for efficient verification of labeled image datasets. We adapt a state-of-the-art technique, namely Confidence Learning, extending its flexibility and improving its effectiveness by combining the outcomes of various DNN architectures. The proposed modification makes it possible to detect incorrect labels automatically while minimizing the number of false positives, which makes the verification process much less burdensome. The technique may be of use to researchers and software engineers dealing with externally supplied image datasets.
