Classification Method Utilizing Reliably Labeled Data

Published: 2008, Last Modified: 06 Feb 2025, KES (1) 2008, CC BY-SA 4.0
Abstract: Building an accurate classifier requires accurate labels, and accurate labeling requires domain knowledge, experience, and consistent criteria; in other words, it requires experts. In practice, having experts label all the data we need is often impossible because of the high cost, so we sometimes have to use ‘cheaper’ data labeled by non-experts. In such cases, experts’ and non-experts’ data are usually not distinguished during learning, even though mislabeled examples in the non-experts’ data may degrade the resulting classifier. In this paper, we propose a classification method utilizing reliably labeled data. We use prior knowledge of how reliable the labelers are, and set a degree of label confidence for each non-expert-labeled example based on neighboring, reliably labeled experts’ data. These degrees of confidence are reflected in learning, so that data with higher confidence contribute more to the classifier. Under these assumptions, experiments with publicly available data suggest that our method can produce a more precise classifier than the conventional method, which treats all data equally.
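The abstract does not give the confidence formula or the base classifier, so the following is only a minimal sketch of the general idea under stated assumptions. Here the confidence of a non-expert label is assumed to be the fraction of the k nearest expert-labeled points that share that label, and the confidence is used as a sample weight in a simple weighted nearest-centroid classifier. The function names (`label_confidence`, `weighted_centroids`, `predict`), the choice of k, and the toy data are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def label_confidence(x, label, expert_X, expert_y, k=3):
    """Assumed confidence rule: fraction of the k nearest expert points
    whose label agrees with the non-expert label."""
    order = sorted(range(len(expert_X)), key=lambda i: math.dist(x, expert_X[i]))
    nearest = order[:k]
    agree = sum(1 for i in nearest if expert_y[i] == label)
    return agree / len(nearest)


def weighted_centroids(X, y, w):
    """Per-class centroids where each point contributes in proportion to its weight."""
    sums = {}
    totals = defaultdict(float)
    for xi, yi, wi in zip(X, y, w):
        if yi not in sums:
            sums[yi] = [0.0] * len(xi)
        for d, v in enumerate(xi):
            sums[yi][d] += wi * v
        totals[yi] += wi
    # Skip classes whose total weight is zero to avoid division by zero.
    return {c: [s / totals[c] for s in sums[c]] for c in sums if totals[c] > 0}


def predict(centroids, x):
    """Assign x to the class with the closest weighted centroid."""
    return min(centroids, key=lambda c: math.dist(x, centroids[c]))


# Toy data (hypothetical): expert labels are fully trusted (weight 1.0).
expert_X = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (1.0, 1.0), (0.9, 1.1), (1.1, 0.8)]
expert_y = [0, 0, 0, 1, 1, 1]

# Non-expert data; the third label disagrees with its expert neighborhood.
non_X = [(0.1, 0.1), (1.0, 0.9), (0.15, 0.2)]
non_y = [0, 1, 1]

conf = [label_confidence(x, y, expert_X, expert_y) for x, y in zip(non_X, non_y)]

X = expert_X + non_X
y = expert_y + non_y
w = [1.0] * len(expert_X) + conf

centroids = weighted_centroids(X, y, w)
print(conf)
print(predict(centroids, (0.2, 0.2)), predict(centroids, (0.8, 0.9)))
```

In this toy setting the suspicious third non-expert example receives zero confidence and therefore does not shift the class 1 centroid, whereas a conventional approach that weights all data equally would let it pull that centroid toward the class 0 region. Any classifier that accepts per-sample weights could stand in for the nearest-centroid step.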