Abstract: Labeling each instance in a large-scale data set is extremely labor- and time-consuming. One way to alleviate this problem is active learning, which aims to identify the most valuable instances for labeling so that a powerful classifier with low generalization error can be built from few labels. Considering both informativeness and representativeness is a promising way to design practical active learning methods. However, most existing active learning methods select instances by favoring either informativeness or representativeness. Moreover, many are designed for binary classification and may therefore yield suboptimal solutions on data sets with multiple classes. In this paper, we propose a multi-class active learning approach based on a hybrid informative and representative criterion. We combine informativeness and representativeness into one formulation, which can be solved under a unified framework. Informativeness is measured by the minimum margin, while representativeness is measured by the maximum mean discrepancy. We generalize the loss risk minimization principle to the multi-class active learning setting, so the proposed method is suitable for both binary and multi-class problems. We conduct experiments on twelve benchmark UCI data sets, and the results demonstrate that the proposed method performs better than several state-of-the-art methods.
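To make the two criteria concrete, the following is a minimal Python sketch, not the paper's formulation. It assumes that informativeness is scored by the gap between the two largest predicted class probabilities and that representativeness is scored by a biased squared-MMD estimate with an RBF kernel. The function names (`margin`, `mmd2`, `select_query`), the trade-off weight `lam`, the kernel width `gamma`, and the greedy additive scoring rule are all illustrative assumptions; the paper instead solves a single unified optimization problem.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """RBF kernel matrix between the rows of a and the rows of b."""
    sq = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def margin(probs):
    """Gap between the two largest class probabilities per instance.

    A small gap means the classifier is uncertain, i.e. the instance
    is informative. Works for any number of classes >= 2.
    """
    ordered = np.sort(probs, axis=1)
    return ordered[:, -1] - ordered[:, -2]


def mmd2(x, y, gamma=1.0):
    """Biased estimate of the squared maximum mean discrepancy."""
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2.0 * rbf_kernel(x, y, gamma).mean())


def select_query(probs, labeled, pool, lam=1.0, gamma=1.0):
    """Pick the pool index with the lowest combined score (assumed rule).

    score_i = margin_i + lam * MMD^2(labeled + {x_i}, labeled + pool)
    """
    everything = np.vstack([labeled, pool])
    margins = margin(probs)
    scores = np.empty(len(pool))
    for i in range(len(pool)):
        candidate_set = np.vstack([labeled, pool[i:i + 1]])
        scores[i] = margins[i] + lam * mmd2(candidate_set, everything, gamma)
    return int(np.argmin(scores))
```

Here `probs` holds the current classifier's class probabilities for the pool, one row per unlabeled instance. Setting `lam=0` reduces the rule to pure margin sampling, while a large `lam` favors instances that make the labeled set resemble the overall data distribution.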