Multiple Instance Learning for Unilateral Data

Published: 01 Jan 2021, Last Modified: 01 Oct 2024, Venue: PAKDD (1) 2021, License: CC BY-SA 4.0
Abstract: Multi-instance learning (MIL) is a popular learning paradigm rooted in real-world applications. Recent studies have achieved strong performance when sufficient annotated data is available. In practice, however, acquiring enough labeled data is often hard, and only a small amount of labeled or partially labeled data is available. For example, in web text mining, the bags of interest (positive) are often rare compared with the unrelated (negative) and unlabeled ones. This leads to a new learning scenario with few negative bags and many unlabeled bags, which we call unilateral data. This learning problem has received little attention. In this paper, we propose a new method called Multiple Instance Learning for Unilateral Data (MILUD) to tackle it. To fully utilize the information in bags, we consider statistical characteristics and discriminative mapping information simultaneously. The key instances of bags are determined by the distinguishability of mapped samples based on fake labels. We also employ an empirical risk minimization loss function based on the mapping results to learn the optimal classifier, and we analyze its generalization error bound. Experimental results show that the method outperforms existing state-of-the-art methods.