Abstract: Under-sampling extensions of bagging are currently the most accurate ensembles specialized for class-imbalanced data. Nevertheless, since in this type of ensemble improved recognition of the minority class usually comes at the cost of decreased recognition of the majority classes, we introduce a new two-phase ensemble called Actively Balanced Bagging. The proposal is to first learn a bagging classifier and then iteratively improve it by updating its bootstraps with a limited number of learning examples. These examples are selected according to an active learning strategy that takes into account the decision margin of votes, the class distribution of the example in the training set and/or in its neighbourhood, and the prediction errors of component classifiers. Experiments with synthetic and real-world data confirm the usefulness of this proposal.
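The two-phase procedure described above can be sketched as follows. This is a minimal illustrative implementation, not the authors' algorithm: all names are hypothetical, and only the vote-margin and class-distribution criteria are modelled (the neighbourhood and prediction-error criteria from the abstract are omitted for brevity). Base learners are assumed to be shallow decision trees.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Small synthetic imbalanced problem: 90 majority and 10 minority examples.
X = np.vstack([rng.normal(0.0, 1.0, size=(90, 2)),
               rng.normal(2.0, 1.0, size=(10, 2))])
y = np.array([0] * 90 + [1] * 10)

B, K, ROUNDS = 11, 5, 3  # ensemble size, examples added per round, active rounds

def fit_ensemble(bootstraps):
    """Phase 1: ordinary bagging over the given bootstrap index arrays."""
    return [DecisionTreeClassifier(max_depth=3, random_state=b).fit(X[idx], y[idx])
            for b, idx in enumerate(bootstraps)]

def vote_margin(models, X):
    """Margin of the ensemble vote: 0 = maximal disagreement, 1 = unanimous."""
    share = np.mean([m.predict(X) for m in models], axis=0)  # share of class-1 votes
    return np.abs(2 * share - 1)

bootstraps = [rng.integers(0, len(X), size=len(X)) for _ in range(B)]
models = fit_ensemble(bootstraps)

# Phase 2: iteratively add low-margin, preferably minority, examples
# to every bootstrap and retrain the component classifiers.
for _ in range(ROUNDS):
    margin = vote_margin(models, X)
    score = margin + (y == 0)  # rank uncertain minority examples first
    chosen = np.argsort(score)[:K]
    bootstraps = [np.concatenate([idx, chosen]) for idx in bootstraps]
    models = fit_ensemble(bootstraps)

print(len(models), len(bootstraps[0]))
```

After three active rounds each bootstrap has grown by `ROUNDS * K` indices; in practice the selection score would also incorporate the neighbourhood analysis and component-classifier errors mentioned in the abstract.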