An online ensemble classification algorithm for multi-class imbalanced data stream

Published: 01 Jan 2024, Last Modified: 19 Feb 2025. Knowl. Inf. Syst. 2024. License: CC BY-SA 4.0
Abstract: In real-world scenarios, data streams frequently contain multiple classes with significant imbalances in instance counts across classes, for example in network intrusion detection and fault diagnosis. Classifiers typically display a bias towards the majority classes, neglecting valuable information in minority-class instances. Moreover, as the data stream continues to arrive, the class imbalance ratio may vary. Concept drift is also prevalent in data streams and may coincide with changes in class proportions, further increasing the difficulty of learning from the stream. To address these issues, an adaptive online bagging (AdaOB) ensemble classification algorithm is proposed. First, a self-adaptive sampling strategy based on class proportions and classification performance is introduced to dynamically increase the exposure rate of minority-class instances. Second, two ensemble update strategies are combined to mitigate the performance loss of the ensemble model during concept drift and to facilitate rapid recovery afterwards. Finally, a time decay-based weighted ensemble strategy is proposed to allocate weights effectively among the base classifiers. The experimental results indicate that AdaOB performs well on various types of imbalanced data streams and outperforms the compared state-of-the-art algorithms.
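The abstract does not give AdaOB's formulas, so the following is only a minimal sketch of the general idea of class-proportion-aware online bagging, not the authors' method. Standard online bagging trains each base classifier on an incoming instance k times, with k drawn from Poisson(1); the sketch raises that rate for rarer classes. The names `DecayedClassTracker`, `sampling_rate`, and `weighted_vote`, the ratio-based rate formula, the `cap` parameter, and the age-based decay weights are all assumptions made for illustration. The abstract's performance-based component of the sampling strategy is not modeled here.

```python
import math
import random


class DecayedClassTracker:
    """Time-decayed class frequencies (hypothetical helper, not from the paper)."""

    def __init__(self, decay=0.99):
        self.decay = decay
        self.weights = {}

    def update(self, label):
        # Older observations fade so the proportions follow a changing imbalance ratio.
        for key in self.weights:
            self.weights[key] *= self.decay
        self.weights[label] = self.weights.get(label, 0.0) + 1.0

    def proportion(self, label):
        total = sum(self.weights.values())
        return self.weights.get(label, 0.0) / total if total > 0 else 0.0


def sampling_rate(tracker, label, base_rate=1.0, cap=10.0):
    """Assumed rule: Poisson rate grows as the class gets rarer, up to `cap`."""
    p = tracker.proportion(label)
    if p == 0.0:
        return cap  # unseen class: give it the maximum exposure
    largest = max(tracker.proportion(key) for key in tracker.weights)
    return min(cap, base_rate * largest / p)


def poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def weighted_vote(votes, decay=0.9):
    """Combine (predicted_label, age) pairs; assumed weight is decay ** age."""
    scores = {}
    for label, age in votes:
        scores[label] = scores.get(label, 0.0) + decay ** age
    return max(scores, key=scores.get)


if __name__ == "__main__":
    rng = random.Random(0)
    tracker = DecayedClassTracker(decay=0.99)
    for label in ["normal"] * 50 + ["attack"]:
        tracker.update(label)
        lam = sampling_rate(tracker, label)
        k = poisson(lam, rng)  # number of times a base learner would train on this instance
    print("rate for minority class:", sampling_rate(tracker, "attack"))
```

In a full ensemble, each base classifier would draw its own `k` per instance, and the drift-handling update strategies mentioned in the abstract would decide when classifiers are replaced or reset; neither is shown here.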