Abstract: Real-world data streams are often multi-class, with instance counts that differ sharply across classes, as in network intrusion detection and fault diagnosis. Classifiers tend to be biased towards the majority class and neglect the valuable information carried by minority-class instances. Moreover, the class imbalance ratio may change as the stream continues to arrive. Concept drift is also prevalent in data streams and may coincide with changes in class proportions, which makes learning from the stream harder still. To address these issues, an adaptive online bagging (AdaOB) ensemble classification algorithm is proposed. First, a self-adaptive sampling strategy based on class proportions and classification performance dynamically increases the exposure rate of minority-class instances. Second, two ensemble update strategies are combined to mitigate the performance loss of the ensemble during concept drift and to speed recovery afterwards. Finally, a time-decay-based weighted ensemble strategy allocates weights among the base classifiers. Experimental results indicate that AdaOB performs well on various types of imbalanced data streams and outperforms state-of-the-art algorithms.
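The idea of raising the exposure rate of minority-class instances in online bagging can be illustrated with a minimal sketch. This is not the AdaOB sampling rule, which the abstract does not specify; it assumes a simpler scheme in which each incoming instance is presented to a base classifier k times, with k drawn from a Poisson distribution whose rate grows as the time-decayed proportion of the instance's class shrinks. The names `ClassProportionTracker`, `rate`, `poisson`, and the parameters `decay` and `max_rate` are hypothetical, and the classification-performance term mentioned in the abstract is omitted.

```python
import math
import random
from collections import defaultdict


class ClassProportionTracker:
    """Time-decayed estimate of class proportions in a stream (illustrative only)."""

    def __init__(self, decay=0.99):
        self.decay = decay  # assumed forgetting factor, not taken from the paper
        self.proportions = defaultdict(float)

    def update(self, label):
        # Register the label first so it is decayed together with known classes.
        self.proportions[label] += 0.0
        for c in self.proportions:
            target = 1.0 if c == label else 0.0
            self.proportions[c] = (
                self.decay * self.proportions[c] + (1.0 - self.decay) * target
            )

    def rate(self, label, max_rate=10.0):
        # Assumed rule: Poisson rate = (largest class proportion) / (this class's
        # proportion), capped at max_rate. The current majority class gets rate 1.
        p = self.proportions.get(label, 0.0)
        if p <= 0.0:
            return 1.0
        largest = max(self.proportions.values())
        return min(max_rate, largest / p)


def poisson(lam, rng):
    """Draw from Poisson(lam) using Knuth's multiplication method (small lam)."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


if __name__ == "__main__":
    rng = random.Random(0)
    tracker = ClassProportionTracker()
    # Synthetic stream: class 1 appears once in every 20 instances.
    for i in range(1000):
        label = 1 if i % 20 == 0 else 0
        tracker.update(label)
        k = poisson(tracker.rate(label), rng)
        # A base classifier would be trained k times on this instance here.
    print("rate for class 0:", tracker.rate(0))
    print("rate for class 1:", tracker.rate(1))
```

Under these assumptions, majority-class instances are sampled as in standard online bagging (rate 1), while minority-class instances are replayed more often; the cap `max_rate` keeps a very rare class from dominating training.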