Abstract: Concept drift, characterized by changes in data distribution over time, is an inevitable problem in nonstationary data stream environments. Multistream scenarios are particularly complex because interstream correlations can themselves change, which makes concept drift across multiple streams significantly harder to address. Most existing adaptation methods target single-stream data, and research on multistream settings is limited. To address these gaps, we propose a Continuous Graph Learning-based self-adaptation framework for Multistream concept drift, termed CGLM. Our framework introduces a novel graph neural network (GNN) structure embedded with a dynamic graph generator (AGG). During the training phase, this generator creates an adaptive correlation graph from small-scale historical data, capturing spatio-temporal dependencies among streams without predefined graphs. A base prediction GNN model is then initialized. Once online testing starts, real-time performance is monitored to detect concept drift. Self-adaptation is achieved by subgraph updating, with different continuous graph learning mechanisms applied to nondrift and drift scenarios. Under nondrift, a lightweight adjustment of subgraphs is performed. When drift occurs, AGG generates a new dynamic graph based on newly arriving samples. Our adaptive diffusion graph attention module (ADGAT) captures local correlation changes caused by the drift in the newly generated dynamic graph and adaptively updates the weights of the original correlation graph based on the extent of the drift. Experimental results on three large-scale real-world datasets demonstrate the superiority of our method over all baseline methods. Additionally, when large-scale data is available for training, our proposed CGLM still surpasses baseline methods.
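The monitor-then-adapt loop described in the abstract can be illustrated with a minimal sketch. This is not the CGLM implementation: the abstract does not specify the drift test or the ADGAT update rule, so the sliding-window error ratio, the `ErrorDriftMonitor` class, and the linear blend in `blend_graphs` are all hypothetical stand-ins chosen only to show the idea of scaling a correlation-graph update by the extent of the drift.

```python
from collections import deque
from statistics import mean


class ErrorDriftMonitor:
    """Hypothetical drift test: compare recent mean error to a reference level."""

    def __init__(self, window=5, ratio=1.5):
        self.window = window              # number of recent errors to average
        self.ratio = ratio                # error increase that counts as drift
        self.reference = None             # mean error of the first full window
        self.recent = deque(maxlen=window)

    def update(self, error):
        """Record one prediction error; return a drift extent in [0, 1]."""
        self.recent.append(error)
        if len(self.recent) < self.window:
            return 0.0                    # not enough evidence yet
        current = mean(self.recent)
        if self.reference is None:
            self.reference = current      # first full window sets the baseline
            return 0.0
        if current <= self.ratio * self.reference:
            return 0.0                    # treated as the nondrift case
        # Relative error increase over the baseline, clipped to 1.
        baseline = max(self.reference, 1e-12)
        return min(1.0, (current - baseline) / baseline)


def blend_graphs(old_graph, new_graph, extent):
    """Assumed update rule: move edge weights toward the new graph by `extent`."""
    return [
        [(1.0 - extent) * o + extent * n for o, n in zip(old_row, new_row)]
        for old_row, new_row in zip(old_graph, new_graph)
    ]


if __name__ == "__main__":
    monitor = ErrorDriftMonitor(window=3, ratio=1.5)
    graph = [[1.0, 0.0], [0.0, 1.0]]      # original correlation graph (toy values)
    fresh = [[0.0, 1.0], [1.0, 0.0]]      # graph regenerated from new samples (toy values)
    for err in [1.0, 1.0, 1.0, 2.0, 2.0, 2.0]:
        extent = monitor.update(err)
        if extent > 0.0:
            graph = blend_graphs(graph, fresh, extent)
    print(graph)
```

In this sketch an extent of zero leaves the graph unchanged, whereas the abstract describes a lightweight subgraph adjustment in the nondrift case; that step is omitted here because its details are not given.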
External IDs:doi:10.1109/tcyb.2025.3569816