Online Learning from Mix-typed, Drifted, and Incomplete Streaming Features

Shengda Zhuo, Di Wu, Yi He, Shuqiang Huang, Xindong Wu

Published: 30 Sept 2025, Last Modified: 21 Nov 2025 | ACM Transactions on Knowledge Discovery from Data | CC BY-SA 4.0
Abstract: Online learning in which the feature space can change over time offers a flexible learning paradigm that has attracted considerable attention. However, it still faces three significant challenges. First, real-world data streams are heterogeneous and contain mixed feature types, which makes traditional parametric modeling difficult. Second, data stream distributions can shift over time, causing an abrupt and substantial decline in model performance. Third, time and cost constraints make it infeasible to label every data instance in a supervised setting. To overcome these challenges, we propose a new algorithm, Online Learning from Mix-typed, Drifted, and Incomplete Streaming Features (OL-MDISF), which relaxes restrictions on feature types, data distribution, and supervision information. Our approach uses copula models to construct a comprehensive latent space, employs an adaptive sliding window to detect drift points and ensure model stability, and establishes label proximity information based on geometric structural relationships. To demonstrate the model’s efficiency and effectiveness, we provide theoretical analysis and comprehensive experimental results.
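As a rough illustration of the copula idea mentioned in the abstract, the sketch below maps one feature column to Gaussian latent scores through its empirical CDF, which is the standard first step of a Gaussian copula. It is not the OL-MDISF implementation: the function name `normal_scores`, the average-rank tie handling, and the `n + 1` scaling are assumptions chosen for this example.

```python
from statistics import NormalDist


def normal_scores(values):
    """Map numeric values to standard-normal scores via the empirical CDF.

    Hypothetical helper for illustration only; ties (common in ordinal
    features) share the average of their ranks.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j over the run of values tied with position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # Dividing by n + 1 keeps every probability strictly inside (0, 1).
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


continuous = normal_scores([3.0, 1.0, 2.0])
ordinal = normal_scores([1, 1, 2])
```

Because the transform depends only on ranks, continuous and ordinal columns end up on a common scale, which is one reason copula models are a natural fit for mixed feature types.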
External IDs: doi:10.1145/3744712