TL;DR: We propose a deep streaming view clustering method that mitigates the effect of concept drift between streaming views and, in our experiments, outperforms static deep multi-view clustering baselines.
Abstract: Existing deep multi-view clustering (DMVC) methods have demonstrated excellent performance while addressing issues such as missing views and view noise. However, almost all of them operate within a static framework that assumes all views have already been collected. In practical scenarios, new views are continuously collected over time, forming a stream of views. Moreover, data quality and distribution differ across the views in the stream, i.e., the concept drift problem. To this end, we propose a novel Deep Streaming View Clustering (DSVC) method, which mitigates the impact of concept drift on streaming view clustering. Specifically, DSVC consists of a knowledge base and three core modules. First, the knowledge aggregation learning module extracts representative features and prototype knowledge from the new view. Next, the distribution consistency learning module aligns the prototype knowledge from the current view with the historical knowledge distribution to mitigate the impact of concept drift. Then, the knowledge guidance learning module leverages the prototype knowledge to guide the data distribution and enhance the clustering structure. Finally, the prototype knowledge from the current view is written to the knowledge base to guide the learning of subsequent views. Extensive experiments demonstrate that, even in dynamic environments, DSVC outperforms 12 state-of-the-art DMVC methods that operate under static frameworks.
Lay Summary: Existing deep multi-view clustering methods typically assume that all view data are fully collected before training. In contrast, we aim to achieve clustering in dynamic scenarios, where multi-view data are continuously collected over time and need to be processed in a timely manner. Because the data distribution differs from view to view, these inter-view distribution shifts must be mitigated to ensure effective multi-view clustering.
We first design a historical knowledge base to store prototype knowledge extracted from previously collected views. Subsequently, we introduce a knowledge extraction module that derives representative prototype knowledge from the current view to capture its underlying distribution. Finally, we align the prototype knowledge extracted from the current view with that stored in the historical knowledge base. The aligned prototypes are then employed to guide the distribution of samples in the current view. Through this process, each collected view exhibits distributional consistency, and each sample preserves intra-class commonality while maintaining inter-class diversity.
Our study achieves streaming view clustering in dynamic environments. Experimental results demonstrate the effectiveness of the proposed method, which highlights its significance for advancing multi-view clustering research.
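As a rough illustration of the loop described above (extract prototypes from the incoming view, align them with the knowledge base, use them to guide sample assignment, then update the knowledge base), the following is a minimal, non-deep sketch. It is not the DSVC implementation: plain k-means stands in for the knowledge aggregation module, greedy nearest-prototype matching stands in for distribution consistency learning, and a moving-average update of the knowledge base is an assumption. All function names and the `momentum` parameter are hypothetical, and the views are assumed to be already embedded in a shared feature space.

```python
import random

def nearest(point, prototypes):
    """Index of the prototype closest to `point` (squared Euclidean distance)."""
    return min(range(len(prototypes)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(point, prototypes[j])))

def extract_prototypes(features, k, iters=20, seed=0):
    """Stand-in for knowledge aggregation: plain k-means centroids (assumption)."""
    rng = random.Random(seed)
    prototypes = [list(p) for p in rng.sample(features, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in features:
            groups[nearest(x, prototypes)].append(x)
        for j, group in enumerate(groups):
            if group:  # keep the old prototype if a cluster ends up empty
                prototypes[j] = [sum(col) / len(group) for col in zip(*group)]
    return prototypes

def align_to_history(current, history):
    """Stand-in for distribution consistency learning (assumption):
    greedily reorder `current` so that index j matches historical prototype j."""
    remaining = list(current)
    aligned = []
    for h in history:
        aligned.append(remaining.pop(nearest(h, remaining)))
    return aligned

def cluster_stream(views, k, momentum=0.5):
    """Process views one at a time, yielding cluster labels for each view."""
    knowledge_base = None  # prototypes carried over from earlier views
    for features in views:
        prototypes = extract_prototypes(features, k)
        if knowledge_base is None:
            knowledge_base = prototypes
        else:
            prototypes = align_to_history(prototypes, knowledge_base)
            # Hypothetical update rule: moving average of old and new prototypes.
            knowledge_base = [
                [momentum * h + (1 - momentum) * c for h, c in zip(hp, cp)]
                for hp, cp in zip(knowledge_base, prototypes)
            ]
        # Guidance step: assign each sample to its nearest aligned prototype.
        yield [nearest(x, prototypes) for x in features]

# Two views of the same six samples; the second view is shifted (drift).
view1 = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
view2 = [[0.4, 0.3], [0.5, 0.3], [0.4, 0.4], [5.4, 5.3], [5.5, 5.3], [5.4, 5.4]]
labels_per_view = list(cluster_stream([view1, view2], k=2))
```

Because the prototypes of each new view are reordered to match the knowledge base before samples are assigned, cluster indices keep the same meaning across the stream, which is the property the alignment step is meant to provide.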
Primary Area: General Machine Learning->Clustering
Keywords: Streaming view clustering, Multi-view clustering, Concept drift problem, Distribution consistency learning
Submission Number: 7086