An Efficient Fuzzy Stream Clustering Method Based on Granular-Ball Structure

Published: 2024, Last Modified: 22 Jan 2026, Venue: ICDE 2024, License: CC BY-SA 4.0
Abstract: Current data stream clustering algorithms are inefficient in both the online and offline phases, and they struggle with the cluster boundary overlap caused by concept drift. In the online phase, most existing algorithms must scan each newly arriving sample and insert it into the appropriate micro-cluster. In the offline phase, they typically require all sample points as input. To tackle these challenges, we use a granular-ball structure as a coarse-grained representation of the data stream, which eliminates the need for computations on all data points in both phases. We additionally introduce fuzziness into the granular-ball structure to resolve the cluster boundary overlap caused by concept drift. Experimental results on both synthetic and real-world datasets demonstrate that our approach clusters efficiently and accurately compared with existing data stream clustering algorithms. Our source code is publicly available at https://github.com/xjnine/GBFuzzyStream.
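The abstract does not give the construction itself, so the following is only an illustrative sketch of the two ideas it names: summarizing points with balls, and assigning soft (fuzzy) memberships where balls meet. Everything in it is an assumption made for illustration, not the authors' algorithm or the API of their repository: the names `GranularBall`, `make_ball`, and `fuzzy_memberships` are hypothetical, the radius is taken as the mean distance to the center, and the membership rule is borrowed from fuzzy c-means with fuzzifier `m`.

```python
import math
from dataclasses import dataclass


@dataclass
class GranularBall:
    """Coarse summary of a group of points (hypothetical structure)."""
    center: tuple
    radius: float
    size: int


def make_ball(points):
    """Summarize points by their mean and mean distance to that mean.

    Assumption: mean-distance radius; other definitions (e.g. max distance)
    are equally plausible.
    """
    n = len(points)
    dim = len(points[0])
    center = tuple(sum(p[k] for p in points) / n for k in range(dim))
    radius = sum(math.dist(p, center) for p in points) / n
    return GranularBall(center, radius, n)


def fuzzy_memberships(x, balls, m=2.0):
    """Soft assignment of point x to each ball, summing to 1.

    Assumption: fuzzy c-means style weights computed on the gap between
    x and each ball's surface rather than on the distance to its center.
    """
    gaps = [max(math.dist(x, b.center) - b.radius, 0.0) for b in balls]
    inside = [g == 0.0 for g in gaps]
    if any(inside):
        # x lies in one or more balls: share membership among those only.
        k = sum(inside)
        return [1.0 / k if flag else 0.0 for flag in inside]
    weights = [g ** (-2.0 / (m - 1.0)) for g in gaps]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    balls = [
        make_ball([(0, 0), (0, 2), (2, 0), (2, 2)]),
        make_ball([(10, 10), (10, 12), (12, 10), (12, 12)]),
    ]
    for point in [(1, 1), (3, 3), (6, 6)]:
        print(point, fuzzy_memberships(point, balls))
```

In this sketch only the ball summaries are consulted when a point is assigned, which mirrors the abstract's claim that computations on all data points can be avoided; a point midway between two balls receives split membership instead of a hard label.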