Streaming Hierarchical Clustering for Emerging New Class

Published: 2025, Last Modified: 07 Jan 2026, KSEM (1) 2025, CC BY-SA 4.0
Abstract: Streaming hierarchical clustering (SHC) is fundamental to real-time analysis of streaming data, and research has focused on how well it adapts its clusters under concept drift. Traditional methods such as Agglomerative Hierarchical Clustering (AHC) struggle to update clusters dynamically when they encounter previously unseen classes, which degrades clustering quality. We present SHCRI (Streaming Hierarchical Clustering with Root-Insert), an algorithm that uses a root-insert strategy to adapt the cluster tree to points from emerging new classes in a data stream. The approach places subtrees of an emerging new class at the nodes that maximize overall tree purity. We provide a theoretical analysis that guarantees SHCRI’s superior performance over existing methods in handling emerging new classes. Experimental results confirm that SHCRI outperforms existing streaming AHC algorithms, consistently maintaining higher dendrogram purity during periods in which new classes emerge.
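The abstract evaluates cluster trees by dendrogram purity. As a rough illustration of that metric only, the sketch below computes it for small trees using the commonly used definition: for every pair of points with the same label, take the leaves under their lowest common ancestor and measure the fraction that share that label, then average over all such pairs. This is not the SHCRI algorithm; the nested-tuple tree encoding, the function names, and the toy labels are assumptions made for this example.

```python
from itertools import combinations


def leaves(tree):
    """Return the leaf ids under a nested-tuple tree (a leaf is any non-tuple)."""
    if not isinstance(tree, tuple):
        return [tree]
    out = []
    for child in tree:
        out.extend(leaves(child))
    return out


def lca_leaves(tree, a, b):
    """Return the leaves under the lowest common ancestor of leaves a and b."""
    node = tree
    while isinstance(node, tuple):
        for child in node:
            under = set(leaves(child))
            if a in under and b in under:
                node = child  # descend: this child still contains both points
                break
        else:
            break  # no single child contains both, so node is the LCA
    return leaves(node)


def dendrogram_purity(tree, labels):
    """Average, over same-label leaf pairs, of the label purity at their LCA."""
    total, count = 0.0, 0
    for a, b in combinations(leaves(tree), 2):
        if labels[a] != labels[b]:
            continue
        under = lca_leaves(tree, a, b)
        total += sum(labels[x] == labels[a] for x in under) / len(under)
        count += 1
    return total / count if count else 1.0


# Toy data (hypothetical): classes A and B are known, class C emerges later.
labels = {0: "A", 1: "A", 2: "B", 3: "B", 4: "C", 5: "C"}

# New-class points kept together in a subtree attached near the root.
near_root = (((0, 1), (2, 3)), (4, 5))

# New-class points absorbed into the existing A and B clusters.
absorbed = (((0, 4), 1), ((2, 5), 3))

print(dendrogram_purity(near_root, labels))
print(dendrogram_purity(absorbed, labels))
```

On this toy data the first tree scores 1.0 and the second scores 5/9, which shows why scattering points of a new class across existing clusters lowers purity while keeping them in their own subtree does not.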