A Nonparametric Split and Kernel-Merge Clustering Algorithm

Published: 01 Jan 2024, Last Modified: 19 Feb 2025. Venue: IEEE Trans. Artif. Intell. 2024. License: CC BY-SA 4.0
Abstract: This work proposes split and kernel-merge clustering (S-KMC), a novel nonparametric clustering algorithm that combines the strengths of hierarchical, partitional, and density-based clustering. It consists of two main phases: splitting and merging. In the splitting phase, a ranking-based operator divides the data into optimal subclusters. In the merging phase, the subclusters are projected onto a straight line passing through their centers, and a kernel function estimates their density along that line to guide the merging operation. S-KMC is fully nonparametric, eliminating the need for prior information about the data. It effectively handles 1) shape diversity, 2) density variability, 3) high dimensionality, 4) outliers, and 5) missing values. The algorithm offers easily tunable hyperparameters, enhancing its applicability to complex problems and its robustness against data anomalies. Experimental analysis on 21 benchmark datasets demonstrates the improved performance of S-KMC in terms of cluster accuracy, handling of high-dimensional data, and management of data anomalies and outliers. Comprehensive comparisons with state-of-the-art techniques further validate the superior or comparable performance of the proposed S-KMC algorithm.
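The merging idea described in the abstract (project two subclusters onto the line through their centers, then estimate density along that line with a kernel) can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian kernel, the fixed bandwidth, the valley-ratio merge rule, the 0.5 threshold, and every function name below are assumptions chosen only for illustration.

```python
import math
import random


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n, dim = len(points), len(points[0])
    return [sum(p[d] for p in points) / n for d in range(dim)]


def project_onto_center_line(points, c_a, c_b):
    """Scalar position of each point along the line c_a -> c_b.

    0 corresponds to c_a and 1 to c_b. Assumes the two centers differ.
    """
    direction = [b - a for a, b in zip(c_a, c_b)]
    norm_sq = sum(d * d for d in direction)
    return [
        sum((x - a) * d for x, a, d in zip(p, c_a, direction)) / norm_sq
        for p in points
    ]


def gaussian_kde(samples, bandwidth):
    """Return a 1-D Gaussian kernel density estimate as a function."""
    coef = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(t):
        return coef * sum(
            math.exp(-0.5 * ((t - s) / bandwidth) ** 2) for s in samples
        )

    return density


def valley_ratio(sub_a, sub_b, bandwidth=0.1, grid=51):
    """Lowest density between the centers, relative to the lower center density.

    The bandwidth is expressed in units of the center-to-center distance
    (an assumption of this sketch, not a value from the paper).
    """
    c_a, c_b = centroid(sub_a), centroid(sub_b)
    t = project_onto_center_line(sub_a + sub_b, c_a, c_b)
    density = gaussian_kde(t, bandwidth)
    values = [density(i / (grid - 1)) for i in range(grid)]
    return min(values) / min(values[0], values[-1])


def should_merge(sub_a, sub_b, threshold=0.5, bandwidth=0.1):
    """Hypothetical rule: merge when no deep density valley separates the centers."""
    return valley_ratio(sub_a, sub_b, bandwidth) >= threshold


def make_blob(rng, center, n=200, std=1.0):
    """Synthetic isotropic Gaussian subcluster, used only for the demo."""
    return [tuple(rng.gauss(c, std) for c in center) for _ in range(n)]


rng = random.Random(0)
near_a, near_b = make_blob(rng, (0.0, 0.0)), make_blob(rng, (2.0, 0.0))
far_a, far_b = make_blob(rng, (0.0, 0.0)), make_blob(rng, (10.0, 0.0))

print("overlapping subclusters merge:", should_merge(near_a, near_b))
print("well-separated subclusters merge:", should_merge(far_a, far_b))
```

Reducing each pair of subclusters to a one-dimensional projection keeps the density estimate cheap regardless of the original dimensionality, which is one plausible reading of why the abstract pairs the projection with a kernel estimate.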