Abstract: Density-based clustering algorithms such as DBSCAN are highly effective but sensitive to parameter selection, particularly the neighborhood radius (ε) and the minimum number of neighboring points required to form a cluster (minPts). We investigate the influence of these parameter settings on the clustering outcome through the lens of persistent homology, a technique from topological data analysis. Persistent homology tracks topological features, such as connected components and loops, across multiple spatial scales, which we exploit to improve clustering accuracy and robustness. We use the density-connectivity distance, a recent development in the field, to fully automate our approach. In extensive experiments, we demonstrate how insights from persistent homology help to identify optimal parameter values, and we introduce an approach that automates parameter selection for density-based clustering. The proposed technique allows DBSCAN and related algorithms to perform effectively on a wide variety of datasets without any user input. It combines topological insights with clustering techniques to provide a foundation for robust, automated approaches to complex data analysis.
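To make the link between connected components and the radius ε concrete, the following is a minimal sketch and not the method of the paper. It assumes plain Euclidean distance rather than the density-connectivity distance, and it treats every point as a core point, so that clusters at radius ε are simply the connected components of the ε-neighborhood graph. Under these assumptions, the scales at which components merge (the "death" values of 0-dimensional persistent homology) are the edge lengths of a minimum spanning tree. The function names `h0_death_times`, `suggest_eps`, and `n_components`, as well as the largest-gap heuristic, are hypothetical and chosen only for illustration.

```python
import math


def h0_death_times(points):
    """Sorted merge scales of connected components (MST edge lengths).

    Assumption: Euclidean distance; computed with Prim's algorithm in O(n^2).
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known distance from each point to the tree
    best[0] = 0.0
    deaths = []
    for step in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if step > 0:
            deaths.append(best[u])  # two components merge at this scale
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return sorted(deaths)


def suggest_eps(deaths):
    """Hypothetical heuristic: midpoint of the largest gap between merge scales."""
    if len(deaths) < 2:
        raise ValueError("need at least two merge scales")
    gap, i = max((deaths[k + 1] - deaths[k], k) for k in range(len(deaths) - 1))
    return 0.5 * (deaths[i] + deaths[i + 1])


def n_components(deaths, eps):
    """Number of connected components that survive at radius eps."""
    return 1 + sum(1 for d in deaths if d > eps)


# Toy data: two tight groups that lie far apart.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
deaths = h0_death_times(points)
eps = suggest_eps(deaths)
print(eps, n_components(deaths, eps))
```

A large gap between consecutive merge scales marks a range of ε over which the number of components stays constant. That stability is the intuition behind reading parameter values off a persistence diagram. Handling minPts greater than one would require a density-aware distance, which this sketch deliberately leaves out.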
External IDs: doi:10.3233/faia251192