Abstract: DBSCAN is a widely used density-based clustering algorithm that efficiently identifies clusters of uniform density. However, when clusters differ in density, the clustering results may be poor. To address this issue, we propose a multi-density DBSCAN based on relative density (MDBSCAN), which achieves better results on datasets containing clusters of multiple densities. The intuition behind our work is simple but effective: we first divide the dataset into a low-density part and a high-density part, and then apply a divide-and-conquer strategy that handles each part separately so that the two do not interfere with each other. Specifically, the proposed MDBSCAN consists of three steps: (i) extract the low-density data points in the dataset by relative density; (ii) find natural clusters among the identified low-density data points; and (iii) cluster the remaining data points (all data points except those in the natural clusters) using DBSCAN and assign the noise points generated by DBSCAN to their nearest clusters. To verify the effectiveness of the proposed MDBSCAN algorithm, we conduct experiments on ten synthetic datasets and six real-world datasets. Experimental results demonstrate that, in most cases, MDBSCAN outperforms the original DBSCAN and six extensions of DBSCAN, including two state-of-the-art algorithms (DRL-DBSCAN and AMD-DBSCAN).
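The three steps above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the abstract does not define relative density, so the sketch assumes a stand-in (a point's k-nearest-neighbour density divided by the largest such density in the dataset). The function names `knn_mean_dist`, `dbscan`, and `mdbscan_sketch`, the parameters `k`, `low_ratio`, `eps_low`, and `eps_high`, and the rule of assigning each noise point to the label of its nearest clustered point are likewise hypothetical choices made for illustration.

```python
import math
from collections import deque

def knn_mean_dist(points, k):
    """Mean distance from each point to its k nearest neighbours."""
    out = []
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        out.append(sum(d[:k]) / k)
    return out

def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, with -1 meaning noise."""
    n = len(points)
    nbrs = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
            for i in range(n)]
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(nbrs[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        labels[i] = cluster
        queue = deque(nbrs[i])
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(nbrs[j]) >= min_pts:  # expand only through core points
                queue.extend(nbrs[j])
        cluster += 1
    return labels

def mdbscan_sketch(points, k, low_ratio, eps_low, eps_high, min_pts):
    # (i) ASSUMED relative density: kNN density divided by the maximum density.
    dens = [1.0 / (d + 1e-12) for d in knn_mean_dist(points, k)]
    top = max(dens)
    low_idx = [i for i, d in enumerate(dens) if d / top < low_ratio]

    # (ii) Look for natural clusters among the low-density points only.
    final = [-1] * len(points)
    low_labels = dbscan([points[i] for i in low_idx], eps_low, min_pts)
    for i, lab in zip(low_idx, low_labels):
        final[i] = lab
    offset = max(low_labels, default=-1) + 1

    # (iii) Run DBSCAN on everything not placed in a natural cluster.
    rest = [i for i in range(len(points)) if final[i] == -1]
    rest_labels = dbscan([points[i] for i in rest], eps_high, min_pts)
    for i, lab in zip(rest, rest_labels):
        if lab != -1:
            final[i] = lab + offset

    # ASSUMED rule: each noise point takes the label of its nearest clustered point.
    clustered = [i for i, lab in enumerate(final) if lab != -1]
    for i, lab in enumerate(list(final)):
        if lab == -1 and clustered:
            nearest = min(clustered, key=lambda j: math.dist(points[i], points[j]))
            final[i] = final[nearest]
    return final

# Toy data: a dense 5x5 grid, a sparse 4x4 grid, and one stray point.
dense = [(0.1 * i, 0.1 * j) for i in range(5) for j in range(5)]
sparse = [(10.0 + i, 10.0 + j) for i in range(4) for j in range(4)]
points = dense + sparse + [(0.8, 0.2)]
labels = mdbscan_sketch(points, k=4, low_ratio=0.5,
                        eps_low=1.5, eps_high=0.15, min_pts=3)
```

On this toy data the sparse grid is recovered as a natural cluster in step (ii), the dense grid is clustered in step (iii), and the stray point, which DBSCAN marks as noise, is attached to the nearby dense cluster. The pairwise-distance loops are quadratic and only suitable for small examples.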