ABACUS: Mining Arbitrary Shaped Clusters from Large Datasets based on Backbone Identification

Published: 2011, Last Modified: 12 May 2023, Venue: SDM 2011, Readers: Everyone
Abstract: A wide variety of clustering algorithms exist that cater to applications based on certain special characteristics of the data. Our focus is on methods that capture arbitrarily shaped clusters in data, the so-called spatial clustering algorithms. With the growing size of spatial datasets from diverse sources, the need for scalable algorithms is paramount. We propose a shape-based clustering algorithm, ABACUS, that scales to large datasets. ABACUS is based on the idea of identifying the intrinsic structure of each cluster, which we also refer to as the backbone of that cluster. The backbone comprises a much smaller set of points than the original data, which gives the method the desired ability to scale to larger datasets. ABACUS operates in two stages. In the first stage, we identify the backbone of each cluster via an iterative process made up of globbing (or point merging) and point movement operations. In the second stage, the backbone enables easy identification of the true clusters. Experiments on a range of real (images from geospatial satellites, etc.) and synthetic datasets demonstrate the efficiency and effectiveness of our approach. In particular, ABACUS is over an order of magnitude faster than existing shape-based clustering methods, yet it provides comparable or better clustering quality.