Micro-cluster Structure Clustering Based on Weight-Constrained Minimum Spanning Tree

Published: 2025, Last Modified: 23 Jan 2026. Data Sci. Eng. 2025. CC BY-SA 4.0
Abstract: Designing clustering algorithms that handle arbitrarily distributed data remains a challenging problem. In this paper, we propose a Micro-cluster Structure Clustering algorithm based on the Weight-constrained Minimum Spanning Tree, called MSC-WMST. First, the original data is standardized and rescaled, and each feature dimension is partitioned into intervals of unit length. Specified regions are separated within these intervals based on a given threshold, and data is sampled from these regions. Second, an improved weight-constrained minimum spanning tree is proposed to search for initial micro-clusters in the sampled data. Third, a merging indicator is jointly defined by the local density of micro-clusters and the distance between them, and the pairs of micro-clusters that attain the maximum merging indicator are iteratively merged in a bottom-up hierarchical manner to obtain the final cluster structure. In addition, noisy data can be identified by analyzing the characteristics of the minimum spanning tree. Finally, the remaining samples are assigned to their nearest cluster. Extensive experiments were conducted on twenty-four datasets, comparing MSC-WMST with state-of-the-art algorithms. The experimental results demonstrate that MSC-WMST exhibits excellent performance on three evaluation metrics.
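As a rough illustration of the first two steps, the sketch below standardizes the data, builds a Euclidean minimum spanning tree, and splits it into micro-clusters. It is not the MSC-WMST implementation: the abstract does not define the weight constraint, so the sketch assumes it simply means discarding tree edges heavier than a threshold. The function names and the `max_weight` parameter are hypothetical, and the interval-based sampling, the merging indicator, and noise detection are omitted.

```python
import math


def standardize(points):
    """Z-score each feature dimension; constant dimensions map to 0."""
    n, d = len(points), len(points[0])
    means = [sum(p[j] for p in points) / n for j in range(d)]
    stds = [math.sqrt(sum((p[j] - means[j]) ** 2 for p in points) / n)
            for j in range(d)]
    return [[(p[j] - means[j]) / stds[j] if stds[j] > 0 else 0.0
             for j in range(d)] for p in points]


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns a list of (i, j, weight) edges, one per non-root point.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge weight into the tree
    parent = [-1] * n       # tree endpoint of that cheapest edge
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                w = math.dist(points[u], points[v])
                if w < best[v]:
                    best[v], parent[v] = w, u
    return edges


def micro_clusters(points, max_weight):
    """Assumed weight constraint: keep only MST edges with weight <= max_weight.

    Returns one component label per point (connected components of the
    pruned tree, tracked with a small union-find structure).
    """
    root = list(range(len(points)))

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]  # path halving
            i = root[i]
        return i

    for i, j, w in minimum_spanning_tree(points):
        if w <= max_weight:
            root[find(i)] = find(j)
    return [find(i) for i in range(len(points))]


# Two well-separated groups of three points each (toy data, not from the paper).
data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
        [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
labels = micro_clusters(standardize(data), max_weight=0.5)
```

On this toy input the single long tree edge between the two groups exceeds the threshold, so each group ends up with its own label. Prim's algorithm on the complete graph costs O(n^2) time, which is one reason to run it on a sample rather than on the full dataset.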