Abstract: We formulate a probabilistic clustering method based on a sequence of random swaps of cluster centroids. We show that the algorithm has a linear dependency on the number of data vectors, a quadratic dependency on the number of clusters, and an inverse dependency on the dimensionality. Each halving of the probability of failure (e.g. from 1% to 0.5%) is achieved at the cost of only a linear increase in the processing time.
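To make the idea of swap-based clustering concrete, the following is a minimal sketch of one common reading of a random swap scheme, not the authors' implementation. It assumes a sum-of-squared-errors objective, two k-means fine-tuning steps after each trial swap, and acceptance of a swap only when the cost decreases. The function names (`random_swap`, `kmeans_steps`) and the parameters `iterations` and `seed` are illustrative choices that do not come from the abstract.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(data, centroids):
    """Index of the nearest centroid for every data vector."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in data]


def update(data, labels, centroids):
    """Recompute each centroid as the mean of its cluster.

    An empty cluster keeps its previous centroid (an assumption of this sketch).
    """
    dim = len(data[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for p, j in zip(data, labels):
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    return [tuple(s / counts[j] for s in sums[j]) if counts[j] else centroids[j]
            for j in range(len(centroids))]


def cost(data, labels, centroids):
    """Sum of squared errors of the current partition."""
    return sum(sq_dist(p, centroids[j]) for p, j in zip(data, labels))


def kmeans_steps(data, centroids, steps=2):
    """Run a few k-means iterations to fine-tune a candidate solution."""
    for _ in range(steps):
        labels = assign(data, centroids)
        centroids = update(data, labels, centroids)
    return centroids, assign(data, centroids)


def random_swap(data, k, iterations=200, seed=0):
    """Trial-and-error clustering: swap one centroid, fine-tune, keep if better."""
    rng = random.Random(seed)
    centroids = [tuple(p) for p in rng.sample(data, k)]
    centroids, labels = kmeans_steps(data, centroids)
    best = cost(data, labels, centroids)
    for _ in range(iterations):
        trial = list(centroids)
        # Replace a randomly chosen centroid with a randomly chosen data vector.
        trial[rng.randrange(k)] = tuple(rng.choice(data))
        trial, trial_labels = kmeans_steps(data, trial)
        trial_cost = cost(data, trial_labels, trial)
        if trial_cost < best:  # accept only improving swaps
            centroids, labels, best = trial, trial_labels, trial_cost
    return centroids, labels, best
```

Because a swap is kept only when it lowers the cost, the returned cost can never exceed that of the initial solution. In this sketch `iterations` is the knob that trades processing time against the chance of missing a good solution.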