Abstract: We consider the spherical k-means problem (SKMP), a generalization of the k-means clustering problem (KMP). Given a data set of n points \({\mathcal {P}}\) in d-dimensional unit sphere \({\mathbb {R}}^d\), and an integer \(k \le n\), it aims to partition the data set \({\mathcal {P}}\) into k sets so as to minimize the sum of cosine dissimilarity measure from each data point to its closest center. We present a constant expected approximation guarantee for this problem based on integrating the k-means++ seeding algorithm for the KMP and the local search technique.
Loading