Keywords: Manifold clustering, large scale clustering
TL;DR: We provide a model for manifold clustering with theoretical guarantees
Abstract: Manifold clustering is an important problem in motion and video segmentation, natural image clustering, and other applications where high-dimensional data lie on multiple, low-dimensional, nonlinear manifolds. While current state-of-the-art methods on large-scale datasets such as CIFAR provide good empirical performance, they do not have any proof of theoretical correctness. In this work, we propose a method that clusters data belonging to a union of nonlinear manifolds. Furthermore, for a given input data sample $y$ belonging to the $l$th manifold $\mathcal{M}_l$, we provide geometric conditions that guarantee a manifold-preserving representation of $y$ can be recovered from the solution to the proposed model. The geometric conditions require that (i) $\mathcal{M}_l$ is well-sampled in the neighborhood of $y$, with the sampling density given as a function of the curvature, and (ii) $\mathcal{M}_l$ is sufficiently separated from the other manifolds. In addition to providing proof of correctness in this setting, a numerical comparison with state-of-the-art methods on CIFAR datasets shows that our method performs competitively although marginally worse than methods without
Supplementary Material: zip
Primary Area: Machine learning for other sciences and fields
Submission Number: 17809
Loading