Finding Stable Clustering for Noisy Data via Structure-Aware Representation

2019 (modified: 07 Sept 2021), IEEE BigData 2019, Readers: Everyone
Abstract: Clustering is one of the most prominent topics in machine learning. A multitude of clustering methods have been proposed, among which spectral clustering has attracted much attention. However, in practice, spectral clustering is highly sensitive to noisy data, and a post-processing step (e.g., k-means on the eigenvectors) is often required to obtain clustering indicators, which may not be optimal. Also, it does not scale well to large-scale data because of its eigen-decomposition step. Here we propose a structure-aware clustering model to address those issues. To achieve our goal, a high-quality affinity matrix is extracted from the original noisy data by a sparse additive decomposition, which is used to approximate the ideal clustering structure. We then jointly learn the high-quality affinity matrix and the spectral embedding in a unified model, which makes the method robust to noise and yields the optimal clustering indicators without any post-processing steps. We further improve the clustering stability by considering the Laplacian eigengap of the affinity matrix. We show that the larger the Laplacian eigengap, the more stable the clustering results. We introduce a speedup strategy to effectively compute eigenvectors of large matrices. Experimental results demonstrate that the proposed model outperforms existing approaches for noisy data.
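The eigengap claim in the abstract can be illustrated with a small numerical sketch. The abstract does not say which Laplacian the authors use, so the code below assumes the symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2} and takes the eigengap for k clusters to be the difference between the (k+1)-th and k-th smallest eigenvalues. The helper names `laplacian_eigengap` and `block_affinity` are hypothetical and are not from the paper. This is not the proposed model, only a toy comparison of a clean block-structured affinity matrix against one with uniform cross-cluster noise.

```python
import numpy as np


def laplacian_eigengap(affinity, k):
    """Eigengap lambda_{k+1} - lambda_k of the symmetric normalized Laplacian.

    Assumption: L = I - D^{-1/2} W D^{-1/2}; the paper may define it differently.
    """
    A = np.asarray(affinity, dtype=float)
    W = (A + A.T) / 2.0              # symmetrize (creates a new array)
    np.fill_diagonal(W, 0.0)         # drop self-loops
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))  # guard isolated nodes
    L = np.eye(W.shape[0]) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals = np.linalg.eigvalsh(L)  # returned in ascending order
    return eigvals[k] - eigvals[k - 1]


def block_affinity(sizes, cross=0.0):
    """Toy affinity: 1.0 inside each cluster, `cross` between clusters."""
    n = sum(sizes)
    A = np.full((n, n), cross, dtype=float)
    start = 0
    for s in sizes:
        A[start:start + s, start:start + s] = 1.0
        start += s
    return A


# Two clusters of five points each: ideal structure vs. noisy structure.
clean_gap = laplacian_eigengap(block_affinity([5, 5], cross=0.0), k=2)
noisy_gap = laplacian_eigengap(block_affinity([5, 5], cross=0.3), k=2)
print(f"clean eigengap: {clean_gap:.4f}")
print(f"noisy eigengap: {noisy_gap:.4f}")
```

In this toy setting the clean affinity gives an eigengap of about 1.25 and the noisy one about 0.64, which is consistent with the abstract's intuition that noise in the affinity matrix shrinks the gap. For large matrices, a dense `eigvalsh` call would be replaced by an iterative solver that computes only the smallest few eigenvalues; the abstract does not describe the authors' speedup strategy, so none is sketched here.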