Abstract: In recent years, unsupervised feature selection (UFS) has attracted widespread attention in high-dimensional data mining tasks. However, how to characterize the latent structural information of unlabeled data samples remains an open and challenging problem. Some existing UFS models explore the manifold structure and hard pseudo-labels in the high-dimensional feature space, but overlook noise and the fuzziness of the data. In this paper, we propose an unsupervised feature selection model based on fuzzy K-Means and sparse projection (FKMSP). Specifically, the model first employs fuzzy K-Means to obtain discriminative pseudo-labels for the data samples, taking into account the fuzzy distance between data samples and cluster centroids. Then, through a regression fitting term on the soft pseudo-labels with an \(l_{2,p}\)-norm constraint \((0<p<1)\), it learns a sparse projection matrix that effectively represents feature importance. A simple yet efficient iterative optimization algorithm is developed to solve the objective function, and its convergence is verified empirically. Extensive experimental results on benchmark datasets demonstrate the effectiveness and superiority of the proposed FKMSP model over other state-of-the-art models.
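The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the FKMSP objective or its optimization algorithm, neither of which is given here. It assumes the textbook fuzzy c-means membership update with a fuzzifier `m` for the soft pseudo-labels, and the common convention of scoring each feature by the \(l_2\) norm of its row in the projection matrix `W`. The function names and the parameters `m`, `p`, and `eps` are hypothetical choices made for illustration.

```python
import numpy as np

def fuzzy_memberships(X, centroids, m=2.0, eps=1e-12):
    """Textbook fuzzy c-means membership update (assumed form, not taken from the paper).

    X: (n, d) samples; centroids: (c, d). Returns an (n, c) matrix whose rows sum to 1.
    """
    # Squared Euclidean distance from every sample to every centroid, shape (n, c).
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2) + eps
    # u_ij is proportional to d_ij^(-2/(m-1)); normalise each row to sum to 1.
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def l2p_norm_p(W, p=0.5):
    """Return sum_i ||w_i||_2^p over the rows w_i of W (the l_{2,p} norm raised to the power p)."""
    return float((np.linalg.norm(W, axis=1) ** p).sum())

def rank_features(W):
    """Rank features by the l2 norm of their row in W, most important first (assumed convention)."""
    return np.argsort(-np.linalg.norm(W, axis=1))

# Toy data: three 2-D samples and two centroids.
X = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]])
centroids = np.array([[0.0, 0.0], [10.0, 0.0]])
U = fuzzy_memberships(X, centroids)   # soft pseudo-labels, one row per sample

# Toy projection matrix: 3 features projected onto 2 clusters.
W = np.array([[3.0, 4.0], [0.0, 0.0], [0.0, 1.0]])
order = rank_features(W)              # feature indices sorted by row norm
penalty = l2p_norm_p(W, p=0.5)        # sparsity penalty value for this W
```

With \(0<p<1\), the row-wise penalty grows slowly for large rows and sharply near zero, which is why it tends to push entire rows of `W` to zero and leave a small set of features with nonzero scores.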