Effectively Clustering Single Cell RNA Sequencing Data by Sparse Representation

Published: 2022, Last Modified: 15 Jan 2026 | IEEE ACM Trans. Comput. Biol. Bioinform. 2022 | CC BY-SA 4.0
Abstract: Clustering analysis is widely used on single-cell RNA-sequencing (scRNA-seq) data to study biological problems at the cellular level. Although a number of scRNA-seq clustering methods have been developed, most of them evaluate the similarity of cell pairs while ignoring the global relationships among cells, and as a result they sometimes fail to capture the latent structure of the cells. In this paper, we propose SPARC, a new clustering method for scRNA-seq data. Its most important feature is a novel similarity metric that measures the relationships among cells using the sparse representation coefficients of each cell in terms of the other cells. In addition, we develop an outlier detection method to help with parameter selection in SPARC. We compare SPARC with nine existing scRNA-seq clustering methods on twelve real datasets. Experimental results show that SPARC achieves state-of-the-art performance. By further analyzing the cell similarity data derived from sparse representations, we find that the SPARC similarity metric is much more effective at mining high-quality clusters from scRNA-seq data than two traditional similarity metrics. In conclusion, this study provides a new way to cluster scRNA-seq data effectively and achieves more accurate clustering results than state-of-the-art methods.
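The idea of measuring similarity through sparse representation coefficients can be illustrated with a small sketch. The abstract does not give the optimization problem, the solver, or the rule for turning coefficients into similarities, so everything below is an assumption made for illustration: each cell is coded against the other cells with an L1-penalized least-squares (lasso) fit solved by cyclic coordinate descent, and the similarity is the symmetrized absolute coefficient matrix. The names `sparse_code`, `sparse_similarity`, and the penalty `lam` are hypothetical, and this is not the SPARC implementation. It is written in plain Python to stay self-contained.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def soft_threshold(value, lam):
    """Shrink value toward zero by lam (proximal operator of the L1 norm)."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def sparse_code(target, atoms, lam=0.1, n_iter=200):
    """Assumed formulation: minimize 0.5*||target - sum_j c_j*atoms[j]||^2 + lam*||c||_1
    by cyclic coordinate descent. Returns the coefficient list c."""
    coef = [0.0] * len(atoms)
    residual = list(target)  # target minus the current reconstruction
    norms = [dot(a, a) for a in atoms]
    for _ in range(n_iter):
        for j, atom in enumerate(atoms):
            if norms[j] == 0.0:
                continue
            # Correlation of atom j with the residual that excludes atom j itself.
            rho = dot(atom, residual) + coef[j] * norms[j]
            new_value = soft_threshold(rho, lam) / norms[j]
            delta = new_value - coef[j]
            if delta != 0.0:
                residual = [r - delta * a for r, a in zip(residual, atom)]
                coef[j] = new_value
    return coef


def sparse_similarity(cells, lam=0.1):
    """Code every cell against all other cells, then symmetrize |coefficients|.
    The symmetrization rule is an illustrative choice, not taken from the paper."""
    n = len(cells)
    coef = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        c = sparse_code(cells[i], [cells[j] for j in others], lam)
        for j, value in zip(others, c):
            coef[i][j] = abs(value)
    return [[0.5 * (coef[i][j] + coef[j][i]) for j in range(n)] for i in range(n)]


# Toy expression matrix (made-up values): 4 cells x 3 genes, two obvious groups.
cells = [
    [2.0, 0.1, 0.0],
    [1.9, 0.0, 0.1],
    [0.0, 0.1, 2.0],
    [0.1, 0.0, 1.8],
]
similarity = sparse_similarity(cells)
for row in similarity:
    print([round(value, 3) for value in row])
```

Because each cell is reconstructed from all other cells at once, the coefficients reflect which cells jointly explain it, rather than a score computed independently for each pair. A matrix like `similarity` could then be passed to any clustering method that accepts a precomputed affinity. The outlier detection step used for parameter selection is not covered by this sketch.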