Abstract: The average human body contains more than 10 trillion cells, and each cell's transcriptome can be profiled by single-cell RNA sequencing (scRNA-seq). The sheer volume and complexity of scRNA-seq data pose challenges for two fundamental analysis tasks: cell type identification and new cell type discovery. In this paper, we demonstrate scRAG, which efficiently removes batch effects in cell type identification and enables reliable discovery of new cell types, facilitated by GPU-based scRNA-seq data management and Large Language Models (LLMs). The GPU-based data management enables high-throughput retrieval and update of scRNA-seq data, while the LLM uses the retrieval results to remove batch effects and discover novel cells. We demonstrate scRAG's (a) interfaces, (b) GPU-based scRNA-seq data management, and (c) applications in batch effect removal and cancer cell discovery.
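The retrieve-then-annotate idea described in the abstract can be illustrated with a minimal sketch. This is not scRAG's implementation: the brute-force CPU cosine search merely stands in for the GPU-based retrieval, and a majority vote with a similarity cutoff stands in for the LLM step. All names (`retrieve`, `annotate`, `novelty_threshold`) and the toy three-gene expression vectors are assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two expression vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query, reference, k=3):
    """Return the k reference cells most similar to the query.

    Brute-force search; a hypothetical stand-in for GPU-based retrieval.
    """
    scored = [(cosine(query, vec), label) for vec, label in reference]
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored[:k]


def annotate(query, reference, k=3, novelty_threshold=0.8):
    """Label a query cell from its retrieved neighbours.

    Assumption: a majority vote replaces the LLM reasoning step, and a
    low best-match similarity flags a candidate novel cell type.
    """
    hits = retrieve(query, reference, k)
    if not hits or hits[0][0] < novelty_threshold:
        return "candidate novel cell type", hits
    label = Counter(lbl for _, lbl in hits).most_common(1)[0][0]
    return label, hits


# Toy reference set: (expression vector over three hypothetical genes, label).
reference = [
    ([9.0, 1.0, 0.0], "T cell"),
    ([8.0, 2.0, 0.0], "T cell"),
    ([0.0, 1.0, 9.0], "B cell"),
    ([1.0, 0.0, 8.0], "B cell"),
]

known_label, _ = annotate([8.5, 1.5, 0.0], reference)
novel_label, _ = annotate([0.0, 9.0, 0.0], reference)
```

In this sketch a query close to annotated reference cells inherits their label, while a query that resembles nothing in the reference set is flagged for follow-up rather than forced into an existing type.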
External IDs: dblp:conf/icde/MaoZLLYG25