AI AGENT FOR DATA-DRIVEN HYPOTHESIS EXPLORATION IN SINGLE-CELL TRANSCRIPTOMICS

Artemy Bakulin; Pierre Boyeau; Nir Yosef

Published: 05 Mar 2025, Last Modified: 05 Mar 2025

Venue: MLGenX 2025 Spotlight

License: CC BY 4.0

Track: Special track (up to 8 pages)

Abstract: Large Language Models (LLMs) can utilize expert knowledge and simulate human thinking, which potentially makes them instrumental for a variety of scientific tasks. However, scientific data is heterogeneous and often presented as unordered tables, so bridging the gap between unstructured non-textual data and the language processing capabilities of LLMs remains an open challenge. Agentic AI offers a promising approach by enabling LLMs to interactively query datasets for relevant information. Here, we explore the application of this agentic paradigm to single-cell transcriptomic analysis, with a specific focus on cell type annotation. Our results show that when LLMs are equipped with data-querying capabilities, their performance in annotating cell types improves significantly compared to single-shot prompting. Furthermore, we provide a proof-of-concept illustration of how our method can be used to integrate diverse single-cell datasets (e.g., cell census), ensuring consistent annotation across multiple sources and facilitating meta-analysis across large sample cohorts.

Submission Number: 61
