Contextualizing biological perturbation experiments through language

Published: 11 Oct 2024, Last Modified: 12 Nov 2024, NeurIPS 2024 Workshop FM4Science Poster, CC BY 4.0
Keywords: large language models, Perturb-seq, perturbation experiments, knowledge graphs, retrieval-augmented generation, chain of thought prompting
TL;DR: We propose Perturb-seq predictions as a novel set of real-world tasks for large language models and provide a proof-of-concept method with favorable performance.
Abstract: High-content genetic perturbation experiments provide insights into biomolecular pathways at unprecedented resolution, yet experimental and analysis costs pose barriers to their widespread adoption. _In silico_ modeling of unseen perturbations has the potential to alleviate this burden by leveraging prior knowledge to enable more efficient exploration of the perturbation space. However, current knowledge-graph approaches reduce the relevant biology to simple adjacency graphs and neglect its semantic richness. To enable holistic modeling, we hypothesize that natural language is an appropriate medium for interrogating experimental outcomes and representing biological relationships. We propose PerturbQA as a set of real-world tasks for benchmarking large language model (LLM) reasoning over structured biological data. PerturbQA comprises three tasks: prediction of differential expression for unseen perturbations, prediction of direction of change for unseen perturbations, and gene set enrichment. As a proof of concept, we present SUMMER (SUMMarize, retrievE, and answeR), a simple LLM-based framework that matches or exceeds the current state of the art on this benchmark. We evaluate graph-based and language-based models on the differential expression and direction of change tasks and find that SUMMER performs best overall. Notably, SUMMER's outputs, unlike those of models that rely solely on knowledge graphs, are easily interpretable by domain experts, which aids in understanding model limitations and contextualizing experimental outcomes. Additionally, SUMMER excels in gene set enrichment, surpassing over-representation analysis baselines in most cases and effectively summarizing clusters that lack manual annotation.
Supplementary Material: zip
Submission Number: 82
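
Note on the over-representation analysis baselines mentioned in the abstract: the abstract does not say which variant is used, so the following is only a minimal sketch of the textbook form, a one-sided hypergeometric test for whether a gene cluster overlaps an annotated gene set more than expected by chance. The function name and all parameter names are hypothetical and are not taken from the paper or its supplementary material.

```python
from math import comb


def ora_p_value(n_background: int, n_annotated: int, n_query: int, n_overlap: int) -> float:
    """One-sided hypergeometric p-value, P(X >= n_overlap).

    Assumed setup (illustrative, not the paper's implementation):
      n_background: number of genes in the background universe
      n_annotated:  number of background genes in the annotated gene set
      n_query:      number of genes in the cluster being tested
      n_overlap:    number of cluster genes that are also in the gene set
    """
    # Number of ways to draw the query cluster from the background.
    total = comb(n_background, n_query)
    # The overlap cannot exceed either the gene set size or the cluster size.
    upper = min(n_annotated, n_query)
    # Sum the probability mass for overlaps at least as large as observed.
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers chosen for illustration only.
    print(ora_p_value(n_background=10, n_annotated=5, n_query=5, n_overlap=5))
```

A small p-value indicates that the cluster is enriched for the annotated set. In practice such p-values are usually corrected for multiple testing across gene sets, which this sketch omits.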