Track: Main track
Keywords: LLM, PERTURBATION PREDICTION
Abstract: Predicting gene perturbation effects in unseen contexts is essential for understanding regulatory networks and identifying therapeutic targets. Current methods face a trade-off: Graph Neural Networks are limited by incomplete databases, while LLM-based methods confuse textual co-occurrence with true regulatory relationships.
We introduce CausalPert, a framework that uses LLM consensus to infer directed regulatory relationships, constructing a latent GRN that guides both prediction and experimental design. CausalPert makes two key changes to existing semantic baselines for LLM-based perturbation prediction. (1) For predicting unseen perturbation effects, instead of asking an LLM to find "similar genes," it prompts the LLM to identify upstream regulators of a target gene, runs this query three times independently, and keeps only the candidates that appear consistently. (2) For selecting which genes to experimentally perturb first, it asks the LLM to nominate the genes most likely to control many regulatory targets, then ranks them by agreement across runs.
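The two consensus steps above can be sketched as a simple vote over independent LLM runs. This is a minimal illustration, not the authors' implementation: the gene lists below are hypothetical placeholders for what three independent LLM queries might return.

```python
from collections import Counter

def consensus_regulators(runs, min_agreement=3):
    """Step (1): keep only candidates nominated in at least `min_agreement` runs."""
    counts = Counter(g for run in runs for g in set(run))
    return sorted(g for g, c in counts.items() if c >= min_agreement)

def rank_by_agreement(runs):
    """Step (2): rank nominated genes by how many runs proposed them."""
    counts = Counter(g for run in runs for g in set(run))
    return [g for g, _ in counts.most_common()]

# Hypothetical outputs of three independent LLM queries for upstream
# regulators of a target gene (illustrative names only).
runs = [
    ["GATA1", "TAL1", "KLF1"],
    ["GATA1", "TAL1", "MYB"],
    ["GATA1", "TAL1", "KLF1"],
]
print(consensus_regulators(runs))   # genes appearing in all three runs
print(rank_by_agreement(runs)[:2])  # top-2 candidates by cross-run agreement
```

Requiring agreement across runs filters out candidates that surface from textual co-occurrence in a single sample, which is the failure mode the abstract attributes to prior LLM-based methods.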
For perturbation prediction (1), our method improves correlation by 10.5% over semantic baselines in few-shot regimes (N = 50). For experimental design (2), selecting just 50 anchors via LLM consensus in K562 outperforms network centrality heuristics by up to 46%.
AI Policy Confirmation: I confirm that this submission clearly discloses the role of AI systems and human contributors and complies with the ICLR 2026 Policies on Large Language Model Usage and the ICLR Code of Ethics.
Submission Number: 109