Track: Main track
Keywords: Spatial transcriptomics, zero-shot annotation, training-free inference, structured prediction, prototype-based methods, large language model verification, constrained reasoning, spatial graphs, neighborhood context, cross-platform generalization, interpretability
TL;DR: A training-free spatial transcriptomics annotator that uses biologically grounded prototypes and selective, ontology-constrained LLM verification to achieve robust, interpretable labeling across platforms
Abstract: Spatial transcriptomics enables the analysis of cellular organization by measuring gene expression in situ, but assigning coherent spatial region labels remains challenging across platforms due to heterogeneous resolution, incomplete marker panels, and ambiguous boundaries. Existing approaches typically rely on supervised training, dataset-specific tuning, or deep graph models, which can oversmooth structure, generalize poorly across technologies, and offer limited interpretability. We introduce NicheAgent, a training-free structured prediction framework that casts spatial annotation as a constrained decision problem with selective language-based verification. NicheAgent first performs deterministic prototype-based assignment using curated region prototypes ("nichecards") encoding canonical marker genes and expression centroids. Only for low-confidence cases, a lightweight large language model (LLM) is invoked as a closed-world verifier, arbitrating among a fixed set of candidate labels using marker semantics and local neighborhood context under a strict ontology. A single round of spatial smoothing enforces local coherence without blurring anatomical boundaries.
Across Visium, MERFISH, and STARmap datasets, NicheAgent consistently outperforms supervised, graph-based, and prior LLM-driven methods on standard spatial annotation metrics, while remaining transparent and interpretable. More broadly, our results highlight a general design pattern in which LLMs act as constrained adjudicators over symbolic hypotheses, improving structured prediction in high-ambiguity settings without end-to-end learning or loss of interpretability.
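The three stages described in the abstract can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: the function names, the cosine-similarity prototype matching, the top-2 margin as the confidence score, and the threshold `tau` are all assumptions made for this sketch; the closed-world verifier is stubbed out where an ontology-constrained LLM call would go.

```python
# Hypothetical sketch of a NicheAgent-style pipeline (assumed design, not the paper's code).
import numpy as np

def assign_with_margin(expr, centroids):
    """Stage 1: deterministic prototype assignment.
    Cosine similarity of each spot's expression to each nichecard centroid;
    the gap between the top two similarities serves as a confidence margin."""
    e = expr / (np.linalg.norm(expr, axis=1, keepdims=True) + 1e-9)
    c = centroids / (np.linalg.norm(centroids, axis=1, keepdims=True) + 1e-9)
    sims = e @ c.T                                  # (n_spots, n_regions)
    order = np.argsort(sims, axis=1)
    labels = order[:, -1]
    rows = np.arange(len(sims))
    margin = sims[rows, order[:, -1]] - sims[rows, order[:, -2]]
    return labels, margin

def verify_low_confidence(labels, margin, tau=0.05, verifier=None):
    """Stage 2: selective verification. Only spots with margin below tau are
    passed to a closed-world verifier (e.g. an LLM restricted to the candidate
    label set under a fixed ontology); all others keep their prototype label."""
    out = labels.copy()
    for i in np.where(margin < tau)[0]:
        if verifier is not None:
            out[i] = verifier(i, labels[i])         # must return an in-ontology label
    return out

def smooth_once(labels, neighbors):
    """Stage 3: a single round of majority-vote smoothing over each spot's
    spatial neighborhood, enforcing local coherence."""
    out = labels.copy()
    for i, nbrs in enumerate(neighbors):
        votes = np.bincount(np.append(labels[nbrs], labels[i]))
        out[i] = votes.argmax()
    return out
```

A single smoothing pass, rather than iterated message passing, is one plausible way to realize the abstract's claim of enforcing coherence without oversmoothing anatomical boundaries.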
AI Policy Confirmation: I confirm that this submission clearly discloses the role of AI systems and human contributors and complies with the ICLR 2026 Policies on Large Language Model Usage and the ICLR Code of Ethics.
Submission Number: 84