PlantPhenoLM: Phenotype-Genotype Mapping Inference with Multi-Turn LLM Reasoning and Selective Prediction
Keywords: Phenotype-genotype mapping, Large Language Models, High-throughput phenotyping, Retrieval-augmented generation, Selective prediction, Evidence-grounded reasoning
Abstract: Accurately predicting plant genotypes from high-throughput phenotypic measurements has great potential to accelerate breeding workflows. However, practical deployment requires more than predictions: practitioners need calibrated confidence, evidence-based explanations, and safe abstention when the phenotypic evidence is ambiguous.
We introduce PlantPhenoLM, a novel algorithm that wraps a standard phenotype classifier with (i) retrieval-based evidence from phenotypically similar plants and (ii) a Large Language Model (LLM)-based reasoning layer.
PlantPhenoLM applies an explicit selective prediction policy based on an evidence-fusion score, which makes its outputs more reliable and interpretable. Across cross-validation folds (aggregated $n{=}42$ held-out plants), PlantPhenoLM achieves strong top-$k$ recovery (top-5 $\approx 0.95$ across modes) and modest gains in top-1 accuracy, supporting the efficacy of the approach.
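As a rough illustration of what an evidence-fusion score with selective prediction could look like, the sketch below blends classifier probabilities with label frequencies among retrieved neighbors and abstains when the top fused score falls below a threshold. It is not the PlantPhenoLM implementation: the function names, the linear blend with weight `alpha`, the threshold value, and the genotype labels are all assumptions made for this example, and the LLM reasoning layer is omitted.

```python
from collections import Counter


def fuse_evidence(classifier_probs, neighbor_labels, alpha=0.5):
    """Blend classifier probabilities with retrieved-neighbor label frequencies.

    Hypothetical fusion rule: a convex combination weighted by `alpha`.
    """
    counts = Counter(neighbor_labels)
    total = len(neighbor_labels)
    fused = {}
    for label in set(classifier_probs) | set(counts):
        p_clf = classifier_probs.get(label, 0.0)
        p_ret = counts[label] / total if total else 0.0
        fused[label] = alpha * p_clf + (1 - alpha) * p_ret
    return fused


def selective_predict(fused, threshold=0.6, k=5):
    """Return (decision, top_k); decision is None when the policy abstains."""
    ranked = sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
    top_label, top_score = ranked[0]
    decision = top_label if top_score >= threshold else None
    top_k = [label for label, _ in ranked[:k]]
    return decision, top_k


# Confident case: classifier and neighbors agree on the same genotype.
confident = fuse_evidence(
    {"G1": 0.7, "G2": 0.2, "G3": 0.1}, ["G1", "G1", "G2", "G1"]
)
confident_decision, confident_top_k = selective_predict(confident)

# Ambiguous case: evidence is split, so the policy abstains
# but still returns a ranked candidate list.
ambiguous = fuse_evidence(
    {"G1": 0.4, "G2": 0.35, "G3": 0.25}, ["G2", "G1", "G3", "G2"]
)
ambiguous_decision, ambiguous_top_k = selective_predict(ambiguous)
```

Returning the ranked top-$k$ list even on abstention reflects the abstract's emphasis on top-$k$ recovery: a practitioner can still inspect the candidates when no single prediction is emitted.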
Submission Number: 112