ResistIA: Reasoning-Guided Agentic Evaluation of Synthetic Metal-Resistance Genes from Conditional Genomic Foundation Models
Track: long paper (up to 10 pages)
Keywords: Genomic foundation models, Agentic evaluation, Metal-resistance gene generation, Tool-augmented scientific reasoning
TL;DR: ResistIA is an agentic, validation-aware framework that refines synthetic metal-resistance genes from a LoRA-tuned genomic model using DIAMOND homology, ORF/physicochemical checks, and ChatNT evidence.
Abstract: Genomic foundation models can generate plausible coding DNA, but turning raw generations into scientifically useful candidates requires reliable post hoc evaluation and robust model-selection criteria. We present ResistIA, a reasoning-guided agentic evaluation and refinement framework for synthetic metal-resistance genes generated from a LoRA-adapted GenomeOcean-500M model.
ResistIA combines three complementary evaluators---(i) DIAMOND blastx protein homology, (ii) ORF/translation/physicochemical checks, and (iii) ChatNT-based semantic/regulatory Q\&A with embedding similarity---under an auditable decision policy that selects tools based on runtime availability and evidence needs. We study three progressively stronger versions of the framework: V1 (baseline multi-tool evaluation), V2 (closed-loop reweighting over generated batches), and V3 (validation-aware closed-loop optimization with score smoothing and early stopping).
Across representative and multi-seed benchmarks, V3 improves the reliability of iteration selection and shows stable validation behavior (best validation composite score $0.2505\pm0.0054$, 95\% CI: $[0.2458,0.2553]$) with a high ORF success rate ($0.9662\pm0.0058$, 95\% CI: $[0.9611,0.9713]$). The validation DIAMOND pass rate remains lower and more variable ($0.2217\pm0.0459$, 95\% CI: $[0.1815,0.2619]$), highlighting homology robustness as the main source of run-to-run variation. In paired 10-seed fixed-vs-adaptive comparisons, we do not observe a robust advantage for the adaptive variant, motivating a conservative reasoning-agent framing.
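As a reading aid for the reported intervals, the sketch below recomputes each 95\% CI from its mean and standard deviation. It assumes a normal-approximation interval ($\text{mean} \pm 1.96\,\text{sd}/\sqrt{n}$) with $n=5$ runs; neither the interval method nor the run count is stated in the abstract, so both are assumptions, chosen only because they reproduce the reported bounds to within rounding. The dictionary keys are hypothetical labels.

```python
import math

def normal_ci(mean, sd, n, z=1.96):
    """Normal-approximation confidence interval for a sample mean."""
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

# (mean, sd, reported 95% CI) copied from the abstract.
reported = {
    "val_composite": (0.2505, 0.0054, (0.2458, 0.2553)),
    "orf_success": (0.9662, 0.0058, (0.9611, 0.9713)),
    "diamond_pass": (0.2217, 0.0459, (0.1815, 0.2619)),
}

N_RUNS = 5  # ASSUMPTION: not stated in the abstract

# Largest absolute gap between recomputed and reported bounds, per metric.
max_gap = {}
for name, (mean, sd, (rep_lo, rep_hi)) in reported.items():
    lo, hi = normal_ci(mean, sd, N_RUNS)
    max_gap[name] = max(abs(lo - rep_lo), abs(hi - rep_hi))
    print(f"{name}: recomputed [{lo:.4f}, {hi:.4f}], reported [{rep_lo}, {rep_hi}]")
```

Under these assumptions every recomputed bound lands within about $10^{-4}$ of the reported one. The DIAMOND interval is roughly eight times wider than the other two, which matches the abstract's reading of homology as the main source of run-to-run variation.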
These results position ResistIA as a compact, reproducible testbed for reasoning over heterogeneous biological evidence in agentic scientific workflows, and a practical framework for evaluator-guided refinement of genomic sequence generators.
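To make the V3 idea of validation-aware selection with score smoothing and early stopping concrete, the following is a minimal sketch, not the authors' implementation. It assumes an exponential moving average as the smoother and a patience-based stopping rule; the function name `select_iteration` and the parameters `alpha`, `patience`, and `min_delta` are hypothetical, and the abstract does not specify the smoothing method or the stopping criterion.

```python
def select_iteration(val_scores, alpha=0.5, patience=2, min_delta=0.0):
    """Pick the refinement iteration with the best smoothed validation score.

    val_scores: per-iteration validation composite scores (higher is better).
    alpha:      EMA weight on the newest score (ASSUMED smoother).
    patience:   stop after this many consecutive non-improving iterations.
    Returns (best_index, best_smoothed_score, last_index_evaluated).
    """
    best_idx, best = -1, float("-inf")
    stale, ema, last = 0, None, -1
    for i, score in enumerate(val_scores):
        last = i
        # Smooth the raw score so a single noisy batch cannot win selection.
        ema = score if ema is None else alpha * score + (1 - alpha) * ema
        if ema > best + min_delta:
            best_idx, best, stale = i, ema, 0
        else:
            stale += 1
            if stale >= patience:
                break  # early stop: no smoothed improvement for `patience` steps
    return best_idx, best, last

# Illustrative scores only (not results from the paper).
scores = [0.20, 0.24, 0.26, 0.22, 0.20, 0.30]
best_idx, best_score, stopped_at = select_iteration(scores)
print(best_idx, round(best_score, 4), stopped_at)
```

Selecting on the smoothed rather than the raw score trades some responsiveness for stability: with these illustrative inputs the loop stops before the final spike at index 5, which is the kind of conservative behavior a validation-aware policy aims for.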
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Funding: Yes, the presenting author of this submission falls under ICLR’s funding aims, and funding would significantly impact their ability to attend the workshop in person.
Submission Number: 126