ML-GUIDED MINING OF AN EXTENSIVELY VALIDATED SCFV LIBRARY FOR OPEN-SOURCE ENZYMES IN DIAGNOSTICS

Published: 02 Mar 2026, Last Modified: 05 Mar 2026, GEM 2026, CC BY 4.0
Keywords: Machine learning, Deep mutational scanning, Protein engineering, Sequence-to-function prediction
TL;DR: Open-source hot-start enzyme regulators: discover scFv inhibitors that reversibly control DNA polymerases with commercial-grade performance, using low-cost assays. Scale via NGS+DMS to train ML models to design inhibitors for new enzymes.
Abstract: Cost and IP barriers limit access to high-performance enzyme start/stop modifiers such as hot-start systems that suppress premature activity during reaction setup. We combine an accessible 10^10 human scFv phage-display library with activity-linked screening to engineer open, recombinant enzyme regulators. Using low-cost fluorescence workflows (in-house dye synthesis with 67-86× cost reduction), we identify scFv inhibitors that convert standard polymerases into hot-start formulations. In head-to-head benchmarking against commercial hot-start enzymes, scFv-regulated polymerases achieve commercial-grade suppression during setup with heat-triggered recovery and robust amplification. To scale beyond individual hits, we outline a data-centric pipeline: NGS-tracked selections yielding more than 10^4 binder/non-binder sequences per target and deep mutational scanning of lead scFvs (5,000-20,000 variants) to map CDR-level inhibitory fitness landscapes for predictive design. We highlight prospective extensions to ligases, restriction enzymes, and CRISPR-Cas systems.
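Illustrative sketch (not from the submission): one common way to turn NGS-tracked selection rounds into binder/non-binder labels is a per-sequence log2 enrichment ratio between pre- and post-selection read counts. The function names, the pseudocount, and the threshold of 0 below are assumptions chosen for illustration; the abstract does not state how the authors score or label sequences.

```python
import math

def log2_enrichment(pre_counts, post_counts, pseudocount=1.0):
    """Return log2(post frequency / pre frequency) for every observed sequence.

    pre_counts / post_counts: dicts mapping sequence -> read count.
    A pseudocount (assumed value) avoids division by zero for unseen sequences.
    """
    seqs = set(pre_counts) | set(post_counts)
    pre_total = sum(pre_counts.values()) + pseudocount * len(seqs)
    post_total = sum(post_counts.values()) + pseudocount * len(seqs)
    scores = {}
    for s in seqs:
        pre_f = (pre_counts.get(s, 0) + pseudocount) / pre_total
        post_f = (post_counts.get(s, 0) + pseudocount) / post_total
        scores[s] = math.log2(post_f / pre_f)
    return scores

def label_binders(scores, threshold=0.0):
    """Label sequences at or above the (hypothetical) enrichment threshold as binders."""
    return {s: ("binder" if v >= threshold else "non-binder")
            for s, v in scores.items()}

# Toy counts with placeholder sequence names, not real data.
pre = {"scFv_A": 10, "scFv_B": 10}
post = {"scFv_A": 30, "scFv_B": 2}
scores = log2_enrichment(pre, post)
labels = label_binders(scores)
```

The same score could in principle serve as a fitness proxy for deep mutational scanning variants, although an activity-linked readout would need its own normalization.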
Presenter: ~Aishwarya_Venkatramani1
Format: Maybe: the presenting author will attend in person, contingent on other factors that still need to be determined (e.g., visa, funding).
Funding: Yes, the presenting author of this submission falls under ICLR’s funding aims, and funding would significantly impact their ability to attend the workshop in person.
Submission Number: 66