Keywords: Protein Design, Large Language Models (LLMs), Multi-Agent Systems, AlphaFold2, Generative Biology, Autonomous AI, Synthetic Biology, De Novo Sequence Generation, AI-Driven Scientific Discovery, Agent Mode
TL;DR: Autonomous LLM agents cooperatively generate and prioritize synthetic protein sequences, yielding candidates with moderate predicted structural confidence while requiring minimal human input.
Abstract: We present a modular, multi-agent generative framework for de novo protein sequence design and prioritization, developed and executed primarily by autonomous AI agents. The system uses cooperative large language models (LLMs) to synthesize amino acid segments in parallel, with each agent responsible for a subsequence. A downstream aggregation and refinement stage produces complete sequences, which are then filtered and ranked using interpretable biophysical heuristics. We generate 100 proteins using this workflow and evaluate their plausibility through property distributions, unsupervised clustering, and AlphaFold2-based structural prediction. Despite operating without evolutionary templates or functional labels, several top-ranked candidates display moderate structural confidence (mean pLDDT > 60, pDockQ > 0.5), suggesting that LLMs encode useful compositional priors. Our results support the use of agentic LLM architectures, paired with lightweight scoring and minimal human intervention, as a scalable strategy for upstream protein design pipelines.
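The generate, aggregate, and rank flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `segment_agent` is a hypothetical stand-in that samples residues at random where the real system would call an LLM, the refinement stage is omitted, and the two heuristics (hydrophobic fraction and net charge) together with the target value 0.4 are illustrative choices that the abstract does not specify.

```python
import random
from concurrent.futures import ThreadPoolExecutor

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
HYDROPHOBIC = set("AILMFVWY")


def segment_agent(agent_id: int, length: int, seed: int) -> str:
    """Hypothetical stand-in for one LLM agent; returns one subsequence.

    Assumption: a real agent would prompt a language model here. Random
    sampling is used only so the sketch runs without external services.
    """
    rng = random.Random(seed * 1000 + agent_id)
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def generate_sequence(n_agents: int, segment_len: int, seed: int) -> str:
    """Run the agents in parallel and concatenate their segments in order."""
    with ThreadPoolExecutor(max_workers=n_agents) as pool:
        segments = list(
            pool.map(lambda i: segment_agent(i, segment_len, seed), range(n_agents))
        )
    return "".join(segments)


def hydrophobic_fraction(seq: str) -> float:
    return sum(aa in HYDROPHOBIC for aa in seq) / len(seq)


def net_charge(seq: str) -> int:
    """Crude net charge: count of K and R minus count of D and E."""
    return seq.count("K") + seq.count("R") - seq.count("D") - seq.count("E")


def heuristic_score(seq: str) -> float:
    """Illustrative score; higher is better.

    Assumption: penalize distance from a hydrophobic fraction of 0.4 and
    penalize absolute net charge per residue. Both terms are placeholders.
    """
    return -abs(hydrophobic_fraction(seq) - 0.4) - abs(net_charge(seq)) / len(seq)


def design(n_candidates: int = 100, n_agents: int = 4,
           segment_len: int = 25, top_k: int = 5) -> list[str]:
    """Generate candidates, then return the top_k ranked by heuristic score."""
    candidates = [
        generate_sequence(n_agents, segment_len, seed) for seed in range(n_candidates)
    ]
    return sorted(candidates, key=heuristic_score, reverse=True)[:top_k]


if __name__ == "__main__":
    for seq in design():
        print(f"{heuristic_score(seq):.3f}  {seq}")
```

In a fuller pipeline, the top-ranked sequences from `design` would then be passed to a structure predictor such as AlphaFold2 for the confidence evaluation the abstract reports; that step is outside the scope of this sketch.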
Supplementary Material: zip
Submission Number: 326