RosettaSearch: Multi-Objective Inference-Time Search for Protein Sequence Design
Keywords: protein design, backbone conditioned protein sequence design, inference time search, inference time optimization, protein sequence optimization
TL;DR: LLMs can serve as effective inference-time optimizers for protein sequence design, achieving a 2.5× improvement in design success rate over state-of-the-art single-pass methods within a fixed computational budget and without any model retraining.
Abstract: We introduce RosettaSearch, an inference-time multi-objective optimization framework for backbone-conditioned protein sequence design that deploys large language models (LLMs) as generative optimizers. Within a strictly bounded budget of 75 structure prediction calls per design task, RosettaSearch runs a priority-based parallel search that explores multiple candidate trajectories simultaneously, using structured rewards and residue-level feedback from RoseTTAFold3 to guide controlled exploration and exploitation of the sequence space.
In a large-scale evaluation on $\approx 400$ protein redesign tasks, RosettaSearch achieves improvements of 18% to 68% in structural fidelity metrics over sequences generated by LigandMPNN, translating to a $2.5\times$ improvement in design success rate. Gains are robust under an independent structure predictor (Chai-1) and generalize across two LLM families (o4-mini and Gemini-3), with performance scaling consistently with reasoning capability. We further extend the framework to vision-language models, where rendered images of predicted structures provide spatial feedback that produces richer structural reasoning in the model's chain-of-thought. To our knowledge, this is the first large-scale demonstration that LLMs can serve as effective generative optimizers for backbone-conditioned protein sequence design, yielding systematic gains without any model retraining.
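As a rough illustration of the budgeted, priority-based search described in the abstract, the sketch below keeps a priority queue of candidate sequences, expands several of the best ones per round, and stops after a fixed number of scoring calls. It is not the authors' implementation: `toy_score` is a hypothetical stand-in for a structure-prediction reward (collapsed here to a single scalar rather than multiple objectives), `toy_propose` is a hypothetical stand-in for the LLM proposal step, and the `beam` parameter and hidden target sequence are assumptions made only so the example runs on its own.

```python
import heapq
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_score(seq, target):
    """Hypothetical reward: fraction of positions matching a hidden target.

    Stands in for a structure-prediction call; the real reward is not this.
    """
    return sum(a == b for a, b in zip(seq, target)) / len(target)


def toy_propose(seq, rng):
    """Hypothetical proposal step: mutate one random position.

    Stands in for an LLM that rewrites a sequence given feedback.
    """
    i = rng.randrange(len(seq))
    return seq[:i] + rng.choice(AMINO_ACIDS) + seq[i + 1:]


def budgeted_priority_search(seed, score, propose, budget=75, beam=4, rng_seed=0):
    """Expand the top `beam` candidates per round until `budget` scoring calls are used."""
    rng = random.Random(rng_seed)
    best_seq, best_score = seed, score(seed)
    calls = 1  # scoring the seed counts against the budget
    frontier = [(-best_score, seed)]  # max-priority queue via negated score

    while calls < budget:
        # Take the highest-priority candidates as parallel trajectories.
        n_parents = min(beam, len(frontier))
        parents = [heapq.heappop(frontier) for _ in range(n_parents)]
        for neg_score, parent in parents:
            # Parents stay in the frontier so they can be revisited.
            heapq.heappush(frontier, (neg_score, parent))
            if calls >= budget:
                continue
            child = propose(parent, rng)
            child_score = score(child)
            calls += 1
            heapq.heappush(frontier, (-child_score, child))
            if child_score > best_score:
                best_seq, best_score = child, child_score

    return best_seq, best_score, calls


if __name__ == "__main__":
    target = "MKTAYIAKQR"  # arbitrary toy target, not a real design task
    seq, s, used = budgeted_priority_search(
        "AAAAAAAAAA", lambda x: toy_score(x, target), toy_propose
    )
    print(seq, s, used)
```

Keeping parents in the queue lets a strong candidate be expanded more than once (exploitation), while expanding several candidates per round keeps weaker trajectories alive (exploration); how the actual system balances the two is not specified in the abstract.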
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 20