RosettaSearch: Multi-Objective Inference-Time Search for Protein Sequence Design

Meghana Kshirsagar; Allen Nie; Ching-An Cheng; Fanglei Xue; Rahul M Dodhia; Juan M Lavista Ferres; Kevin K Yang; Frank DiMaio

30 Apr 2026 (modified: 28 May 2026). Submitted to ICML 2026 FM4LS Workshop. License: CC BY 4.0

Keywords: protein design, backbone conditioned protein sequence design, inference time search, inference time optimization, protein sequence optimization

TL;DR: LLMs can serve as effective inference-time optimizers for protein sequence design, achieving a 2.5× improvement in design success rate over state-of-the-art single-pass methods within a fixed computational budget, without any model retraining.

Abstract: We introduce RosettaSearch, an inference-time multi-objective optimization framework for backbone-conditioned protein sequence design that deploys large language models (LLMs) as generative optimizers. Within a strictly bounded budget of 75 structure prediction calls per design task, RosettaSearch runs a priority-based parallel search that explores multiple candidate trajectories simultaneously, using structured rewards and residue-level feedback from RosettaFold3 to guide controlled exploration and exploitation of the sequence space. In a large-scale evaluation on $\approx 400$ protein redesign tasks, RosettaSearch achieves 18--68% improvements in structural fidelity metrics over sequences generated by LigandMPNN, translating to a $2.5\times$ improvement in design success rate. Gains are robust under an independent structure predictor (Chai-1) and generalize across two LLM families (o4-mini and Gemini-3), with performance scaling consistently with reasoning capability. We further extend the framework to vision-language models, where rendered images of predicted structures provide spatial feedback that produces richer structural reasoning in the model's chain-of-thought. To our knowledge, this is the first large-scale demonstration that LLMs can serve as effective generative optimizers for backbone-conditioned protein sequence design, yielding systematic gains without any model retraining.
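The abstract describes a priority-based search that expands several candidates at once under a hard cap of 75 structure prediction calls. The following is a minimal sketch of that general pattern only, not the authors' implementation: the function names, the `beam` parameter, the toy reward (fraction of positions matching a hidden target string), and the random single-residue proposer are all hypothetical stand-ins. In the paper, the proposer role is played by an LLM and the reward comes from a structure predictor; neither is called here.

```python
import heapq
import random
from typing import Callable, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def budgeted_priority_search(
    seeds: List[str],
    propose: Callable[[str, random.Random], str],
    score: Callable[[str], float],
    budget: int = 75,   # cap on scorer calls (75 in the abstract)
    beam: int = 4,      # hypothetical: candidates expanded per round
    seed: int = 0,
) -> Tuple[str, float, int]:
    """Best-first search that never exceeds `budget` calls to `score`."""
    rng = random.Random(seed)
    calls = 0
    frontier: list = []  # max-heap emulated by negating the score
    best_seq, best_score = "", float("-inf")

    def evaluate(seq: str) -> None:
        nonlocal calls, best_seq, best_score
        calls += 1
        s = score(seq)
        # `calls` is unique per entry, so ties never compare sequences.
        heapq.heappush(frontier, (-s, calls, seq))
        if s > best_score:
            best_seq, best_score = seq, s

    # Score the starting sequences first; they count against the budget.
    for seq in seeds:
        if calls >= budget:
            break
        evaluate(seq)

    while calls < budget and frontier:
        # Take the current top-`beam` candidates and expand each once.
        n = min(beam, len(frontier))
        parents = [heapq.heappop(frontier) for _ in range(n)]
        for _neg_score, _tie, parent in parents:
            if calls >= budget:
                break
            evaluate(propose(parent, rng))
        # Parents stay in the frontier so strong ones can be expanded again.
        for item in parents:
            heapq.heappush(frontier, item)

    return best_seq, best_score, calls


def toy_score(target: str) -> Callable[[str], float]:
    """Hypothetical reward: fraction of positions identical to `target`."""
    def score(seq: str) -> float:
        return sum(a == b for a, b in zip(seq, target)) / len(target)
    return score


def toy_propose(seq: str, rng: random.Random) -> str:
    """Hypothetical proposer: mutate one randomly chosen position."""
    i = rng.randrange(len(seq))
    return seq[:i] + rng.choice(AMINO_ACIDS) + seq[i + 1:]


target = "MKTAYIAKQR"            # made-up 10-residue target
seeds = ["A" * 10, "G" * 10]     # made-up starting sequences
best_seq, best_score, calls = budgeted_priority_search(
    seeds, toy_propose, toy_score(target), budget=75
)
print(best_seq, round(best_score, 2), calls)
```

Counting calls inside `evaluate` keeps the budget check in one place, so the cap holds regardless of how many seeds or parents are in play. The expansions within a round are written sequentially here; since they do not depend on one another, they could be dispatched in parallel.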

Email Sharing: We authorize the sharing of all author emails with Program Chairs.

Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.

Submission Number: 20
