Iterative Multi-Objective Policy Optimization for Antibody Sequence Design

19 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: antibody sequence design, multi-objective policy optimization
Abstract: Antibodies are among the most important medicines in use today, yet their development is constrained by costly and labor-intensive affinity maturation. Computational antibody design offers a scalable alternative, but faces two central challenges: the lack of reliable affinity labels and the need to balance binding affinity with structural fidelity (self-consistency of refolded sequences to the original backbone). In this work, we formulate antibody sequence design as a multi-objective optimization problem and develop an iterative policy optimization framework tailored to this setting. To approximate experimental binding affinity, we construct a surrogate reward by regressing wet-lab $\Delta \Delta G$ measurements against Rosetta-derived interface metrics, including shape complementarity, buried surface area, and interfacial hydrogen bonds. To preserve structural fidelity, we introduce self-consistency RMSD as a complementary objective. Our method performs iterative training with a regression loss derived from the KL-regularized policy optimization objective, enabling stable on-policy learning under expensive structural evaluations and progressively guiding the policy toward Pareto-efficient trade-offs between binding affinity and structural fidelity. Across diverse antigen targets, this approach yields antibody sequences that achieve improved binding affinity while maintaining structural consistency, advancing computational antibody design toward practical therapeutic application.
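The pipeline sketched in the abstract can be illustrated with a small synthetic example: fit a surrogate $\Delta \Delta G$ predictor by ridge regression on Rosetta-style interface metrics, scalarize it with an scRMSD penalty, and regress the policy log-ratio toward the KL-regularized target. The data, feature ranges, the scalarization weight `alpha`, and the exact form of the regression loss below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for Rosetta-derived interface metrics of N designs:
# columns = [shape_complementarity, buried_surface_area, interface_hbonds]
N = 200
X = np.column_stack([
    rng.uniform(0.5, 0.8, N),    # shape complementarity (0-1)
    rng.uniform(800, 2000, N),   # buried surface area (A^2)
    rng.integers(2, 12, N),      # interfacial hydrogen bonds
])

# Synthetic wet-lab ddG labels (kcal/mol); lower = tighter binding.
# Assumed linear dependence plus noise, purely for illustration.
w_true = np.array([-8.0, -0.002, -0.3])
ddg = X @ w_true + 7.0 + rng.normal(0.0, 0.2, N)

# Ridge regression of ddG on standardized features (closed form).
Xs = (X - X.mean(0)) / X.std(0)
A = np.column_stack([np.ones(N), Xs])
lam = 1e-2
w = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ ddg)

def surrogate_reward(x, sc_rmsd, alpha=1.0):
    """Scalarized multi-objective reward: predicted -ddG minus an
    scRMSD penalty; alpha trades affinity against structural fidelity."""
    xs = (x - X.mean(0)) / X.std(0)
    pred_ddg = w[0] + xs @ w[1:]
    return -pred_ddg - alpha * sc_rmsd

def policy_regression_loss(logp, logp_ref, reward, beta=0.1):
    """One common regression form of the KL-regularized objective:
    match the policy/reference log-ratio to the centered reward / beta.
    The paper's exact loss may differ."""
    target = (reward - reward.mean()) / beta
    return np.mean(((logp - logp_ref) - target) ** 2)
```

Larger `alpha` pushes the scalarized reward toward structural fidelity and smaller `alpha` toward affinity, so sweeping it traces out candidate points on the affinity/fidelity Pareto front; the regression loss is minimized exactly when the policy's log-ratio matches the reward-derived target, recovering the KL-regularized optimum.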
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 14980