Improving Inverse Folding for Peptide Design with Diversity-regularized Direct Preference Optimization
Keywords: Generative biology, preference optimization, inverse folding
TL;DR: Inverse folding via diverse preference optimization
Abstract: Inverse folding models play an important role in structure-based design by predicting amino acid sequences that fold into desired reference structures. Models like ProteinMPNN, a message-passing encoder-decoder model, are trained to reliably produce new sequences from a reference structure. However, when applied to peptides, these models are prone to generating repetitive sequences that do not fold into the reference structure. To address this, we fine-tune ProteinMPNN to produce diverse and structurally consistent peptide sequences via Direct Preference Optimization (DPO). We derive two enhancements to DPO: online diversity regularization and domain-specific priors. Additionally, we develop a new understanding of how to improve diversity in decoder models. When conditioned on OpenFold-generated structures, our fine-tuned models achieve state-of-the-art structural similarity scores, improving over base ProteinMPNN by at least 8%. Compared to standard DPO, our regularized method achieves up to 20% higher sequence diversity with no loss in structural similarity score.
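For context, a minimal sketch of the objective being fine-tuned: the standard DPO loss contrasts a preferred sequence y_w against a dispreferred sequence y_l for the same reference structure x, relative to the frozen base model \pi_{\mathrm{ref}} (here, base ProteinMPNN). The diversity term \lambda\,\mathcal{R}_{\mathrm{div}} added below is an illustrative placeholder for the paper's online diversity regularization; its exact form is not given in the abstract.

\mathcal{L}_{\mathrm{DPO}}(\theta) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\Big[\log \sigma\Big(\beta \log \tfrac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \tfrac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\Big)\Big],
\qquad
\mathcal{L}(\theta) = \mathcal{L}_{\mathrm{DPO}}(\theta) + \lambda\,\mathcal{R}_{\mathrm{div}}\big(\{y \sim \pi_\theta(\cdot \mid x)\}\big),

where \mathcal{R}_{\mathrm{div}} would be computed online over sequences sampled from the current policy (for example, a pairwise sequence-similarity penalty), encouraging the decoder to spread probability mass across distinct sequences that remain consistent with the reference structure x.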
Submission Number: 70