SOAPIA: Specificity-Guided Generation of Off-Target-Avoiding Protein Interactions with High Target Affinity
Keywords: peptide design, discrete flow matching, protein-protein interactions, binder specificity, fusion oncoproteins
TL;DR: SOAPIA generates peptide binders that hit a target protein while avoiding closely related off-targets, using a Siamese specificity predictor to guide multi-objective discrete flow matching.
Abstract: Therapeutic molecules must selectively interact with a target protein while avoiding structurally or functionally similar off-targets, such as alternate isoforms, point mutants, and homologous family members. For example, developing safe therapeutics for fusion-driven cancers requires engaging the fusion oncoprotein, which arises from a chromosomal translocation, while avoiding its homologous parental head and tail proteins. However, no existing generative strategy explicitly optimizes both target affinity and off-target avoidance. To address this, we introduce **SOAPIA**, a framework for the **S**pecificity-guided generation of **O**ff-target-**A**voiding **P**rotein **I**nteractions with high target **A**ffinity. SOAPIA generates *de novo* peptide binders by steering the generative process of a discrete flow matching model, enabling Pareto-efficient exploration of discrete sequence space without gradient access. Affinity is optimized via a pre-trained predictor, while specificity is enforced using SiameseCat, a novel Siamese neural network. SiameseCat identifies isoform-specific, mutant-specific, and fusion oncoprotein-specific interactions at hit rates of up to 43.2%, 28.5%, and 30.7%, respectively, across both protein-protein and peptide-protein interactions. We benchmark SOAPIA on 17 clinically relevant fusion oncoproteins, where it produces binders that preferentially engage the full fusion over its wild-type head and tail counterparts, with predicted nanomolar-range affinity. SOAPIA thus offers a general framework for designing selective biologics across isoform-, mutant-, and fusion-specific targets, with promise for currently untreatable diseases.
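To make "Pareto-efficient" concrete, the following is a minimal sketch of non-dominated filtering over candidate peptides scored on two objectives, affinity and specificity, where higher is better for both. It is an illustration only and not SOAPIA's guidance procedure: the function names `dominates` and `pareto_front`, the placeholder sequences, and the scores are all assumptions made for this example, standing in for the outputs of an affinity predictor and a specificity predictor. Because the filter only compares scores, it needs no gradients from either predictor.

```python
from typing import List, Sequence, Tuple

# (peptide sequence, (affinity score, specificity score)); higher is better.
Scored = Tuple[str, Tuple[float, float]]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every objective
    and strictly better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scored: List[Scored]) -> List[Scored]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        (seq, s)
        for seq, s in scored
        if not any(dominates(t, s) for _, t in scored)
    ]


# Hypothetical candidates with made-up scores (not results from the paper).
candidates: List[Scored] = [
    ("ACDEFGHIKL", (0.90, 0.20)),  # strong predicted affinity, weak specificity
    ("LMNPQRSTVW", (0.70, 0.80)),  # balanced
    ("GGSGGSGGSG", (0.60, 0.70)),  # worse than the balanced candidate on both
    ("WYVTSRQPNM", (0.40, 0.95)),  # weak predicted affinity, strong specificity
]

front = pareto_front(candidates)
print([seq for seq, _ in front])
```

The third candidate is dropped because the second beats it on both objectives; the other three each represent a different trade-off, so none can be discarded without stating a preference between affinity and specificity.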
Submission Number: 147