Track: Machine learning: computational method and/or computational results
Nature Biotechnology: Yes
Keywords: Siamese neural network, discrete diffusion sampling, protein design
TL;DR: SOAPI integrates a Siamese protein language model with masked diffusion to generate highly specific protein binders while minimizing off-target interactions.
Abstract: Therapeutics that modulate pathogenic proteins while avoiding off-target interactions are essential for effective drug design. However, designing binders that selectively engage a target protein while minimizing interactions with structurally or functionally similar proteins remains a major challenge. To address this, we introduce a Siamese-guided strategy for the generation of Off target-Avoiding Protein Interactions, termed SOAPI. SOAPI leverages a Siamese protein language model with an adaptive Log-Sum-Exp Decoy Loss to enforce specificity by embedding fusion-specific binders close to their target while maintaining separation from off-targets. These optimized embeddings then guide a diffusion protein language model (DPLM), which generates binders using soft-value-based decoding (SVDD) and Sequential Monte Carlo resampling to iteratively refine candidates. In silico validation demonstrates significant off-target avoidance, highlighting SOAPI’s potential for generating precise and selective protein interactions.
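The abstract does not give the form of the Log-Sum-Exp Decoy Loss, so the following is only a minimal sketch of one plausible reading: a smooth maximum over margin violations, where a violation occurs when a decoy (off-target) embedding is not at least `margin` farther from the binder than the target is. The function names, the Euclidean distance, and the `margin` and `tau` parameters are all assumptions made for illustration, and the "adaptive" component mentioned in the abstract is not modeled here.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def lse_decoy_loss(binder, target, decoys, margin=1.0, tau=1.0):
    """Hypothetical log-sum-exp decoy loss (illustrative, not the authors' definition).

    Computes tau * log(1 + sum_k exp((d_pos - d_neg_k + margin) / tau)),
    where d_pos is the binder-target distance and d_neg_k is the
    binder-decoy distance for decoy k.
    """
    d_pos = euclidean(binder, target)
    # One scaled violation term per decoy; large when the decoy is too close.
    terms = [(d_pos - euclidean(binder, d) + margin) / tau for d in decoys]
    terms.append(0.0)  # accounts for the "1 +" inside the logarithm
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(terms)
    return tau * (m + math.log(sum(math.exp(t - m) for t in terms)))


# Binder near its target and far from the decoy: small loss.
selective = lse_decoy_loss([0.0, 0.0], [0.1, 0.0], [[5.0, 0.0]])
# Binder near the decoy and far from its target: large loss.
promiscuous = lse_decoy_loss([0.0, 0.0], [5.0, 0.0], [[0.1, 0.0]])
```

Under these assumptions the loss is always positive, approaches zero as every decoy moves well beyond the margin, and is dominated by the single worst-offending decoy when `tau` is small.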
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Presenter: ~Sophia_Vincoff1
Format: Yes, the presenting author will definitely attend in person because they are attending ICLR for other complementary reasons.
Funding: No, the presenting author of this submission does *not* fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 73