Advancing Knotted Protein Design with ESM3: Guided Generation and Topological Insights

ICML 2025 Workshop FM4LS Submission 55 Authors

Published: 12 Jul 2025, Last Modified: 12 Jul 2025, Venue: FM4LS 2025, License: CC BY 4.0
Keywords: Protein Design, Knotted Proteins, ESM3, Multi-modal Foundation Models, Guided Generation
TL;DR: Using ESM3, we raise the success rate of knotted protein design from roughly 0.5% to 87% and introduce a continuous knot score for measuring knot robustness, showing that a knot typically stays intact until about 85% of the protein sequence is altered.
Abstract: Knotted proteins represent a rare but functionally important class of proteins with complex topological features. While previous work demonstrated the generation of artificial knotted proteins using diffusion models and sequence design tools, these approaches suffered from low success rates (~0.5%). We present a novel approach leveraging ESM3, a multi-modal protein language model, to achieve guided generation of knotted proteins with an 87% success rate. We introduce a continuous knot score metric that captures the robustness of protein knots, revealing that approximately 85% of a protein sequence must be altered to break its knot. Using ESM3 embeddings, we achieve 93% accuracy in knotted protein classification and demonstrate the ability to convert unknotted proteins to knotted variants through iterative modifications (31% success rate). Our work showcases the power of unified multi-modal models in tackling complex protein design challenges.
Submission Number: 55