DisProtEdit: Exploring Disentangled Representations for Multi-Attribute Protein Editing

ICML 2025 Workshop FM4LS Submission 16 Authors

Published: 12 Jul 2025, Last Modified: 12 Jul 2025, FM4LS 2025, CC BY 4.0

Keywords: protein editing, disentanglement

Abstract: We introduce DisProtEdit, a controllable protein editing framework that learns disentangled structural and functional representations via dual-channel natural language supervision. Unlike prior models with joint holistic embeddings, DisProtEdit separates semantics for modular and interpretable control. We construct SwissProtDis, a large multimodal dataset with protein sequences paired with LLM-decomposed structural and functional descriptions. DisProtEdit aligns protein and text embeddings via alignment and uniformity objectives, with a disentanglement loss promoting semantic independence. Editing is performed by modifying one or both text inputs and decoding the updated latent representation. Experiments show that DisProtEdit matches prior methods in accuracy while offering greater interpretability and control. On a new multi-attribute editing benchmark, it achieves up to 61.7% both-hit success, validating its effectiveness in simultaneous structure-function editing.
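
The alignment and uniformity objectives named in the abstract can be illustrated with a minimal sketch. It assumes one common formulation from the contrastive learning literature: alignment as the mean squared distance between paired, L2-normalized embeddings, and uniformity as the log of the mean Gaussian potential over distinct pairs. The abstract does not state the exact form DisProtEdit uses, so the function names, the temperature `t=2.0`, and the toy vectors are all hypothetical. The disentanglement loss is not sketched because the abstract does not describe its form.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so it lies on the hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def alignment_loss(protein_embs, text_embs):
    """Mean squared distance between paired protein and text embeddings.

    Assumed form; lower values mean matched pairs sit closer together.
    """
    pairs = list(zip(protein_embs, text_embs))
    return sum(sq_dist(p, t) for p, t in pairs) / len(pairs)

def uniformity_loss(embs, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs.

    Assumed form; lower values mean embeddings are spread more evenly.
    Requires at least two embeddings.
    """
    n = len(embs)
    potentials = [
        math.exp(-t * sq_dist(embs[i], embs[j]))
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return math.log(sum(potentials) / len(potentials))

# Toy 2-D embeddings (hypothetical values, not taken from the paper).
protein = [l2_normalize(v) for v in [[1.0, 0.1], [0.0, 1.0], [-1.0, 0.2]]]
text = [l2_normalize(v) for v in [[1.0, 0.0], [0.1, 1.0], [-1.0, 0.0]]]

total = alignment_loss(protein, text) + 0.5 * (
    uniformity_loss(protein) + uniformity_loss(text)
)
print(f"combined objective: {total:.4f}")
```

In a dual-channel setup like the one described, such a combined objective would presumably be computed once for the structural text channel and once for the functional one, but that pairing is an assumption here.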

Submission Number: 16
