Keywords: protein editing, disentanglement
Abstract: We introduce DisProtEdit, a controllable protein editing framework that learns disentangled structural and functional representations through dual-channel natural language supervision. Unlike prior models that encode a protein in a single joint embedding, DisProtEdit separates structural and functional semantics, which enables modular and interpretable control. We construct SwissProtDis, a large multimodal dataset that pairs protein sequences with LLM-decomposed structural and functional descriptions. DisProtEdit trains protein and text embeddings with alignment and uniformity objectives, and adds a disentanglement loss that promotes independence between the structural and functional semantics. Editing is performed by modifying one or both text inputs and decoding the updated latent representation. Experiments show that DisProtEdit matches prior methods in accuracy while offering greater interpretability and control. On a new multi-attribute editing benchmark, it achieves up to 61.7% both-hit success, validating its effectiveness in simultaneous structure-function editing.
Submission Number: 16
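The alignment and uniformity objectives named in the abstract can be illustrated with a minimal NumPy sketch. It assumes the common formulation on L2-normalized embeddings (alignment as the mean distance between paired embeddings, uniformity as the log of the mean Gaussian potential over all pairs); the abstract does not give DisProtEdit's exact losses, weights, or hyperparameters, so the function names, `alpha`, `t`, and especially `disentanglement_penalty` are hypothetical stand-ins rather than the authors' implementation.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Project each row onto the unit hypersphere."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def alignment_loss(protein, text, alpha=2.0):
    """Mean distance (raised to alpha) between paired protein/text embeddings."""
    return float(np.mean(np.linalg.norm(protein - text, axis=1) ** alpha))


def uniformity_loss(z, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs of rows."""
    sq_dists = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
    upper = np.triu_indices(z.shape[0], k=1)  # each unordered pair once
    return float(np.log(np.mean(np.exp(-t * sq_dists[upper]))))


def disentanglement_penalty(z_struct, z_func):
    """ASSUMPTION: a placeholder penalty, not the paper's loss.

    Mean squared cosine similarity between the structural and functional
    embeddings of the same protein (inputs are expected to be unit-norm).
    """
    cos = np.sum(z_struct * z_func, axis=1)
    return float(np.mean(cos ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random stand-ins for encoder outputs: 16 proteins, 8-dim embeddings.
    prot_struct = l2_normalize(rng.normal(size=(16, 8)))
    prot_func = l2_normalize(rng.normal(size=(16, 8)))
    text_struct = l2_normalize(prot_struct + 0.1 * rng.normal(size=(16, 8)))

    print("alignment:", alignment_loss(prot_struct, text_struct))
    print("uniformity:", uniformity_loss(prot_struct))
    print("disentanglement:", disentanglement_penalty(prot_struct, prot_func))
```

In this sketch, lower alignment means each protein embedding sits closer to its paired text embedding, and lower (more negative) uniformity means the embeddings are spread more evenly over the hypersphere. How the three terms are combined and weighted is not stated in the abstract.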