Symmetric Dual-Path Integration for Protein Inverse Folding

01 Sept 2025 (modified: 15 Nov 2025), ICLR 2026 Conference Withdrawn Submission, CC BY 4.0
Keywords: Protein Inverse Folding, Multimodal Protein Language Models, Protein Language Models
TL;DR: DualFold introduces a symmetric dual-path architecture that combines protein language models and multimodal protein language models to achieve state-of-the-art performance in protein inverse folding.
Abstract: Protein inverse folding aims to recover amino acid sequences for a given 3D protein structure, underpinning broad applications such as enzyme engineering and drug discovery. Current methods often follow a serial pipeline in which a structure encoder predicts a coarse sequence that is then refined by protein language models (PLMs). However, because PLMs only perform post-hoc sequence edits, the refinement is bounded by the quality of the upstream predictions. Recent multimodal protein language models (MPLMs) can directly encode structure and generate sequences with pretrained structural knowledge, but we observe that they are not effective for inverse folding. Therefore, we introduce a harmonic dual-path architecture that leverages both PLMs for pretrained sequence knowledge and MPLMs for pretrained structural knowledge to iteratively guide protein sequence generation. Through extensive experiments across standard protein inverse folding benchmarks, our method surpasses prior approaches and achieves state-of-the-art performance, and ablation studies validate the rationale of our symmetric design, revealing a promising direction for the community.
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 455