Keywords: Protein Structure, Generative Modeling
Abstract: Current protein generative models often prioritize computational efficiency over fidelity to biological mechanisms, leading to artifacts, such as mode collapse into helical structures, that are difficult to diagnose and correct. We hypothesize that generative processes more closely aligned with authentic biological pathways can produce more diverse and less biased outputs. To this end, we propose a generative model that combines internal coordinate parameterization with a novel trans-dimensional diffusion process inspired by ribosomal protein synthesis and co-translational folding. The model incrementally elongates the polypeptide chain while allowing nascent residues to fold, enabling early segments to explore diverse substructures and later segments to condition on partially folded contexts. In addition, our model supports flexible, length-independent generation: protein size emerges dynamically during sampling, which removes the bias introduced by prespecified lengths. Empirically, without finetuning, our approach achieves better in-distribution coverage and secondary structure balance than state-of-the-art baselines.
Submission Number: 110