Abstract: Protein backbone generation plays a central role in de novo protein design and is significant for many biological and medical applications.
Although diffusion and flow-based generative models provide potential solutions to this challenging task, they often generate proteins with low designability and suffer from computational inefficiency.
In this study, we propose a novel rectified quaternion flow (ReQFlow) matching method for fast and high-quality protein backbone generation.
In particular, our method generates a local translation and a 3D rotation from random noise for each residue in a protein chain, representing each 3D rotation as a unit quaternion and constructing its flow by spherical linear interpolation (SLERP) in an exponential format.
We train the model by quaternion flow (QFlow) matching with guaranteed numerical stability and rectify the QFlow model to accelerate its inference and improve the designability of generated protein backbones, leading to the proposed ReQFlow model.
Experiments show that ReQFlow achieves on-par performance in protein backbone generation while requiring far fewer sampling steps and significantly less inference time (e.g., being 37$\times$ faster than RFDiffusion and 63$\times$ faster than Genie2 when generating a backbone of length 300), demonstrating its effectiveness and efficiency.
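As a rough illustration of the SLERP-in-exponential-form idea mentioned in the abstract, the following minimal sketch interpolates between two unit quaternions as $q_t = q_0 \otimes \exp(t \log(q_0^{-1} \otimes q_1))$. It is not taken from the ReQFlow code base: the `(w, x, y, z)` component order, the helper names (`quat_mul`, `quat_log`, `quat_exp`, `slerp_exp`), and the small-angle threshold `eps` are all assumptions made for this example.

```python
import numpy as np

def quat_mul(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z) (assumed order)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return np.array([
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    ])

def quat_conj(q):
    """Conjugate; for a unit quaternion this equals its inverse."""
    return np.array([q[0], -q[1], -q[2], -q[3]])

def quat_log(q, eps=1e-12):
    """Log map of a unit quaternion: returns (half-angle * axis) as a 3-vector."""
    v = q[1:]
    n = np.linalg.norm(v)
    if n < eps:  # near-identity rotation: treat the log as zero
        return np.zeros(3)
    return np.arctan2(n, q[0]) * v / n

def quat_exp(u, eps=1e-12):
    """Exp map: sends a 3-vector (half-angle * axis) back to a unit quaternion."""
    n = np.linalg.norm(u)
    if n < eps:
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate(([np.cos(n)], np.sin(n) * u / n))

def slerp_exp(q0, q1, t):
    """SLERP in exponential form: q_t = q0 * exp(t * log(q0^{-1} * q1))."""
    q0 = np.asarray(q0, dtype=float)
    q1 = np.asarray(q1, dtype=float)
    if np.dot(q0, q1) < 0.0:  # q and -q encode the same rotation; take the shorter arc
        q1 = -q1
    delta = quat_mul(quat_conj(q0), q1)
    return quat_mul(q0, quat_exp(t * quat_log(delta)))

# Example: halfway between the identity and a 90-degree rotation about z
q_id = np.array([1.0, 0.0, 0.0, 0.0])
q_z90 = np.array([np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])
q_half = slerp_exp(q_id, q_z90, 0.5)  # a 45-degree rotation about z
```

In this sketch the interpolant stays on the unit sphere for every `t`, and the angular velocity along the path is constant, which is the property that makes a SLERP path a natural candidate for a rotation flow.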
Lay Summary: Imagine creating a 3D sculpture with small interconnected pieces. This process is similar to designing new proteins. Proteins are essential for life and significant for many biological and medical applications. Generating the skeleton or "backbone" of proteins is a crucial step, but existing computer methods are often slow and can't produce high-quality designs.
Our ReQFlow teaches the computer to design protein backbones more efficiently and accurately, with a clever mathematical approach (based on "quaternions" and a "rectified flow" technique). The results show that ReQFlow is up to 63 times faster than some strong current methods and produces better-quality protein backbones, even very long ones that were previously difficult to design.
Our findings could speed up the discovery of new drugs, help create enzymes for biocatalysis, and unlock the potential in designing advanced biological materials.
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Link To Code: https://github.com/AngxiaoYue/ReQFlow
Primary Area: Applications->Chemistry, Physics, and Earth Sciences
Keywords: protein backbone generation, quaternion representation, flow matching
Flagged For Ethics Review: true
Submission Number: 6342