TL;DR: A state-of-the-art all-atom protein generative model.
Abstract: We introduce Pallatom, an innovative protein generation model capable of producing protein structures with all-atom coordinates. Pallatom directly learns and models the joint distribution $P(\textit{structure}, \textit{seq})$ by focusing on $P(\textit{all-atom})$, effectively addressing the interdependence between sequence and structure in protein generation. To achieve this, we propose a novel network architecture specifically designed for all-atom protein generation. Our model employs a dual-track framework that tokenizes proteins into token-level and atomic-level representations, integrating them through a multi-layer decoding process with "traversing" representations and a recycling mechanism. We also introduce the $\texttt{atom14}$ representation method, which unifies the description of unknown side-chain coordinates, ensuring high fidelity between the generated all-atom conformation and its physical structure. Experimental results demonstrate that Pallatom excels in key metrics of protein design, including designability, diversity, and novelty, showing significant improvements across the board. Our model not only enhances the accuracy of protein generation but also exhibits excellent sampling efficiency, paving the way for future applications in larger and more complex systems.
Lay Summary: Pallatom introduces a novel protein generation model that directly learns the joint distribution of all-atom coordinates P(all-atom), integrating sequence and structure co-design. Key innovations include the atom14 representation, which standardizes side-chain atoms via virtual placements, and a dual-track network with residue/atomic tokenization and recycling mechanisms. Experiments show Pallatom outperforms existing methods in designability, structural diversity, and novelty, while maintaining high sampling efficiency. By modeling atomic coordinates holistically, Pallatom eliminates separate sequence/structure steps, enabling accurate and diverse protein generation.
Link To Code: https://github.com/levinthal/Pallatom
Primary Area: Applications->Health / Medicine
Keywords: Proteins, Generative models, Co-design, All-atom
Submission Number: 4192