AutoFold: Ultra-fast Protein Generation via Autoregressive Contact Graph Generation

18 Sept 2025 (modified: 11 Feb 2026). Submitted to ICLR 2026. License: CC BY 4.0
Keywords: protein design, motif scaffolding, graph generative models, autoregressive models
TL;DR: We propose a graph-based autoregressive model for protein sequence and structure co-generation
Abstract: Generative models for protein design, particularly diffusion and flow matching approaches, are powerful but computationally expensive, with slow sampling times that hinder high-throughput applications. We introduce AutoFold, an ultra-fast autoregressive model that generates proteins via a sparse graph representation of their structure. Instead of generating continuous coordinates directly, AutoFold learns a contact graph of the backbone structure. A Vector Quantized Variational Autoencoder is trained to discretize contacting inter-residue geometric features, creating a graph representation with single edge labels invariant to SE(3) transformations. This representation can be decoded to reconstruct a backbone structure with high fidelity. We then train an autoregressive model to generate these graphs, further incorporating amino acid sequence into node attributes. Our trained model can be seamlessly used for both unconditional generation and motif scaffolding. Our results demonstrate that AutoFold achieves unique co-designability comparable to state-of-the-art methods while accelerating sampling by 2-3 times and generating highly diverse and novel proteins. By shifting generation from continuous coordinates to discrete graphs, AutoFold introduces a novel modeling paradigm for protein design.
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 13069
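
The abstract describes discretizing inter-residue geometric features into single edge labels that are invariant to SE(3) transformations. The toy sketch below illustrates only that invariance property and is not the AutoFold implementation. Everything in it is an assumption made for illustration: the 8 Å contact cutoff, the use of pairwise distance as the only edge feature, the fixed evenly spaced codebook (AutoFold learns its discretization with a Vector Quantized Variational Autoencoder), and the helper names `contact_edges`, `quantize`, and `random_rigid_transform`.

```python
import numpy as np


def contact_edges(coords, cutoff=8.0):
    """Return index arrays (i, j) and distances for pairs closer than `cutoff`.

    The cutoff value is an illustrative assumption, not taken from the paper.
    """
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    upper = np.triu(np.ones(dists.shape, dtype=bool), k=1)  # keep i < j only
    i, j = np.where((dists < cutoff) & upper)
    return i, j, dists[i, j]


def quantize(features, codebook):
    """Label each scalar feature with the index of its nearest codebook entry."""
    return np.argmin(np.abs(features[:, None] - codebook[None, :]), axis=1)


def random_rigid_transform(coords, rng):
    """Apply a random proper rotation and translation (an SE(3) element)."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:  # flip one axis so the matrix is a rotation
        q[:, 0] = -q[:, 0]
    return coords @ q.T + rng.normal(size=3)


rng = np.random.default_rng(0)
coords = rng.normal(scale=6.0, size=(30, 3))  # stand-in for backbone positions
codebook = np.linspace(0.0, 8.0, 16)          # fixed stand-in for a learned codebook

i, j, d = contact_edges(coords)
labels = quantize(d, codebook)

# Recompute the graph after rigidly moving the whole structure.
moved = random_rigid_transform(coords, rng)
i2, j2, d2 = contact_edges(moved)
labels2 = quantize(d2, codebook)

print("edges:", len(labels))
print("same edges:", np.array_equal(i, i2) and np.array_equal(j, j2))
print("same labels:", np.array_equal(labels, labels2))
```

Because pairwise distances do not change under rotation or translation, both the edge set and the discrete labels are the same before and after the transform. A label built from richer relative geometry would need the same property to serve as an SE(3)-invariant edge attribute.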