AutoFold: Ultra-fast Protein Generation via Autoregressive Contact Graph Generation

ICLR 2026 Conference Submission 13069 Authors

18 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: protein design, motif scaffolding, graph generative models, autoregressive models
TL;DR: We propose a graph-based autoregressive model for protein sequence and structure co-generation
Abstract: Generative models for protein design, particularly diffusion and flow matching approaches, are powerful but computationally expensive, with slow sampling times that hinder high-throughput applications. We introduce AutoFold, an ultra-fast autoregressive model that generates proteins via a sparse graph representation of their structure. Instead of generating continuous coordinates directly, AutoFold learns a contact graph of the backbone structure. A Vector Quantized Variational Autoencoder is trained to discretize contacting inter-residue geometric features, creating a graph representation with single edge labels invariant to SE(3) transformations. This representation can be decoded to reconstruct a backbone structure with high fidelity. We then train an autoregressive model to generate these graphs, further incorporating amino acid sequence into node attributes. Our trained model can be seamlessly used for both unconditional generation and motif scaffolding. Our results demonstrate that AutoFold achieves performance comparable to state-of-the-art methods while accelerating sampling by over an order of magnitude. By shifting generation from continuous coordinates to discrete graphs, AutoFold opens the door to high-throughput, large-scale protein design applications.
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 13069
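
The abstract's central idea, a sparse contact graph whose discrete edge labels do not change under rigid motions, can be illustrated with a small toy. The sketch below is not the authors' implementation. It assumes a single scalar feature per edge (the inter-residue distance), a hand-picked four-entry codebook in place of a learned VQ-VAE codebook, an 8 Å contact cutoff, and a synthetic helix-like trace; the function names are hypothetical.

```python
import math

def contact_graph(coords, cutoff=8.0):
    """Return {(i, j): distance} for residue pairs closer than cutoff (i < j)."""
    edges = {}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d < cutoff:
                edges[(i, j)] = d
    return edges

def quantize(value, codebook):
    """Index of the nearest codebook entry (the lookup step of vector quantization)."""
    return min(range(len(codebook)), key=lambda k: abs(value - codebook[k]))

def label_edges(coords, codebook, cutoff=8.0):
    """Build the contact graph and replace each edge feature by a discrete label."""
    return {e: quantize(d, codebook) for e, d in contact_graph(coords, cutoff).items()}

def rigid_transform(coords, angle, shift):
    """Rotate about the z-axis by angle (radians), then translate by shift."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in coords]

# Assumption: a synthetic helix-like trace of 12 points, not real protein data.
coords = [(2.3 * math.cos(math.radians(100 * i)),
           2.3 * math.sin(math.radians(100 * i)),
           1.5 * i) for i in range(12)]

# Assumption: fixed scalar codebook standing in for a learned one.
codebook = [3.8, 5.2, 6.2, 7.5]

labels = label_edges(coords, codebook)
moved = label_edges(rigid_transform(coords, 0.7, (10.0, -4.0, 2.5)), codebook)

print("labels unchanged after rotation + translation:", labels == moved)
print(len(labels), "edges out of", 12 * 11 // 2, "possible pairs")
```

Because the labels depend only on pairwise distances, any rotation or translation of the input leaves the labeled graph unchanged, and the cutoff keeps the graph sparse relative to the full set of residue pairs. Note that distances alone are also unchanged under reflection, so a feature set meant to be invariant to SE(3) specifically would need additional orientation-aware terms that this toy omits.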