Keywords: protein inverse folding, fractal structure modeling, protein design, protein language models
TL;DR: FractalFold exploits the fractal properties of proteins for hierarchical inverse folding, achieving state-of-the-art sequence recovery rate and perplexity while generating sequences with improved foldability.
Abstract: Inverse protein folding aims to design amino acid sequences that fold into desired backbone structures, representing a long-standing challenge in computational protein design. While recent deep learning approaches have achieved significant progress, existing methods predominantly treat protein structures as flattened sequences, overlooking their inherent hierarchical and fractal organization. To address this limitation, we propose FractalFold, a novel transformer-based model that performs structure-informed inverse folding by recursively invoking multi-level atomic fractal transformers. FractalFold employs a coarse-to-fine sequence refinement paradigm that mirrors the intrinsic hierarchical nature of protein structures. To generalize our approach to quasi-fractal proteins with variable-length structural segments, we introduce the Hierarchical Fractal Segmentation Module (HFSM), which leverages attention patterns from pre-trained protein language models to recursively partition protein structures into tree-organized patches. Extensive experiments on the CATH benchmarks demonstrate that FractalFold achieves state-of-the-art performance in sequence recovery rate and perplexity while generating sequences with enhanced foldability, establishing a new paradigm for structure-informed protein design.
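For readers unfamiliar with the two headline metrics, the following is a minimal sketch of how sequence recovery rate and perplexity are commonly computed in inverse folding evaluations. It is not taken from the FractalFold code; the function names `sequence_recovery` and `perplexity` and the per-residue log-probability input are assumptions made for illustration.

```python
import math
from typing import Sequence


def sequence_recovery(native: str, designed: str) -> float:
    """Fraction of positions where the designed residue equals the native one."""
    if len(native) != len(designed):
        raise ValueError("native and designed sequences must have equal length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)


def perplexity(native_log_probs: Sequence[float]) -> float:
    """Exponential of the mean negative log-likelihood of the native residues.

    `native_log_probs[i]` is assumed to be the natural-log probability that
    a model assigns to the native amino acid at position i.
    """
    if not native_log_probs:
        raise ValueError("at least one log-probability is required")
    mean_nll = -sum(native_log_probs) / len(native_log_probs)
    return math.exp(mean_nll)


if __name__ == "__main__":
    # Hypothetical toy inputs, not results from the paper.
    print(sequence_recovery("ACDE", "ACDF"))
    print(perplexity([math.log(0.5)] * 4))
```

Higher recovery and lower perplexity both indicate closer agreement with the native sequence. Foldability is usually assessed separately, by predicting the structure of the designed sequence and comparing it with the target backbone.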
Supplementary Material: zip
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 13885