FoldSAE: Learning To Steer Protein Folding Through Sparse Representations
Keywords: Sparse Autoencoders, Diffusion Models, Backbone Design
TL;DR: We train sparse autoencoders to find interpretable features in protein design models.
Abstract: While models like RFdiffusion excel at generating protein backbones, their "black box" nature currently restricts design to a process of stochastic sampling rather than precise engineering. To bridge this gap, we introduce FoldSAE, a framework that adapts Sparse Autoencoders (SAEs) to decompose RFdiffusion’s dense activations into interpretable, monosemantic features. We demonstrate that these unsupervised features capture fundamental physical properties, including secondary structure formation and solvent-accessible surface area (SASA). Leveraging these insights, we implement a steering mechanism that enables targeted modulation of backbone folding and surface exposure during the denoising process. Our work pioneers a new framework for making RFdiffusion more interpretable, demonstrating how understanding internal features can be directly translated into precise control over the protein design process.
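As a rough illustration of the two ideas named in the abstract (decomposing dense activations into sparse features, then steering along one feature), here is a minimal NumPy sketch. It is not the FoldSAE implementation: the ReLU encoder, the additive steering rule, the random weights, the dimensions, and all names (`sae_encode`, `sae_decode`, `steer`, `feature_idx`, `strength`) are assumptions chosen for illustration, and the input is random data standing in for RFdiffusion activations.

```python
import numpy as np

def sae_encode(x, W_enc, b_enc):
    """ReLU encoder: map dense activations to non-negative feature codes."""
    return np.maximum(0.0, x @ W_enc + b_enc)

def sae_decode(f, W_dec, b_dec):
    """Linear decoder: reconstruct dense activations from feature codes."""
    return f @ W_dec + b_dec

def steer(x, W_dec, feature_idx, strength):
    """Shift activations along the unit-norm decoder direction of one feature."""
    direction = W_dec[feature_idx]
    direction = direction / np.linalg.norm(direction)
    return x + strength * direction

# Hypothetical sizes and untrained random weights, for illustration only.
rng = np.random.default_rng(0)
d_model, d_sae = 8, 32
W_enc = rng.normal(size=(d_model, d_sae))
b_enc = np.zeros(d_sae)
W_dec = rng.normal(size=(d_sae, d_model))
b_dec = np.zeros(d_model)

# Stand-in for per-residue activations taken from one layer of the model.
x = rng.normal(size=(5, d_model))

features = sae_encode(x, W_enc, b_enc)        # shape (5, d_sae)
x_hat = sae_decode(features, W_dec, b_dec)    # shape (5, d_model)
x_steered = steer(x, W_dec, feature_idx=3, strength=2.0)
```

In an actual pipeline the steered activations would be written back into the network at each denoising step. Which layer to hook, which features to use, and how strongly to steer are not specified in the abstract.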
Submission Number: 94