FoldSAE: Learning To Steer Protein Folding Through Sparse Representations
Keywords: Sparse Autoencoders, Diffusion models, backbone design
TL;DR: We train sparse autoencoders to find interpretable features in protein design models
Abstract: While models like RFdiffusion excel at generating protein backbones, their "black box" nature currently restricts design to a process of stochastic sampling rather than precise engineering. To bridge this gap, we introduce FoldSAE, a framework that adapts Sparse Autoencoders (SAEs) to decompose RFdiffusion’s dense activations into interpretable, monosemantic features. We demonstrate that these unsupervised features capture fundamental physical properties, including secondary structure formation and solvent-accessible surface area (SASA). Leveraging these insights, we implement a steering mechanism that enables targeted modulation of backbone folding and surface exposure during the denoising process. Our work pioneers a new framework for making RFdiffusion more interpretable, demonstrating how understanding internal features can be directly translated into precise control over the protein design process.
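Illustrative sketch (not the authors' code): the abstract describes two steps, decomposing dense activations into sparse features and steering generation along a chosen feature. The snippet below shows one common way these steps are done with a ReLU sparse autoencoder, under assumptions that are not taken from the submission: the weights are random stand-ins for a trained SAE, the sizes `D_MODEL` and `D_SAE` are hypothetical, and `x` stands in for activations that would be read from some RFdiffusion layer. Steering is modeled as adding a scaled decoder direction to the activation, which is a widely used approach and may differ from what FoldSAE actually does.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: activation width and (overcomplete) SAE dictionary size.
D_MODEL, D_SAE = 8, 32

# Random stand-ins for trained SAE parameters (assumption, not FoldSAE weights).
W_enc = rng.normal(scale=0.1, size=(D_MODEL, D_SAE))
b_enc = np.zeros(D_SAE)
W_dec = rng.normal(scale=0.1, size=(D_SAE, D_MODEL))
W_dec /= np.linalg.norm(W_dec, axis=1, keepdims=True)  # unit-norm feature directions
b_dec = np.zeros(D_MODEL)


def sae_encode(x):
    """Map dense activations to non-negative feature activations (ReLU encoder)."""
    return np.maximum(0.0, (x - b_dec) @ W_enc + b_enc)


def sae_decode(f):
    """Reconstruct dense activations from feature activations."""
    return f @ W_dec + b_dec


def steer(x, feature_idx, alpha):
    """Shift activations along one feature's decoder direction by strength alpha."""
    return x + alpha * W_dec[feature_idx]


# Stand-in for per-residue activations from a model layer (5 residues).
x = rng.normal(size=(5, D_MODEL))
features = sae_encode(x)
reconstruction = sae_decode(features)
x_steered = steer(x, feature_idx=3, alpha=2.0)
```

In a real pipeline, `steer` would be applied inside a forward hook at each denoising step, with `feature_idx` chosen from features found to correlate with a property such as secondary structure or SASA; both the hook point and the choice of `alpha` are left open here.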
Presenter: ~Wojciech_Zarzecki1
Format: Maybe: the presenting author will attend in person, contingent on other factors that still need to be determined (e.g., visa, funding).
Funding: Yes, the presenting author of this submission falls under ICLR’s funding aims, and funding would significantly impact their ability to attend the workshop in person.
Submission Number: 94