Mechanisms of AI Protein Folding in ESMFold

Published: 02 Mar 2026, Last Modified: 26 May 2026 | GEM 2026 | Everyone | CC BY 4.0
Keywords: protein folding, mechanistic interpretability, activation patching, folding trunk
TL;DR: We trace how ESMFold folds a beta hairpin and find that early trunk blocks transfer biochemical signals into pairwise representations while late blocks develop spatial features; both mechanisms are causally manipulable via steering.
Abstract: How do protein structure prediction models fold proteins? We investigate this question by tracing how ESMFold folds a beta hairpin, a prevalent structural motif. Through counterfactual interventions on model latents, we identify two computational stages in the folding trunk. In the first stage, early blocks initialize pairwise biochemical signals: residue identities and associated biochemical features like charge flow from sequence representations into pairwise representations. In the second stage, late blocks develop pairwise spatial features: distance and contact information accumulate in the pairwise representation. We demonstrate that the mechanisms underlying structural decisions of ESMFold can be localized, traced through interpretable representations, and manipulated with strong causal effects.
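Illustrative sketch (not from the submission): the counterfactual interventions mentioned in the abstract are commonly implemented as activation patching, where a latent from a "clean" run is copied into a "corrupted" run and the change in an output metric is measured per block. The example below is a minimal toy version under stated assumptions. The trunk, the weights, and the `contact_score` readout are all hypothetical stand-ins written in NumPy; none of them are ESMFold's actual architecture or API.

```python
import numpy as np

def make_blocks(n_blocks, d, seed=0):
    # Hypothetical per-block weights; a real trunk is far more complex.
    rng = np.random.default_rng(seed)
    return [rng.normal(scale=0.3, size=(d, d)) for _ in range(n_blocks)]

def run_trunk(s, blocks, patch=None):
    """Run a toy trunk on a sequence representation s of shape (L, d).

    patch: optional (block_index, z_value). After that block, the pairwise
    representation is overwritten with z_value.
    Returns the final pairwise representation and the per-block cache.
    """
    L, d = s.shape
    z = np.zeros((L, L, d))
    cache = []
    for i, W in enumerate(blocks):
        # Sequence -> pair: outer sum of projected per-residue features.
        proj = np.tanh(s @ W)
        z = z + proj[:, None, :] + proj[None, :, :]
        # Pair -> sequence: each residue pools over its row of pairs.
        s = s + 0.1 * z.mean(axis=1)
        if patch is not None and patch[0] == i:
            z = patch[1].copy()
        cache.append(z.copy())
    return z, cache

def contact_score(z):
    # Hypothetical scalar readout: mean of the first pairwise channel.
    return float(z[..., 0].mean())

n_blocks, L, d = 6, 8, 4
blocks = make_blocks(n_blocks, d)
rng = np.random.default_rng(1)
clean_s = rng.normal(size=(L, d))      # stand-in for the original sequence
corrupt_s = rng.normal(size=(L, d))    # stand-in for a counterfactual sequence

clean_z, clean_cache = run_trunk(clean_s, blocks)
corrupt_z, _ = run_trunk(corrupt_s, blocks)
clean_score = contact_score(clean_z)
corrupt_score = contact_score(corrupt_z)

# Patch the clean pairwise representation into the corrupted run, one block
# at a time, and record the fraction of the clean-vs-corrupt gap recovered.
effects = []
for i in range(n_blocks):
    patched_z, _ = run_trunk(corrupt_s, blocks, patch=(i, clean_cache[i]))
    recovered = (contact_score(patched_z) - corrupt_score) / (clean_score - corrupt_score)
    effects.append(recovered)

for i, e in enumerate(effects):
    print(f"block {i}: recovered fraction = {e:.3f}")
```

In this toy setup, patching at the final block restores the clean score exactly, because the output is read directly from the patched pairwise representation. Patching at earlier blocks recovers only part of the gap, since the sequence representation still carries the corrupted input. Comparing recovery across blocks is the kind of evidence that can localize a computation to early or late stages.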
Presenter: ~Jannik_Brinkmann1
Format: Maybe: the presenting author will attend in person, contingent on other factors that still need to be determined (e.g., visa, funding).
Funding: No, the presenting author of this submission does not fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 76