Sequence-context-aware decoding enables robust reconstruction of protein dynamics from crystallographic B-factors

Published: 04 Mar 2026, Last Modified: 07 Mar 2026
Venue: ICLR 2026 Workshop LMRL Poster
License: CC BY 4.0
Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.
Track: long paper (4–8 pages excluding references)
Keywords: Protein dynamics; Crystallographic B-factors; Protein language models; Deep learning; Structural biology
TL;DR: Our language model decodes solution-state protein dynamics from static crystallographic B-factors, correcting lattice artifacts to improve docking and interface refinement.
Abstract: X-ray crystallography remains the primary source of protein structures, but the B-factors that accompany these coordinates are frequently confounded by crystal packing and refinement artifacts, which limits their utility for quantifying solution-state dynamics. We address the gap between abundant static structures and scarce dynamic information by introducing the B-Factor Corrector, a fine-tuned protein language model that reframes B-factor analysis as a sequence-to-dynamics translation task rather than a purely physical calculation. By leveraging deep contextual embeddings to decouple intrinsic flexibility from lattice constraints, the model recovers ground-truth conformational fluctuations derived from structural ensembles, achieving a Pearson correlation of 0.80 against this reference. We further demonstrate the utility of these sequence-inferred dynamics in three structural biology applications: the model identifies binding-competent conformations for flexible molecular docking, unmasks cryptic mechanical hinges essential for chaperone function, and guides the energetic refinement of antibody interfaces to resolve steric conflicts. These results suggest that static crystallographic data can be repurposed to decode accurate dynamic patterns latent within the Protein Data Bank.
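As a rough illustration of the kind of comparison the abstract reports, the sketch below converts isotropic B-factors to per-residue RMS fluctuations using the standard relation B = (8π²/3)⟨u²⟩ and computes a Pearson correlation against a reference fluctuation profile. This is not the authors' evaluation code: the function names, the choice to compare in RMSF space, and all numeric values are assumptions made for illustration only.

```python
import math

def bfactor_to_rmsf(b):
    """Convert an isotropic B-factor (in A^2) to an RMS fluctuation (in A).

    Uses the standard relation B = (8 * pi^2 / 3) * <u^2>.
    """
    return math.sqrt(3.0 * b / (8.0 * math.pi ** 2))

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-residue values, invented for this example.
predicted_b = [12.0, 15.5, 30.2, 45.0, 22.1]       # model-corrected B-factors (A^2)
reference_rmsf = [0.60, 0.72, 1.10, 1.35, 0.85]    # ensemble-derived RMSF (A)

predicted_rmsf = [bfactor_to_rmsf(b) for b in predicted_b]
r = pearson(predicted_rmsf, reference_rmsf)
print(f"Pearson r = {r:.2f}")
```

Because Pearson correlation is invariant to positive linear rescaling but not to the square-root transform, whether profiles are compared as B-factors or as RMSF values can change the reported number; the abstract does not say which convention was used.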
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 25