S2S2Fun: Decoding Protein Function From Latent Structural Representations
Keywords: AlphaFold3 representations, Protein function prediction, Optical properties, Domain-adversarial training, Learning-to-rank, Latent representation learning, Mutational effect prediction
TL;DR: We show that AlphaFold3’s internal structural representations encode functional signals. Using them with ranking and domain-adversarial learning enables more accurate prediction of ligand-dependent protein functions.
Abstract: Predicting mutational effects on protein function from sequence alone remains an unsolved challenge, despite its importance for protein engineering. Protein functions such as enzymatic activity are highly sensitive to mutations in a structure-dependent manner. Recent advances in structure prediction, including AlphaFold3 and its open-source counterparts, have enabled atomic-level modeling of biomolecular complexes. We hypothesize that AlphaFold3’s latent structural features of protein-ligand complexes can be harnessed to decode functional differences among sequence variants. Focusing on the optical properties of light-sensitive proteins, we demonstrate that the AlphaFold3 "pair" and "single" representations can effectively predict absorption peaks, fluorescence brightness, and protein stability of natural and de novo designed proteins. Our "sequence-to-structure-to-function" (S2S2Fun) approach offers an effective method for ranking protein function and provides an in silico metric for metagenomic protein discovery and protein engineering applications.
Presenter: ~Ge_Tian1
Format: Yes, the presenting author will attend in person if this work is accepted to the workshop.
Funding: Yes, the presenting author of this submission falls under ICLR’s funding aims, and funding would significantly impact their ability to attend the workshop in person.
Submission Number: 29
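Illustrative sketch (not the authors' implementation): the abstract and TL;DR describe ranking sequence variants by function from AlphaFold3 representations. The minimal Python example below shows one generic way such a ranking objective could be set up, under stated assumptions. It assumes a per-residue "single" representation is available as a list of feature vectors, mean-pools it into one vector per variant, scores variants with a hypothetical linear model, and evaluates a pairwise logistic ranking loss. The names `mean_pool`, `linear_score`, and `pairwise_logistic_loss`, the toy numbers, and the choice of loss are all assumptions made for illustration; the submission does not specify these details here.

```python
import math

def mean_pool(single):
    """Average a per-residue representation (list of equal-length vectors) over residues."""
    n = len(single)
    return [sum(col) / n for col in zip(*single)]

def linear_score(weights, features):
    """Hypothetical scoring head: dot product of weights and pooled features."""
    return sum(w * f for w, f in zip(weights, features))

def pairwise_logistic_loss(scores, labels):
    """Mean of log(1 + exp(-(s_i - s_j))) over all pairs with labels[i] > labels[j]."""
    losses = []
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                d = scores[i] - scores[j]
                # Numerically stable form of log(1 + exp(-d)).
                losses.append(max(-d, 0.0) + math.log1p(math.exp(-abs(d))))
    return sum(losses) / len(losses) if losses else 0.0

# Toy data standing in for three variants (2 residues x 2 features each);
# real inputs would come from a structure-prediction model, which is not shown.
variants = [
    [[0.0, 1.0], [2.0, 3.0]],
    [[1.0, 1.0], [1.0, 1.0]],
    [[4.0, 0.0], [0.0, 2.0]],
]
labels = [0.2, 0.5, 0.9]   # made-up functional measurements
weights = [1.0, -0.5]      # made-up model parameters

pooled = [mean_pool(v) for v in variants]
scores = [linear_score(weights, p) for p in pooled]
loss = pairwise_logistic_loss(scores, labels)
```

A pairwise loss of this kind depends only on the relative order of the measurements within a batch, not on their absolute scale, which is one reason ranking objectives are often chosen when assay values are not directly comparable across datasets.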