Reading TEA leaves for de novo protein design
Keywords: Monte Carlo sampling, protein design, protein language models
TL;DR: We use a discrete structural proxy derived from protein language models to score random-mutagenesis MCMC, allowing it to navigate the protein sequence landscape rapidly.
Abstract: De novo protein design expands the functional protein universe beyond natural evolution, offering vast therapeutic and industrial potential.
Monte Carlo sampling remains under-explored in protein design because of the long simulation times it typically requires and the prohibitive runtime of current structure prediction oracles. Here we use a 20-letter structure-inspired alphabet derived from protein language model embeddings to score Metropolis sampling of amino acid sequences driven by random mutagenesis. This enables fast template-guided and unconditional design, generating sequences that satisfy in silico designability criteria and have no known homologues. Ultimately, this opens a new path to fast de novo protein design.
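To make the sampling loop described in the abstract concrete, the following is a minimal sketch of template-guided Metropolis sampling with single-site random mutagenesis. It is not the submission's implementation. The encoder `toy_structure_string` is a hypothetical stand-in for the structure-inspired alphabet (the real one is derived from protein language model embeddings and is context dependent), and the score function, temperature, and step count are all illustrative assumptions.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_structure_string(seq):
    """HYPOTHETICAL stand-in for the 20-letter structural encoder.

    The real alphabet comes from language model embeddings; this toy
    version just shifts each residue independently so the sketch runs.
    """
    shift = 7
    return "".join(
        AMINO_ACIDS[(AMINO_ACIDS.index(a) + shift) % len(AMINO_ACIDS)]
        for a in seq
    )


def score(seq, template):
    """Assumed objective: fraction of structure letters matching the template."""
    encoded = toy_structure_string(seq)
    return sum(a == b for a, b in zip(encoded, template)) / len(template)


def metropolis_design(template, n_steps=5000, temperature=0.02, seed=0):
    """Maximise `score` by Metropolis sampling over amino acid sequences."""
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in template]  # random start
    current = score(seq, template)
    best_seq, best = "".join(seq), current
    for _ in range(n_steps):
        # Symmetric proposal: one uniformly chosen site, uniformly chosen residue.
        pos = rng.randrange(len(seq))
        old = seq[pos]
        seq[pos] = rng.choice(AMINO_ACIDS)
        proposed = score(seq, template)
        delta = proposed - current
        # Metropolis criterion: always accept improvements, otherwise
        # accept with probability exp(delta / temperature).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current = proposed
            if current > best:
                best_seq, best = "".join(seq), current
        else:
            seq[pos] = old  # reject: restore the previous residue
    return best_seq, best
```

Calling `metropolis_design(template)` with a target string over the same 20 letters returns the best sequence visited and its score. Because each step only re-encodes the sequence instead of predicting a full structure, a cheap scorer of this kind is what makes many thousands of Metropolis steps affordable; an unconditional variant would replace the template-matching score with a template-free one.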
Presenter: ~Lorenzo_Pantolini1
Format: Yes, the presenting author will attend in person if this work is accepted to the workshop.
Funding: No, the presenting author of this submission does not fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 45