Track: Machine learning: computational method and/or computational results
Keywords: antibody design, protein design, diffusion, pLMs, ESM
Abstract: Deep learning approaches have driven significant progress in protein design. The majority of methods predict sequences for a given structure, and diffusion approaches have recently been developed for generating protein backbones. However, $\textit{de novo}$ design of epitope-specific antibody binders remains an unsolved problem because the antibody sequence, variable loop structures, and antigen binding must be optimized simultaneously. Here we present EAGLE (Epitope-specific Antibody Generation using Language model Embeddings), a diffusion-based model that does not require input backbone structures. The full antibody sequence (constant and variable regions) is designed in a continuous space using protein language model embeddings. Similar to denoising diffusion probabilistic models for image generation, which condition sampling on a text prompt, we condition the sampling of antibody sequences on antigen structure and epitope amino acids. The model is trained on the available antibody and antibody-antigen structures, as well as antibody sequences. Our top-100 designs include sequences with 55% identity to known binders for the most variable heavy chain loop. EAGLE's high performance is achieved by tailoring the method specifically to antibody design, integrating continuous latent space diffusion with sampling conditioned on antigen structure and epitope amino acids. Our model enables the generation of diverse, unique antibody binders with variable loop lengths from straightforward epitope specifications.
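The continuous-space diffusion mentioned in the abstract can be illustrated with the standard forward (noising) step of a denoising diffusion probabilistic model, $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$. The sketch below is not EAGLE's implementation: the linear noise schedule, the embedding dimensions, and the random vectors standing in for protein language model embeddings are all assumptions made for illustration, and the conditioning on antigen structure and epitope residues is not shown.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed choice, not taken from the abstract)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alpha(betas):
    """Return alpha_bar[t], the running product of (1 - beta) up to step t."""
    alpha_bar, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        alpha_bar.append(prod)
    return alpha_bar


def forward_noise(x0, t, alpha_bar, rng):
    """Sample x_t = sqrt(alpha_bar[t]) * x0 + sqrt(1 - alpha_bar[t]) * eps."""
    signal = math.sqrt(alpha_bar[t])
    noise = math.sqrt(1.0 - alpha_bar[t])
    return [[signal * v + noise * rng.gauss(0.0, 1.0) for v in row] for row in x0]


rng = random.Random(0)
num_steps = 1000
alpha_bar = cumulative_alpha(linear_beta_schedule(num_steps))

# Hypothetical stand-in for per-residue embeddings of an antibody sequence:
# 120 residues, 16 dimensions each (real embeddings would come from a pLM).
seq_len, embed_dim = 120, 16
x0 = [[rng.gauss(0.0, 1.0) for _ in range(embed_dim)] for _ in range(seq_len)]

x_early = forward_noise(x0, 0, alpha_bar, rng)             # almost unchanged
x_late = forward_noise(x0, num_steps - 1, alpha_bar, rng)  # close to pure noise
```

In a trained model of this family, a network would learn to reverse this process step by step, and the denoised embeddings would then be decoded back to amino acids; those stages are outside the scope of this sketch.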
Submission Number: 15