CAMP: COMBINATORIAL ENGINEERING OF PROTEINS

Published: 06 Mar 2025, Last Modified: 28 Apr 2025, GEM, CC BY 4.0
Track: Machine learning: computational method and/or computational results
Nature Biotechnology: No
Keywords: AI, Machine Learning, Biology, Protein Engineering, LLMs
TL;DR: Designing recombinant proteins with reduced immunogenicity using protein language models
Abstract: Protein recombination has long been a key method in protein engineering for diversifying and optimizing sequences. We extend this approach using a protein language model: we find that when the model's attention is represented as a spline, abrupt transitions in the spline identify optimal crossover sites for recombination. As we show, these sites also correlate with transitions between secondary structure elements in the corresponding protein structure. We use these sites to guide the recombination of sequence blocks from diverse sources using Markov chain Monte Carlo (MCMC) sampling. Language models also enable the generation of novel recombinant blocks beyond those available from traditional multiple sequence alignments (MSAs), increasing diversity, while a direct preference optimization algorithm is used to fine-tune these blocks for reduced immunogenicity. This method integrates modern deep learning architectures with traditional protein engineering techniques to improve the success rate of libraries designed for wet-lab verification.
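The two core steps named in the abstract, locating crossover sites at abrupt transitions of a per-residue attention profile and recombining parent blocks by MCMC, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the submission's implementation: the profile is synthetic, a moving average stands in for the spline fit, `toy_score` is a hypothetical placeholder for a model-derived objective, and the function names, parameters, and toy sequences are invented.

```python
import math
import random


def crossover_sites(profile, window=1, threshold=0.1):
    """Return cut positions where the smoothed profile changes abruptly."""
    n = len(profile)
    # Moving-average smoothing (a simple stand-in for a spline fit).
    smooth = []
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        smooth.append(sum(profile[lo:hi]) / (hi - lo))
    # slope[i] is the absolute change between residues i and i + 1.
    slope = [abs(smooth[i + 1] - smooth[i]) for i in range(n - 1)]
    sites, run = [], []
    for i, s in enumerate(slope + [0.0]):  # sentinel closes a trailing run
        if s >= threshold:
            run.append(i)
        elif run:
            # One site per steep run: cut just after its middle step.
            sites.append(run[len(run) // 2] + 1)
            run = []
    return sites


def split_blocks(seq, sites):
    """Cut a sequence into blocks at the given cut positions."""
    bounds = [0] + sorted(sites) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def toy_score(seq):
    """Hypothetical objective (fraction of alanine); a placeholder only."""
    return seq.count("A") / len(seq)


def sample_chimera(parents, sites, score, steps=200, temperature=0.1, seed=0):
    """Metropolis sampling over which parent supplies each block."""
    rng = random.Random(seed)
    blocks = [split_blocks(p, sites) for p in parents]
    n_blocks = len(blocks[0])

    def build(state):
        return "".join(blocks[p][b] for b, p in enumerate(state))

    state = [rng.randrange(len(parents)) for _ in range(n_blocks)]
    current = score(build(state))
    for _ in range(steps):
        # Symmetric proposal: reassign one block to a random parent.
        proposal = list(state)
        proposal[rng.randrange(n_blocks)] = rng.randrange(len(parents))
        candidate = score(build(proposal))
        accept = math.exp(min(0.0, (candidate - current) / temperature))
        if rng.random() < accept:
            state, current = proposal, candidate
    return build(state), state  # final chain state, not the best seen


if __name__ == "__main__":
    # Synthetic profile with two abrupt transitions (after residues 9 and 19).
    profile = [0.1] * 10 + [0.9] * 10 + [0.2] * 10
    sites = crossover_sites(profile)
    print(sites)  # -> [10, 20]
    parents = [
        "MKTAYIAKQRQISFVKSHFSRQLEERLGLI",  # toy aligned parents, equal length
        "MKSAYVAKQRAISFVKAHFARQAEERAGLA",
    ]
    print(sample_chimera(parents, sites, toy_score))
```

The sketch assumes the parents are pre-aligned and of equal length, so the same cut positions apply to each. In a real pipeline the profile would come from the language model's attention, and the score would come from whatever fitness or immunogenicity estimate is being optimized.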
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 121