Sample-efficient Antibody Design through Protein Language Model for Risk-aware Batch Bayesian Optimization

Published: 28 Oct 2023, Last Modified: 09 Nov 2023
Venue: NeurIPS2023-AI4Science Poster
Keywords: Antibody design, Bayesian optimization, Generative language model
Abstract: Antibody design is a time-consuming and expensive process that often requires extensive experimentation to identify the best candidates. To address this challenge, we propose an efficient and risk-aware antibody design framework that leverages protein language models (PLMs) and batch Bayesian optimization (BO). Our framework utilizes the generative power of protein language models to predict candidate sequences with higher naturalness and a Bayesian optimization algorithm to iteratively explore the sequence space and identify the most promising candidates. To further improve the efficiency of the search process, we introduce a risk-aware approach that balances exploration and exploitation by incorporating uncertainty estimates into the acquisition function of the Bayesian optimization algorithm. We demonstrate the effectiveness of our approach through experiments on several benchmark datasets, showing that our framework outperforms state-of-the-art methods in terms of both efficiency and quality of the designed sequences. Our framework has the potential to accelerate the discovery of new antibodies and reduce the cost and time required for antibody design.
Submission Track: Original Research
Submission Number: 18
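
Illustrative sketch: the abstract describes folding uncertainty estimates into the acquisition function of a batch Bayesian optimization loop. The snippet below is one minimal way to picture that idea, not the authors' method. It assumes a surrogate model has already produced a predicted mean and standard deviation for each candidate sequence, and it scores candidates with a generic confidence-bound rule (mean + kappa * std). The function name `select_batch`, the parameter `kappa`, and the numbers are all hypothetical.

```python
import numpy as np


def select_batch(mean, std, batch_size, kappa=1.0):
    """Return indices of the top-scoring candidates, best first.

    Score = mean + kappa * std (an assumed, generic confidence-bound rule).
    kappa > 0 rewards uncertain candidates (exploration);
    kappa < 0 penalizes them (a risk-averse choice).
    """
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    scores = mean + kappa * std
    # Sort by descending score; a stable sort keeps input order on ties.
    order = np.argsort(-scores, kind="stable")
    return order[:batch_size], scores


# Hypothetical surrogate predictions for five candidate sequences.
mean = [0.2, 0.5, 0.4, 0.1, 0.45]
std = [0.05, 0.01, 0.30, 0.50, 0.02]

explore_batch, _ = select_batch(mean, std, batch_size=2, kappa=1.0)
cautious_batch, _ = select_batch(mean, std, batch_size=2, kappa=-1.0)
```

With these made-up numbers, the exploratory setting picks the two most uncertain candidates, while the risk-averse setting picks the two with confident, moderately high predictions. A real batch method would typically also account for diversity within the batch, which this sketch omits.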