Annealed Biological Sequence Optimization

Published: 20 Jun 2023, Last Modified: 11 Oct 2023, SODS 2023 Poster
Keywords: Protein Design; Model-based Optimization
Abstract: Designing biological sequences with desired properties is an impactful research problem with applications such as protein engineering, antibody design, and drug discovery. Machine learning algorithms can be applied either to fit the property landscape with supervised learning or to generatively propose reasonable candidates, reducing wet-lab effort. From the learning perspective, the key challenges are the sharp property landscape, in which a few mutations can dramatically change a protein's properties, and the vast size of the biological sequence space. In this paper, we propose annealed sequence optimization (ANSO), which aims to address both challenges simultaneously through a paired surrogate model training paradigm and sequence sampling procedure. Extensive experiments on a series of protein sequence design tasks demonstrate its effectiveness over several advanced baselines.
Submission Number: 18