Keywords: ml-guided sequence design, drug discovery, protein engineering, generative modeling, VAE, gene therapy
TL;DR: We design adeno-associated virus (AAV) capsids, a gene therapy delivery vector, with generative modeling. We outperform algorithmic baselines in vitro, and in a non-human primate experiment we produce designs with field-leading therapeutic potential.
Abstract: Machine learning-assisted biological sequence design is a topic of intense interest due to its potential impact on healthcare and biotechnology. In recent years, many new approaches have been proposed that design sequences by learning from data alone, rather than relying on mechanistic or structural models. These black-box approaches fall roughly into two camps: (i) optimization against a learned oracle and (ii) sampling designs from a generative model. While both approaches have shown promise, real-world evidence of their effectiveness is limited, whether they are used alone or in combination. Here we develop a robust generative model named $\texttt{VAEProp}$ and use it to optimize adeno-associated virus (AAV) capsids, a fundamental gene therapy vector. We show that our method outperforms algorithmic baselines on this design task in the real world. Critically, we demonstrate through in vitro and non-human primate validation experiments that our approach can generate vector designs with field-leading therapeutic potential.
Submission Number: 69