Visible to everyone since 29 Apr 2024. License: CC BY 4.0
Engineering biocatalysts is central to sustainable chemical synthesis, but is hampered by a lack of sequence-function data, which are costly and slow to obtain. We introduce a new microfluidic workflow, droplet lrDMS, which allows us to screen tens of thousands of enzyme variants within two weeks, a scale, speed and cost not feasible with plate screening or robotic workflows. Using this workflow, we generate large-scale sequence-function data for an imine reductase and rationally engineer improved variants with up to an 11-fold improvement in catalytic efficiency ($k_\text{cat}/K_M$) over wild type. By combining rational engineering with predictions from a machine learning model, we further enhance catalytic efficiency up to 16-fold over wild type, 4-fold better than the best variant in the dataset. The improvement is driven by a 24-fold increase in catalytic rate ($k_\text{cat}$) over wild type, significantly higher than the rate improvements observed in an AI-informed campaign with a similar enzyme. Our study demonstrates the potential of droplet lrDMS sequence-function data to accelerate directed evolution through AI-informed biocatalyst engineering.
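
As a rough reading of how the two headline figures relate, the fold changes can be combined arithmetically. This is a sketch that assumes the 16-fold gain in $k_\text{cat}/K_M$ and the 24-fold gain in $k_\text{cat}$ refer to the same variant, which the abstract suggests but does not state outright; the superscripts "var" and "wt" are labels introduced here for the variant and wild type.

$$
\frac{K_M^{\text{var}}}{K_M^{\text{wt}}}
= \frac{k_\text{cat}^{\text{var}} / k_\text{cat}^{\text{wt}}}
       {\left(k_\text{cat}/K_M\right)^{\text{var}} / \left(k_\text{cat}/K_M\right)^{\text{wt}}}
\approx \frac{24}{16} = 1.5
$$

Under that assumption, $K_M$ would be about 1.5-fold higher than wild type, so the gain in catalytic efficiency would come from turnover rate rather than from tighter apparent substrate binding, consistent with the statement that the improvement is driven by $k_\text{cat}$.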