Track: Machine learning: computational method and/or computational results
Keywords: enzyme, enzyme screening, biosynthesis, chemical reactions, contrastive learning, enzyme commission
TL;DR: We define the task of in silico enzyme screening, and we present a method based on contrasting enzyme and reaction representations.
Abstract: Computational screening of naturally occurring proteins has the potential to identify efficient catalysts among the hundreds of millions of sequences that remain uncharacterized. Current experimental methods remain time-, cost-, and labor-intensive, limiting the number of enzymes they can reasonably screen. In this work, we propose a computational framework for in silico enzyme screening. Through a contrastive objective, we train CLIPZyme to encode and align representations of enzyme structures and reaction pairs. Because no standard computational baseline exists for this task, we compare CLIPZyme to existing EC (enzyme commission) predictors applied to virtual enzyme screening and show improved performance in scenarios where limited information on the reaction is available (BEDROC$_{85}$ of 44.69%). Additionally, we evaluate combining EC predictors with CLIPZyme and show its generalization capacity on both unseen reactions and protein clusters.
Submission Number: 52
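The contrastive alignment described in the abstract can be illustrated with a minimal sketch. It assumes a CLIP-style symmetric cross-entropy loss over cosine similarities, which may differ from the objective CLIPZyme actually uses. The function names, the temperature value, and the use of plain NumPy arrays in place of learned enzyme and reaction encoders are all hypothetical choices made for illustration.

```python
import numpy as np


def clip_style_loss(enzyme_emb, reaction_emb, temperature=0.07):
    """Symmetric contrastive loss; row i of each array is a matched pair.

    Assumption: a CLIP-style objective. The temperature is illustrative.
    """
    # L2-normalize so that dot products are cosine similarities.
    e = enzyme_emb / np.linalg.norm(enzyme_emb, axis=1, keepdims=True)
    r = reaction_emb / np.linalg.norm(reaction_emb, axis=1, keepdims=True)
    logits = e @ r.T / temperature  # shape: (batch, batch)

    def diagonal_cross_entropy(scores):
        # Numerically stable log-softmax over each row; the correct
        # "class" for row i is column i (the matched partner).
        scores = scores - scores.max(axis=1, keepdims=True)
        log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the enzyme-to-reaction and reaction-to-enzyme directions.
    return 0.5 * (diagonal_cross_entropy(logits)
                  + diagonal_cross_entropy(logits.T))


def rank_enzymes(reaction_vec, enzyme_emb):
    """Return enzyme indices sorted from most to least similar to a reaction."""
    e = enzyme_emb / np.linalg.norm(enzyme_emb, axis=1, keepdims=True)
    q = reaction_vec / np.linalg.norm(reaction_vec)
    return np.argsort(-(e @ q))
```

Under these assumptions, screening reduces to embedding a query reaction once and ranking every candidate enzyme by cosine similarity, so a ranking metric such as BEDROC can be computed from the output of `rank_enzymes`.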