Readers: Everyone (since 04 Oct 2024). License: CC BY 4.0
Enzymes are crucial catalysts for biochemical reactions and underpin numerous biological processes. Efficiently identifying specific enzymes in extensive protein libraries is essential for understanding and harnessing these reactions. Traditional computational methods for enzyme screening are time-consuming and resource-intensive; recent contrastive learning approaches have shown promise, but they often overlook the hierarchical classifications inherent in enzymes and reactions, as well as the significance of molecular structure in catalysis. To address these limitations, we introduce FGW-CLIP, a novel contrastive learning framework based on optimizing the fused Gromov-Wasserstein distance. The approach incorporates multiple alignments: representation alignment between reactions and enzymes, and internal alignment within the enzyme and reaction representations. Through a regularization term, our method minimizes the Gromov-Wasserstein distance between the enzyme and reaction spaces, enhancing information exchange within these domains. FGW-CLIP demonstrates superior performance on the widely used EnzymeMap benchmark, significantly outperforming existing methods in enzyme virtual screening tasks. Notably, it achieves state-of-the-art results on both the BEDROC and EF metrics, indicating its efficacy in identifying relevant enzymes for given reactions. These results highlight the potential of our method to advance virtual enzyme screening, offering a powerful tool for enzyme discovery and characterization.
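To make the combination of cross-modal and internal alignment concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a batch of paired enzyme and reaction embeddings, a symmetric CLIP-style contrastive term for the cross-modal alignment, and a Gromov-Wasserstein-style penalty in which the transport plan is fixed to the in-batch pairing instead of being optimized, as a full fused Gromov-Wasserstein solver would do. The function names, the weight `alpha`, and the `temperature` value are all hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to (approximately) unit Euclidean norm."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def clip_loss(enzymes, reactions, temperature=0.07):
    """Symmetric contrastive loss: row i of each matrix is a positive pair."""
    logits = enzymes @ reactions.T / temperature
    idx = np.arange(logits.shape[0])
    loss_e2r = -log_softmax(logits, axis=1)[idx, idx].mean()  # enzyme -> reaction
    loss_r2e = -log_softmax(logits, axis=0)[idx, idx].mean()  # reaction -> enzyme
    return 0.5 * (loss_e2r + loss_r2e)


def gw_penalty_identity_coupling(enzymes, reactions):
    """Gromov-Wasserstein-style discrepancy with a fixed coupling.

    Assumption: the transport plan is T = I / n (the in-batch pairing), so the
    GW objective reduces to the mean squared difference between the two
    intra-modal similarity matrices.
    """
    c_enz = enzymes @ enzymes.T      # enzyme-enzyme similarities
    c_rxn = reactions @ reactions.T  # reaction-reaction similarities
    n = c_enz.shape[0]
    return ((c_enz - c_rxn) ** 2).sum() / n**2


def fgw_style_loss(enzymes, reactions, alpha=0.5, temperature=0.07):
    """Weighted sum of the cross-modal term and the structural penalty."""
    e = l2_normalize(enzymes)
    r = l2_normalize(reactions)
    cross = clip_loss(e, r, temperature)
    structure = gw_penalty_identity_coupling(e, r)
    return (1.0 - alpha) * cross + alpha * structure
```

In this sketch the penalty is zero whenever the two modalities share the same internal similarity structure, regardless of how well individual pairs are aligned, which is why it acts only as a regularizer alongside the contrastive term.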