ReactEmbed: A Plug-and-Play Module for Unifying Protein-Molecule Representations Guided by Biochemical Reaction Networks
Abstract: The computational representation of proteins and molecules is a cornerstone of modern biology.
However, state-of-the-art models represent these entities in separate and incompatible embedding manifolds, limiting our ability to model the systemic biological processes that depend on their interaction.
We introduce ReactEmbed, a lightweight, plug-and-play enhancement module that bridges this gap.
Our key idea is to use biochemical reaction networks as a definitive source of functional semantics, since co-participation in a reaction explicitly defines a functional role.
ReactEmbed takes existing, frozen embeddings from state-of-the-art models and aligns them in a unified space through a novel relational learning framework.
This framework interprets a weighted reaction graph using a specialized sampling strategy to distill functional relationships.
This process yields two benefits: (1) it enriches the unimodal embeddings, improving their performance on domain-specific tasks, and (2) it achieves strong results on a diverse range of cross-domain benchmarks.
ReactEmbed provides a practical and powerful method to enhance and unify biological representations, effectively turning disconnected models into a more cohesive, functionally aware system.
The code and database are available for open use.
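To make the alignment idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes that positive pairs are drawn in proportion to reaction co-participation weights and that alignment uses a triplet margin loss over small per-modality linear heads; the abstract specifies neither the sampling strategy nor the loss. All entity names, weights, embedding values, and function names are hypothetical.

```python
import math
import random

# Hypothetical weighted reaction graph: (protein, molecule) -> co-participation count.
REACTION_EDGES = {
    ("P_hexokinase", "M_glucose"): 5,
    ("P_hexokinase", "M_atp"): 3,
    ("P_catalase", "M_h2o2"): 4,
}

# Toy stand-ins for frozen pretrained embeddings; the two modalities
# deliberately have different dimensions (3 for proteins, 2 for molecules).
FROZEN = {
    "P_hexokinase": [0.9, 0.1, 0.0],
    "P_catalase": [0.0, 0.2, 0.8],
    "M_glucose": [0.7, 0.3],
    "M_atp": [0.6, 0.4],
    "M_h2o2": [0.1, 0.9],
}
MODALITY = {name: ("protein" if name.startswith("P_") else "molecule") for name in FROZEN}


def init_heads(rng, out_dim=2):
    """One trainable linear head per modality, mapping into a shared space."""
    in_dims = {"protein": 3, "molecule": 2}
    return {
        m: [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(out_dim)]
        for m, d in in_dims.items()
    }


def sample_triplet(rng):
    """Assumed strategy: pick an edge with probability proportional to its
    weight, then pick a negative of the positive's modality that shares no
    reaction with the anchor."""
    edges = list(REACTION_EDGES)
    weights = [REACTION_EDGES[e] for e in edges]
    anchor, positive = rng.choices(edges, weights=weights, k=1)[0]
    neighbours = {b for a, b in edges if a == anchor} | {a for a, b in edges if b == anchor}
    candidates = [
        n for n in FROZEN
        if MODALITY[n] == MODALITY[positive] and n != anchor and n not in neighbours
    ]
    return anchor, positive, rng.choice(candidates)


def project(matrix, vec):
    """Apply a linear head (list of rows) to a frozen embedding."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, heads, margin=1.0):
    """Hinge loss that pulls reaction partners together in the shared space."""
    za = project(heads[MODALITY[anchor]], FROZEN[anchor])
    zp = project(heads[MODALITY[positive]], FROZEN[positive])
    zn = project(heads[MODALITY[negative]], FROZEN[negative])
    return max(0.0, distance(za, zp) - distance(za, zn) + margin)


if __name__ == "__main__":
    rng = random.Random(0)
    heads = init_heads(rng)
    a, p, n = sample_triplet(rng)
    print(a, p, n, triplet_loss(a, p, n, heads))
```

In this sketch only the projection heads would be updated during training; the entries of `FROZEN` are never modified, which mirrors the plug-and-play framing of the abstract.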
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Yingce_Xia1
Submission Number: 6446