Capturing Substructure Interactions by Invariant Information Bottleneck Theory for Generalizable Property Prediction
Keywords: Molecular Merged Graph, Drug-drug interaction, Out-of-Distribution
TL;DR: We propose I2Mole, an Interaction-aware Invariant Molecular learning model that improves molecular property prediction by modeling atomic interactions and using a graph information bottleneck to capture key substructures.
Abstract: Molecular interactions are a common phenomenon in physical chemistry, often resulting in unexpected biochemical properties that are harmful to humans, as in the case of drug-drug interactions. Machine learning has shown great potential for predicting these interactions rapidly and accurately. However, the complexity of molecular structures and the diversity of interactions often reduce prediction accuracy and hinder generalizability. Identifying core invariant substructures (i.e., rationales) has become essential to improving a model's interpretability and generalization. Despite significant progress, existing models frequently overlook pairwise molecular interactions and therefore capture interaction dynamics insufficiently. To address these limitations, we propose I2Mole (Interaction-aware Invariant Molecular learning), a novel framework for generalizable property prediction. I2Mole models atomic interactions, such as hydrogen bonds and van der Waals forces, by first establishing indiscriminate connections between the atoms of different molecules, which are then refined using an improved graph information bottleneck theory tailored for merged graphs. To further enhance model generalization, we construct an environment codebook from the environment subgraphs of the merged graph. This approach not only provides a noise source for optimizing mutual information but also preserves the integrity of chemical semantic information. By comprehensively leveraging the information inherent in the merged graph, our model accurately captures core substructures and significantly enhances generalization capabilities. Extensive experimental validation demonstrates I2Mole's efficacy and generalizability. The implementation code is available at https://anonymous.4open/r/I2Mol-C616.
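The merged-graph construction mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_merged_graph`, the edge-list representation, and the toy molecules are all assumptions made for illustration. It shows only the first step, connecting every atom of one molecule to every atom of the other; the subsequent refinement by the graph information bottleneck and the environment codebook are not shown.

```python
from itertools import product


def build_merged_graph(n_a, bonds_a, n_b, bonds_b):
    """Merge two molecular graphs into one graph (hypothetical helper).

    Atoms of molecule A keep indices 0..n_a-1; atoms of molecule B are
    shifted by n_a. Covalent bonds become 'intra' edges, and every
    (atom of A, atom of B) pair becomes a candidate 'inter' edge.
    """
    # Intramolecular edges: bonds of A as-is, bonds of B re-indexed.
    intra = [(u, v) for u, v in bonds_a]
    intra += [(u + n_a, v + n_a) for u, v in bonds_b]
    # Indiscriminate intermolecular edges: full bipartite connection.
    inter = list(product(range(n_a), range(n_a, n_a + n_b)))
    return {"num_nodes": n_a + n_b, "intra": intra, "inter": inter}


# Toy example (assumed data): a 3-atom chain and a 2-atom molecule.
merged = build_merged_graph(3, [(0, 1), (1, 2)], 2, [(0, 1)])
print(merged["num_nodes"], len(merged["intra"]), len(merged["inter"]))
```

The number of candidate intermolecular edges grows as the product of the two atom counts, which is one reason a later pruning or refinement step is needed before such a graph is useful.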
Supplementary Material: zip
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2600