Causal-Chemprop: Causal Machine Learning for Molecular Property Prediction and Optimization

Christian Natajaya; Lucas Attia; Jackson Burns

Published: 20 Sept 2025, Last Modified: 05 Nov 2025

Venue: AI4Mat-NeurIPS-2025 Poster

License: CC BY 4.0

Keywords: Drug discovery, Causal modeling, Causal inferencing, Graph neural networks, Small molecules

Abstract: A priori estimation of molecular properties has long been of immense interest to the pharmaceutical sciences for hit generation and optimization. While neural network-based models have achieved high predictive accuracy, they still find limited utility in molecular design. High-dimensional molecular representations are difficult to optimize, particularly when the underlying models are trained on small or sparse datasets. Moreover, neural network-based models lack mechanisms to explicitly incorporate domain knowledge from experts and prior knowledge from existing data. Herein, we introduce Causal-Chemprop, a causal machine learning framework for molecular property prediction built on the Chemprop and DAGMA architectures. To our knowledge, this is the first application of causal machine learning to molecular property prediction and optimization. Via intervention-based inference, Causal-Chemprop demonstrates strong predictive performance on $\mathrm{IC}_{50}$ from the Kinase Knowledgebase and aqueous $\log S$ from a solubility dataset comprising BigSolDB and SolProp. Counterfactual inference offers support for human-in-the-loop optimization of molecular structure, which we demonstrate by predicting solubility on a quinolinyltriazole MIF inhibitor seed structure and its molecular derivatives. Finally, we integrate Causal-Chemprop with the molecular optimization algorithm EvoMol to perform inverse molecular design, yielding soluble analogs of the MIF inhibitor seed structure.

Submission Track: Paper Track (Full Paper)

Submission Category: AI-Guided Design

Submission Number: 15
