Causal Importance for Physics-Informed Machine Learning

Daniel Fiifi Tawia Hagan; Thomas Mortier; Cas Decancq; Diego G. Miralles

Published: 10 Mar 2026, Last Modified: 07 Apr 2026, CLeaR 2026 Poster, CC BY 4.0

Keywords: Liang-Kleeman information flow causality, causally-informed ML, feature selection, interpretable machine learning

TL;DR: We propose a framework that guides machine learning toward the right answers for the right reasons by embedding causal knowledge into the learning process.

Abstract: Predictive modelling in complex dynamical systems often relies on machine learning (ML) models trained on correlated and partially redundant predictors. Standard feature importance measures are usually correlation-based and model-specific, so they provide limited guidance for disentangling mediators, confounders, and true drivers, and generally offer no principled route to causally motivated prediction. Here we address this gap by explicitly importing causal information, derived from the multivariate Liang-Kleeman information flow (LKIF) framework, into ML models. We first decompose the differential information flow into four conditioner-importance indices (Mediator Dominance Index, Moderation Gain, Confounding Pressure, and Causal Sufficiency Rate), and then construct a Causal Importance Score (CIS) that summarises the relevance of each conditioner to a given coupling. Finally, we use the CIS as a prior in two complementary ML strategies: (i) a baseline Random Forest (RF), and (ii) a neural network (NN) whose input-layer attention weights are regularised towards CIS-based priors. Using a real-world testbed with four interacting eco-hydrological variables and a target, we show that CIS-regularised NNs can closely align their learned feature usage with the physically motivated causal ranking while retaining competitive predictive skill. This provides a concrete example of causally informed prediction, in which causal diagnostics do not merely interpret an already-trained black box but actively shape the hypothesis space explored by the model, offering a principled handle on feature selection and dimensionality reduction.
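The idea of regularising input-layer attention weights towards CIS-based priors can be illustrated with a small sketch. The abstract does not specify the penalty, so everything here is an assumption made for illustration: the attention is taken to be a softmax over one logit per input feature, the CIS values are rescaled to sum to one, and the penalty is a KL divergence weighted by a hypothetical coefficient `lam`. The function names and the example scores are likewise hypothetical and are not taken from the paper.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of per-feature logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def normalise_prior(cis_scores, eps=1e-8):
    # Assumption: CIS values are treated as non-negative relevance scores
    # and rescaled so they sum to 1, giving a prior over input features.
    clipped = [max(s, 0.0) + eps for s in cis_scores]
    total = sum(clipped)
    return [c / total for c in clipped]

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) for two discrete distributions of equal length.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def regularised_loss(prediction_loss, attention_logits, cis_scores, lam=0.1):
    # Total loss = predictive loss + lam * KL(CIS prior || attention).
    # The KL form and the weight lam are illustrative choices only.
    attention = softmax(attention_logits)
    prior = normalise_prior(cis_scores)
    return prediction_loss + lam * kl_divergence(prior, attention)

# Hypothetical CIS values for four predictors (made up for this example).
cis = [0.50, 0.30, 0.15, 0.05]
uniform_logits = [0.0, 0.0, 0.0, 0.0]
aligned_logits = [math.log(p) for p in normalise_prior(cis)]

loss_uniform = regularised_loss(1.0, uniform_logits, cis)
loss_aligned = regularised_loss(1.0, aligned_logits, cis)
```

Under these assumptions the penalty vanishes when the attention matches the prior, so `loss_aligned` reduces to the predictive loss, while `loss_uniform` is larger because uniform attention ignores the causal ranking. In practice the same term would be added to the training objective of an autodiff framework so that gradients pull the attention logits towards the prior.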

Submission Number: 32
