Improving Generalization in ML Models via Causal Interaction Constraints

TMLR Paper 5289 Authors

04 Jul 2025 (modified: 18 Jul 2025) · Under review for TMLR · CC BY 4.0
Abstract: Machine learning models are effective at identifying patterns within independently and identically distributed (i.i.d.) data. However, this assumption rarely holds in real-world applications, where violations of the i.i.d. assumption can hinder both generalization and explainability. Causal Machine Learning is an emerging discipline that addresses these limitations by integrating causal reasoning, an element typically absent from conventional approaches. In this work, we introduce a causal machine learning strategy that emphasizes the role of spurious variable interactions, a concept grounded in the Independent Causal Mechanisms (ICM) principle. We argue that recognizing and constraining these spurious interactions is essential for improving model robustness and interpretability. To this end, we propose an approach for incorporating interaction restrictions into neural network architectures and tree-based models. Applied to real-world scenarios, our experiments show that predictive models explicitly constrained to avoid spurious interactions achieve better generalization across diverse domains, outperforming their unconstrained counterparts.
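To make the idea of interaction restrictions concrete, the sketch below shows one way such a constraint can be imposed on a tree-based model, using XGBoost's `interaction_constraints` parameter on synthetic data. This is only an illustrative example under assumed data and constraint groupings; it is not the paper's actual formulation, and the feature setup is hypothetical.

```python
# Minimal sketch (not the authors' method): restricting which features may
# interact in a tree ensemble via XGBoost's `interaction_constraints`.
# The data-generating process and constraint grouping here are hypothetical.
import numpy as np
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
n = 2000

# x0, x1: causal features; x2: a spurious proxy correlated with x0
x0 = rng.normal(size=n)
x1 = rng.normal(size=n)
x2 = 0.8 * x0 + rng.normal(scale=0.5, size=n)
y = x0 * x1 + rng.normal(scale=0.1, size=n)  # target depends only on x0 * x1

X = np.column_stack([x0, x1, x2])

# Allow x0 and x1 to interact with each other, but isolate the spurious x2
# so the trees cannot build splits that combine it with the causal features.
model = XGBRegressor(
    n_estimators=200,
    max_depth=3,
    interaction_constraints=[[0, 1], [2]],
)
model.fit(X, y)
```

A comparable constraint for neural networks could be obtained architecturally, for example by routing the spurious feature through a separate additive subnetwork so it never mixes with the causal features in shared hidden layers; the paper's own mechanism may differ.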
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Shahin_Jabbari1
Submission Number: 5289