Improving Generalization in ML Models via Causal Interaction Constraints

TMLR Paper 5289 Authors

04 Jul 2025 (modified: 18 Jul 2025) · Under review for TMLR · CC BY 4.0
Abstract: Machine learning models are effective at identifying patterns within independently and identically distributed (i.i.d.) data. However, this assumption rarely holds in real-world applications, where violations of the i.i.d. assumption can hinder both generalization and explainability. Causal Machine Learning is an emerging discipline that addresses these limitations by integrating causal reasoning, an element typically absent from conventional approaches. In this work, we introduce a causal machine learning strategy that emphasizes the role of spurious variable interactions, a concept grounded in the Independent Causal Mechanisms (ICM) principle. We argue that recognizing and constraining these spurious interactions is essential for improving model robustness and interpretability. To this end, we propose an approach for incorporating interaction restrictions into neural network architectures and tree-based models. Applied to real-world scenarios, our experiments show that predictive models explicitly constrained to avoid spurious interactions achieve better generalization across diverse domains, outperforming their unconstrained counterparts.
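To make the idea of interaction restrictions concrete, the sketch below shows one way such a constraint can be imposed on a tree-based model, using XGBoost's `interaction_constraints` parameter on synthetic data. This is only an illustrative example under assumed data and constraint groupings; it is not the paper's actual formulation, and the feature setup is hypothetical.

```python
# Minimal sketch (not the authors' method): restricting which features may
# interact in a tree ensemble via XGBoost's `interaction_constraints`.
# The data-generating process and constraint grouping here are hypothetical.
import numpy as np
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
n = 2000

# x0, x1: causal features; x2: a spurious proxy correlated with x0
x0 = rng.normal(size=n)
x1 = rng.normal(size=n)
x2 = 0.8 * x0 + rng.normal(scale=0.5, size=n)
y = x0 * x1 + rng.normal(scale=0.1, size=n)  # target depends only on x0 * x1

X = np.column_stack([x0, x1, x2])

# Allow x0 and x1 to interact with each other, but isolate the spurious x2
# so the trees cannot build splits that combine it with the causal features.
model = XGBRegressor(
    n_estimators=200,
    max_depth=3,
    interaction_constraints=[[0, 1], [2]],
)
model.fit(X, y)
```

A comparable constraint for neural networks could be obtained architecturally, for example by routing the spurious feature through a separate additive subnetwork so it never mixes with the causal features in shared hidden layers; the paper's own mechanism may differ.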
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Shahin_Jabbari1
Submission Number: 5289