Data-Driven Selection of Instrumental Variables for Additive Nonlinear, Constant Effects Models

Xichen Guo; Feng Xie; Yan Zeng; Hao Zhang; Zhi Geng

Published: 01 May 2025, Last Modified: 08 Aug 2025, Venue: ICML 2025 poster, License: CC BY 4.0

TL;DR: We propose a novel testable condition that is both necessary and sufficient for identifying valid instrumental variable sets in additive non-linear, constant effects models under milder assumptions.

Abstract: We consider the problem of selecting instrumental variables (IVs) from observational data, a fundamental challenge in causal inference. Existing methods mostly focus on additive linear, constant effects models, which limits their applicability in complex real-world scenarios. In this paper, we tackle a more general and challenging setting: the additive non-linear, constant effects model. We first propose a novel testable condition, termed the Cross Auxiliary-based independent Test (CAT) condition, for selecting the valid IV set. We show that this condition is both necessary and sufficient for identifying valid IV sets within such a model under milder assumptions. Building on this condition, we develop a practical algorithm for selecting the set of valid IVs. Extensive experiments on synthetic data and two real-world datasets demonstrate the effectiveness and robustness of our proposed approach, highlighting its potential for broader applications in causal analysis.

Primary Area: General Machine Learning->Causality

Keywords: Instrumental Variable; Testability; Causal Effect; Unmeasured Confounders; Causal Graphical Models

Submission Number: 4822
