Testing Conditional Mean Independence Using Generative Neural Networks

Published: 01 May 2025, Last Modified: 18 Jun 2025
Venue: ICML 2025 poster
License: CC BY 4.0
TL;DR: We introduce a novel method for conditional mean independence testing that uses deep generative neural networks and performs strongly in high-dimensional settings.
Abstract: Conditional mean independence (CMI) testing is crucial for statistical tasks including model determination and variable importance evaluation. In this work, we introduce a novel population CMI measure and a bootstrap-based testing procedure that utilizes deep generative neural networks to estimate the conditional mean functions involved in the population measure. The test statistic is constructed so that even slowly decaying nonparametric estimation errors do not affect the asymptotic accuracy of the test. Our approach demonstrates strong empirical performance in scenarios with high-dimensional covariates and response variables, can handle multivariate responses, and maintains nontrivial power against local alternatives outside an $n^{-1/2}$ neighborhood of the null hypothesis. We also use numerical simulations and real-world imaging data applications to highlight the efficacy and versatility of our testing procedure.
Lay Summary: Conditional mean independence (CMI) testing is a fundamental tool for model simplification and for assessing variable importance. However, existing test procedures suffer severe performance deterioration in high-dimensional settings. We propose a new test procedure, based on a novel CMI measure and neural networks, that has strong empirical performance in scenarios with high-dimensional covariates and response variables. Our test can help improve model efficiency, accuracy, and interpretability in many machine learning applications.
Link To Code: https://github.com/LinjunHuang86749/Testing-CMI-Using-Generative-NN
Primary Area: General Machine Learning
Keywords: Conditional Distribution, Maximum Mean Discrepancy, Kernel Method, Double Robustness
Submission Number: 2944