A Sample Efficient Conditional Independence Test in the Presence of Discretization

Published: 01 May 2025, Last Modified: 18 Jun 2025, ICML 2025 poster, CC BY 4.0
TL;DR: A sample-efficient CI test for variables that are inherently continuous but for which only discretized observations are available.
Abstract: Conditional independence (CI) testing is a fundamental task in statistics. In many real-world scenarios, some variables are difficult to measure accurately, so the data are often recorded as discretized values. Applying CI tests directly to discretized data, however, can lead to incorrect conclusions about the independence of the underlying latent variables. To address this, recent work has sought to infer the correct CI relationship between the latent variables by binarizing the observed data. However, binarization discards information, which degrades the test's performance, particularly with small sample sizes. Motivated by this, this paper introduces a new sample-efficient CI test that does not rely on binarization. We find that the relationship can be established by solving an over-identifying restriction problem with the Generalized Method of Moments (GMM). Based on this finding, we design a new test statistic and derive its asymptotic distribution. Empirical results across various datasets show that our method consistently outperforms existing ones.
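To make the abstract's motivating claim concrete, the following minimal sketch (not taken from the paper or its code) simulates two variables that are independent given a continuous latent variable and compares the partial correlation when conditioning on the latent value against conditioning on a three-level discretization of it. The data-generating model, the cut points, and the helper name `partial_corr` are all illustrative assumptions.

```python
import numpy as np

def partial_corr(x, y, design):
    """Correlation of x and y after linearly regressing both on `design` (hypothetical helper)."""
    rx = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(rx, ry)[0, 1])

rng = np.random.default_rng(0)
n = 20_000

# Assumed toy model: X and Y depend on Z only, so X is independent of Y given Z.
z = rng.standard_normal(n)
x = z + rng.standard_normal(n)
y = z + rng.standard_normal(n)

# Condition on the latent continuous Z (intercept plus Z).
design_latent = np.column_stack([np.ones(n), z])

# Condition on a 3-level discretization of Z (one-hot indicators per level).
z_levels = np.digitize(z, [-0.43, 0.43])  # levels 0, 1, 2
design_discrete = np.eye(3)[z_levels]

r_latent = partial_corr(x, y, design_latent)
r_discrete = partial_corr(x, y, design_discrete)

print(f"partial correlation given latent Z:      {r_latent:.3f}")
print(f"partial correlation given discretized Z: {r_discrete:.3f}")
```

In this toy setup the variation of Z left within each discrete level still drives both X and Y, so a test that treats the discretized value as the conditioning variable would see residual dependence even though the latent variables are conditionally independent.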
Lay Summary: We propose DCT-GMM, a sample-efficient conditional independence (CI) test tailored for scenarios where inherently continuous variables are discretized due to measurement limitations. Unlike the original DCT method, which estimates the CI-related parameter by solving a single equation despite the availability of multiple moment conditions, DCT-GMM addresses this overidentification problem using the Generalized Method of Moments (GMM). This allows for more efficient estimation and valid statistical inference. We demonstrate both theoretically and empirically that DCT-GMM outperforms DCT in terms of accuracy and robustness.
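As a rough illustration of what an over-identified GMM problem looks like in general, the sketch below estimates a single parameter from two moment conditions with a standard two-step GMM and computes Hansen's J statistic. It is not the DCT-GMM statistic from the paper: the Poisson model, the moment functions, and the names `moments` and `gmm_fit` are assumptions chosen only to keep the example small.

```python
import math
import numpy as np

def moments(x, theta):
    """Per-sample moment functions for an assumed Poisson(theta) model: two conditions, one parameter."""
    return np.column_stack([x - theta, x**2 - theta - theta**2])

def gmm_fit(x, weight, theta, n_iter=50):
    """Minimise gbar' W gbar over theta using Gauss-Newton steps (hypothetical helper)."""
    for _ in range(n_iter):
        gbar = moments(x, theta).mean(axis=0)
        jac = np.array([-1.0, -(1.0 + 2.0 * theta)])  # derivative of gbar with respect to theta
        theta = theta - (jac @ weight @ gbar) / (jac @ weight @ jac)
    return float(theta)

rng = np.random.default_rng(0)
x = rng.poisson(lam=3.0, size=5000).astype(float)
n = x.size

# Step 1: identity weighting gives a consistent first estimate.
theta_1 = gmm_fit(x, np.eye(2), x.mean())

# Step 2: reweight by the inverse covariance of the moments at the first estimate.
weight = np.linalg.inv(np.cov(moments(x, theta_1), rowvar=False))
theta_hat = gmm_fit(x, weight, theta_1)

# Hansen's J statistic: asymptotically chi-square with (2 moments - 1 parameter) = 1 degree of freedom.
gbar = moments(x, theta_hat).mean(axis=0)
j_stat = max(float(n * gbar @ weight @ gbar), 0.0)
p_value = math.erfc(math.sqrt(j_stat / 2.0))  # chi-square(1) survival function

print(f"theta_hat = {theta_hat:.3f}, J = {j_stat:.3f}, p = {p_value:.3f}")
```

Using both moment conditions with an estimated weighting matrix, rather than solving just one of the equations, is the general mechanism by which GMM gains efficiency when more conditions than parameters are available.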
Link To Code: https://github.com/boyangaaaaa/DCT
Primary Area: General Machine Learning->Causality
Keywords: Conditional independence test, discretization, Generalized Method of Moments
Submission Number: 11752