(Conditional) independence tests

Fisher-z

Perform an independence test using the Fisher-z test [fisherz]. This test is optimal for linear-Gaussian data.

Parameters

data: data matrices.

X, Y and condition_set: data matrices of size number_of_samples * dimensionality.

correlation_matrix: the correlation matrix of the data; None means that no precomputed correlation matrix is supplied.

Returns

p: the p-value of the test.
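
As an illustration of the statistic behind this test, the following is a minimal sketch written with NumPy and SciPy only. It is not the library's implementation. The function name fisherz_sketch is hypothetical, and the sketch assumes that x, y and condition_set are column indices into data. It computes the partial correlation from the inverse correlation matrix, applies the Fisher z-transform, and compares the scaled statistic to a standard normal distribution.

```python
import numpy as np
from scipy.stats import norm


def fisherz_sketch(data, x, y, condition_set=()):
    """Hypothetical helper: p-value for column x independent of column y
    given the columns in condition_set (all are column indices)."""
    n = data.shape[0]
    idx = [x, y, *condition_set]
    corr = np.corrcoef(data[:, idx], rowvar=False)
    # Partial correlation of x and y from the inverse correlation matrix.
    prec = np.linalg.inv(corr)
    r = -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])
    r = np.clip(r, -0.999999, 0.999999)  # keep the log finite
    # Fisher z-transform, scaled so it is roughly N(0, 1) under independence.
    z = 0.5 * np.log((1 + r) / (1 - r))
    stat = np.sqrt(n - len(condition_set) - 3) * abs(z)
    return 2 * norm.sf(stat)  # two-sided p-value


# Synthetic example: x and y are both noisy copies of a common cause z.
rng = np.random.default_rng(0)
z = rng.normal(size=2000)
x = z + 0.5 * rng.normal(size=2000)
y = z + 0.5 * rng.normal(size=2000)
data = np.column_stack([x, y, z])

p_marginal = fisherz_sketch(data, 0, 1)           # x and y are dependent
p_conditional = fisherz_sketch(data, 0, 1, (2,))  # independent given z
```

In this example the marginal p-value is essentially zero, while the conditional p-value is typically not small because x and y are independent given z.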

fisherz

Fisher, Ronald A. “On the ‘probable error’ of a coefficient of correlation deduced from a small sample.” Metron 1 (1921): 1-32.

Missing-value Fisher-z test

Perform a testwise-deletion Fisher-z independence test on data sets with missing values. With testwise deletion, each test uses every data point that has no missing values for the variables involved in that test.

Parameters

mvdata: data with missing values.

X, Y and condition_set: data matrices of size number_of_samples * dimensionality.

Returns

p: the p-value of the test.
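
To make testwise deletion concrete, here is a minimal sketch using NumPy and SciPy only. It is not the library's implementation. The function name mv_fisherz_sketch is hypothetical, the sketch assumes that missing values are encoded as NaN and that x, y and condition_set are column indices, and it returns the number of rows used in addition to the p-value purely for illustration.

```python
import numpy as np
from scipy.stats import norm


def mv_fisherz_sketch(mvdata, x, y, condition_set=()):
    """Hypothetical helper: return (p_value, rows_used) for a Fisher-z test
    run on the rows that are complete in the columns involved."""
    sub = mvdata[:, [x, y, *condition_set]]
    # Testwise deletion: drop a row only if it has a NaN in one of the
    # columns this particular test looks at.
    complete = sub[~np.isnan(sub).any(axis=1)]
    n = complete.shape[0]
    prec = np.linalg.inv(np.corrcoef(complete, rowvar=False))
    r = -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])
    r = np.clip(r, -0.999999, 0.999999)
    z = 0.5 * np.log((1 + r) / (1 - r))
    p = 2 * norm.sf(np.sqrt(n - len(condition_set) - 3) * abs(z))
    return p, n


rng = np.random.default_rng(1)
mvdata = rng.normal(size=(500, 3))
mvdata[:200, 2] = np.nan  # column 2 is missing in the first 200 rows
mvdata[490:, 0] = np.nan  # column 0 is missing in the last 10 rows

# Columns 0 and 1 only: the NaNs in column 2 do not cost any rows.
p_xy, n_xy = mv_fisherz_sketch(mvdata, 0, 1)
# Conditioning on column 2 also removes the rows where it is missing.
p_xy_z, n_xy_z = mv_fisherz_sketch(mvdata, 0, 1, (2,))
```

The unconditional test keeps 490 rows and the conditional test keeps 290, which shows that the effective sample size depends on the variables involved in each test.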

Chi-Square test

Perform an independence test on discrete variables using the Chi-Square test.

Parameters

data: data matrices.

X, Y and condition_set: data matrices of size number_of_samples * dimensionality.

G_sq: True means using G-Square test; False means using Chi-Square test.

Returns

p: the p-value of the test.
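
The effect of the G_sq switch can be sketched for the unconditional case with SciPy's chi2_contingency. This is an illustration rather than the library's implementation: the function name chisq_sketch is hypothetical, it takes two 1-D arrays of discrete labels instead of the parameters listed above, and it does not handle a conditioning set.

```python
import numpy as np
from scipy.stats import chi2_contingency


def chisq_sketch(x, y, G_sq=False):
    """Hypothetical helper: unconditional independence test for two 1-D
    arrays of discrete labels."""
    # Map labels to 0..k-1 and count co-occurrences in a contingency table.
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y, return_inverse=True)
    table = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(table, (xi, yi), 1)
    # "pearson" gives the Chi-Square statistic, "log-likelihood" gives G-Square.
    lambda_ = "log-likelihood" if G_sq else "pearson"
    stat, p, dof, expected = chi2_contingency(table, correction=False, lambda_=lambda_)
    return p


# Synthetic example: y copies x except for a random 20% of the samples.
rng = np.random.default_rng(2)
x = rng.integers(0, 3, size=1000)
y = np.where(rng.random(1000) < 0.2, rng.integers(0, 3, size=1000), x)

p_chi = chisq_sketch(x, y, G_sq=False)
p_g = chisq_sketch(x, y, G_sq=True)
```

Both statistics are compared to the same chi-square distribution, so on a sample of this size they usually lead to the same decision.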

Kernel-based conditional independence (KCI) test and independence test

Perform a kernel-based conditional independence (KCI) test or independence test [KCI]. The test checks whether X and Y are independent, either conditionally on Z or unconditionally. For unconditional independence tests, Z is set to the empty set.

Parameters

X, Y and Z: data matrices of size number_of_samples * dimensionality. Z could be None.

KernelX/Y/Z: one of [‘GaussianKernel’, ‘LinearKernel’, ‘PolynomialKernel’]. (For ‘PolynomialKernel’, the default degree is 2. Currently, users can change it by setting the ‘degree’ of ‘class PolynomialKernel()’.)

Returns

p_val: the p-value of the test.
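
The idea behind the unconditional case can be sketched with NumPy alone. Note that this sketch swaps in a different null distribution: it uses the trace of the product of centered Gaussian kernel matrices as the statistic, but obtains the p-value by permutation rather than by the approximation described in [KCI], and it does not cover the conditional case. The names gaussian_gram and hsic_permutation_sketch, the median-heuristic bandwidth, and the n_perm and seed parameters are all assumptions of the sketch.

```python
import numpy as np


def gaussian_gram(a):
    """Gaussian-kernel Gram matrix with a median-heuristic bandwidth."""
    sq = np.sum(a**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * a @ a.T, 0.0)
    sigma2 = np.median(d2[d2 > 0])
    return np.exp(-d2 / (2 * sigma2))


def hsic_permutation_sketch(X, Y, n_perm=200, seed=0):
    """Hypothetical helper: permutation p-value for X independent of Y,
    where X and Y have shape (number_of_samples, dimensionality)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    H = np.eye(n) - 1.0 / n  # centering matrix
    Kx = H @ gaussian_gram(X) @ H
    Ky = H @ gaussian_gram(Y) @ H
    # Statistic: trace(Kx @ Ky) / n, computed elementwise since both are symmetric.
    stat = np.sum(Kx * Ky) / n
    null = np.empty(n_perm)
    for b in range(n_perm):
        # Shuffling the samples of Y breaks any dependence on X.
        perm = rng.permutation(n)
        null[b] = np.sum(Kx * Ky[np.ix_(perm, perm)]) / n
    return (1 + np.sum(null >= stat)) / (1 + n_perm)


data_rng = np.random.default_rng(3)
X = data_rng.normal(size=(200, 1))
Y_dep = X**2 + 0.1 * data_rng.normal(size=(200, 1))  # nonlinear dependence
Y_ind = data_rng.normal(size=(200, 1))               # independent of X

p_dep = hsic_permutation_sketch(X, Y_dep)
p_ind = hsic_permutation_sketch(X, Y_ind)
```

The dependent pair has zero linear correlation in expectation, so a test based on correlation alone would tend to miss it, while the kernel statistic detects it.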

KCI

Zhang, Kun, et al. “Kernel-based conditional independence test and application in causal discovery.” UAI. 2011.

G-Square test

Perform an independence test using the G-Square test [gSquare]. This test is based on the log-likelihood ratio test.

Parameters

data: data matrices.

X, Y and condition_set: data matrices of size number_of_samples * dimensionality.

G_sq: True means using G-Square test; False means using Chi-Square test.

Returns

p: the p-value of the test.
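
The log-likelihood ratio statistic can be written out by hand for the conditional case. The following is a minimal sketch, not the library's implementation. The function name g_square_sketch is hypothetical, the sketch assumes that data holds integer category codes starting at 0 and that x, y and condition_set are column indices, and it uses the simple degrees-of-freedom count (|X| - 1)(|Y| - 1) per observed stratum with no adjustment for empty cells.

```python
import numpy as np
from scipy.stats import chi2


def g_square_sketch(data, x, y, condition_set=()):
    """Hypothetical helper: G-Square p-value for column x independent of
    column y given the columns in condition_set."""
    if condition_set:
        # One stratum per observed configuration of the conditioning columns.
        _, stratum = np.unique(data[:, list(condition_set)], axis=0, return_inverse=True)
        stratum = stratum.ravel()
    else:
        stratum = np.zeros(len(data), dtype=int)
    nx = data[:, x].max() + 1
    ny = data[:, y].max() + 1
    g_stat, dof = 0.0, 0
    for s in np.unique(stratum):
        rows = data[stratum == s]
        obs = np.zeros((nx, ny))
        np.add.at(obs, (rows[:, x], rows[:, y]), 1)
        expected = np.outer(obs.sum(axis=1), obs.sum(axis=0)) / obs.sum()
        mask = obs > 0  # empty cells contribute 0 to the sum
        g_stat += 2.0 * np.sum(obs[mask] * np.log(obs[mask] / expected[mask]))
        dof += (nx - 1) * (ny - 1)
    return chi2.sf(g_stat, dof)


# Synthetic example: x and y are noisy copies of a binary common cause z.
rng = np.random.default_rng(4)
n = 2000
z = rng.integers(0, 2, size=n)
x = np.where(rng.random(n) < 0.2, 1 - z, z)
y = np.where(rng.random(n) < 0.2, 1 - z, z)
data = np.column_stack([x, y, z])

p_marginal = g_square_sketch(data, 0, 1)           # x and y are dependent
p_conditional = g_square_sketch(data, 0, 1, (2,))  # independent given z
```

Summing the statistic and the degrees of freedom over strata is what turns the unconditional test into a conditional one.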

gSquare

Tsamardinos, Ioannis, Laura E. Brown, and Constantin F. Aliferis. “The max-min hill-climbing Bayesian network structure learning algorithm.” Machine learning 65.1 (2006): 31-78.