# Doubly-Robust-Kernel-Proxy-Variable-Algorithm
This repository provides code for a doubly robust, kernel-based algorithm for Proxy Causal Learning (PCL) that avoids density ratio estimation. In our paper, we consider the PCL setting and introduce a doubly robust estimation procedure for the dose-response curve. All code is written in Python 3 and uses the JAX library for GPU-accelerated computation; all experiments can also be run with the CPU version of JAX.

The following figure illustrates a representative Directed Acyclic Graph (DAG) for the PCL setting. Yellow circles denote the observed variables: $A$ is the treatment, $Y$ is the outcome, $Z$ is the treatment proxy, and $W$ is the outcome proxy. The white circle denotes the unobserved confounding variable $U$. Bi-directional dotted arrows represent possible causal directions or shared ancestors.

![](./pcl_dag.png) 

All simulation code for the manuscript is organized under the folder Simulations. Each subfolder is named after the experiment it contains. For example, "Simulations/ATE_Simulations/DeanerExperiment" contains the grade retention experiments (Figure 2d in the paper). The Jupyter notebooks inside the folder "Simulations/AnalyzeSimulationResults" contain the corresponding analyses and plots.

## Example: Reproducing Figure 2(a)
To reproduce Figure 2(a) (Synthetic Low Dimensional Setting):

* Run the following scripts in Simulations/ATE_Simulations/SyntheticLowDim (alternatively, use a bash script to run them on a cluster):

    ``` python DoublyRobustKPV_SyntheticDataExperiment.py```

    ``` python DoublyRobustPMMR_SyntheticDataExperiment.py```

    ``` python AlternativeProxyKernel_SyntheticDataExperiment.py```

    ``` python KernelNegativeControl_SyntheticDataExperiment.py```

    ``` python KPV_SyntheticDataExperiment.py```

    ``` python PKDR_SyntheticDataExperiment.py```

    ``` python PKIPW_SyntheticDataExperiment.py```

    ``` python PMMR_SyntheticDataExperiment.py```

* These scripts will generate the required .pkl files under Simulations/Results.

* Visualize the results by running the Jupyter notebook "Simulations/AnalyzeSimulationResults/SyntheticLowDim_Experiment.ipynb".
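
The steps above can also be driven from a single Python script. The following is a minimal sketch, not part of the repository: the helper names `run_all` and `load_results` are hypothetical, the script names are copied from the list above, and the contents of the .pkl files are not assumed to have any particular structure. Pickle files should only be loaded from sources you trust, and unpickling results that contain JAX arrays may require JAX to be installed.

```python
import pickle
import subprocess
import sys
from pathlib import Path

# Script names copied from the list above.
SCRIPTS = [
    "DoublyRobustKPV_SyntheticDataExperiment.py",
    "DoublyRobustPMMR_SyntheticDataExperiment.py",
    "AlternativeProxyKernel_SyntheticDataExperiment.py",
    "KernelNegativeControl_SyntheticDataExperiment.py",
    "KPV_SyntheticDataExperiment.py",
    "PKDR_SyntheticDataExperiment.py",
    "PKIPW_SyntheticDataExperiment.py",
    "PMMR_SyntheticDataExperiment.py",
]


def run_all(script_dir, scripts=SCRIPTS, dry_run=False):
    """Run each script in turn from inside script_dir (hypothetical helper).

    Returns the list of commands. With dry_run=True nothing is executed.
    """
    commands = []
    for name in scripts:
        cmd = [sys.executable, name]
        commands.append(cmd)
        if not dry_run:
            # check=True stops at the first experiment that fails.
            subprocess.run(cmd, cwd=Path(script_dir), check=True)
    return commands


def load_results(results_dir):
    """Load every .pkl file under results_dir, keyed by its relative path."""
    results_dir = Path(results_dir)
    results = {}
    for path in sorted(results_dir.rglob("*.pkl")):
        with path.open("rb") as f:
            results[path.relative_to(results_dir).as_posix()] = pickle.load(f)
    return results


if __name__ == "__main__":
    # Print the commands without executing them; set dry_run=False to run.
    for cmd in run_all("Simulations/ATE_Simulations/SyntheticLowDim", dry_run=True):
        print(" ".join(cmd))
```

Running the scripts sequentially is the simplest option; on a cluster, each entry of `SCRIPTS` could instead be submitted as a separate job.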

# Python Version and Dependencies

* Python version: 3.8.18

* pip version: 24.0

* Required Python packages: listed in the requirements.txt file.
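
Because the experiments can run on either the GPU or the CPU build of JAX, it can help to confirm which one is active before launching long runs. The snippet below is a small sketch for that purpose; the helper name `describe_environment` is hypothetical and is not part of this repository.

```python
import sys


def describe_environment():
    """Return a short summary of the Python version and the JAX backend."""
    v = sys.version_info
    lines = [f"Python {v.major}.{v.minor}.{v.micro}"]
    try:
        import jax
    except ImportError:
        lines.append("JAX is not installed")
    else:
        # default_backend() reports the platform JAX will use, e.g. "cpu" or "gpu".
        lines.append(f"JAX {jax.__version__}, backend: {jax.default_backend()}")
        lines.append(f"Devices: {jax.devices()}")
    return "\n".join(lines)


print(describe_environment())
```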

## Script Contents in src folder
This folder contains the source code for each causal learning algorithm and the utility scripts used in the paper. The full list is as follows.

Python Script         |  Explanation
:--------------------:|:-------------------------:
causal_models/proxy_causal_learning.py            | Implements the following proxy causal learning algorithms: Kernel Alternative Proxy ATE [1], Kernel Proxy Variable [3], PMMR [3], Kernel Negative Control ATE [2]
causal_learning.py            | Implementation of Kernel-ATE [5]
utils/kernel_utils.py  | Kernel classes such as Gaussian and Linear kernels
utils/linalg_utils.py | Linear algebra utilities, including pairwise distance computation
utils/ml_utils.py | Data normalization utilities
utils/visualization_utils.py | Visualization utilities
other_methods | Must contain implementations of PKDR [6] algorithms
generate_experiment_data.py | Contains data-generating functions used in the experiments
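
To illustrate the kind of computation the kernel and linear algebra utilities listed above perform, here is a minimal NumPy sketch of pairwise squared distances and the Gaussian and linear kernels built on them. It is an illustration only and is not the code in `utils/kernel_utils.py` or `utils/linalg_utils.py`: the function names and the `lengthscale` parametrization are assumptions. The same operations are available in `jax.numpy`.

```python
import numpy as np


def pairwise_sq_dists(X, Y):
    """Squared Euclidean distances between rows of X (n, d) and Y (m, d)."""
    x_sq = np.sum(X**2, axis=1)[:, None]  # shape (n, 1)
    y_sq = np.sum(Y**2, axis=1)[None, :]  # shape (1, m)
    d2 = x_sq + y_sq - 2.0 * X @ Y.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.maximum(d2, 0.0)


def gaussian_kernel(X, Y, lengthscale=1.0):
    """Gram matrix k(x, y) = exp(-||x - y||^2 / (2 * lengthscale^2))."""
    return np.exp(-pairwise_sq_dists(X, Y) / (2.0 * lengthscale**2))


def linear_kernel(X, Y):
    """Gram matrix k(x, y) = <x, y>."""
    return X @ Y.T


if __name__ == "__main__":
    X = np.array([[0.0, 0.0], [3.0, 4.0]])
    print(pairwise_sq_dists(X, X))
    print(gaussian_kernel(X, X, lengthscale=5.0))
```

Expanding the squared norm as $\|x\|^2 + \|y\|^2 - 2\langle x, y\rangle$ avoids forming an explicit (n, m, d) difference array, which keeps memory use proportional to the size of the Gram matrix.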


**Note:** For the data and src/other_methods, please see: https://github.com/BariscanBozkurt/Doubly-Robust-Kernel-Proxy-Variable-Algorithm

# References
[1] Bozkurt, B., Deaner, B., Meunier, D., Xu, L., and Gretton, A. (2025). Density ratio-based proxy causal learning without density ratios. In The 28th International Conference on Artificial Intelligence and Statistics.

[2] Singh, R. (2023). Kernel methods for unobserved confounding: Negative controls, proxies, and instruments.

[3] Mastouri, A., Zhu, Y., Gultchin, L., Korba, A., Silva, R., Kusner, M. J., Gretton, A., and Muandet, K. (2021). Proximal causal learning with kernels: Two-stage estimation and moment restriction. In International Conference on Machine Learning.

[4] Wu, Y., Fu, Y., Wang, S., and Sun, X. (2024). Doubly robust proximal causal learning for continuous treatments. In International Conference on Learning Representations.

[5] Singh, R., Xu, L., and Gretton, A. (2023). Kernel methods for causal functions: dose, heterogeneous and incremental response curves. Biometrika, 111(2):497–516.

[6] https://github.com/yezichu/PCL_Continuous_Treatment

[7] https://github.com/liyuan9988/DeepFeatureProxyVariable
