# counterfactual-policy-mean-embedding

Counterfactual Policy Mean Embeddings (CPME)

This repository contains the code for the experiments in *Doubly Robust Estimation of Counterfactual Policy Mean Embeddings*. It is organized into three experimental tracks: **testing**, **sampling**, and **off-policy evaluation (OPE)**.

## Structure

Each track has its own folder, which typically contains:
- Python scripts implementing CPME estimators and baseline methods.
- Jupyter notebooks to launch experiments and visualize results.
- Supporting modules for environment setup, policies, and plotting.

---

### Testing

This track evaluates the ability of CPME-based tests to detect differences in counterfactual outcome distributions.

| File                     | Description                                                                 |
|--------------------------|-----------------------------------------------------------------------------|
| `environment.py`         | Simulates logging and target policies under four scenarios, following [2].  |
| `dr_kpt.py`              | Implements the doubly robust kernel policy test (DR-KPT).                   |
| `kpt.py`                 | Adaptation of the kernel policy test (KPT) with permutation-based testing.  |
| `experiments.ipynb`      | Runs and visualizes hypothesis testing results.                             |
| `runtime_tables.py`      | Generates runtime comparison tables across methods.                         |
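
For readers unfamiliar with permutation-based kernel tests, the following is a minimal, generic sketch of a two-sample test that uses the (biased) squared MMD as its statistic. It is not the code in `kpt.py` or `dr_kpt.py`: the function names, the RBF kernel, and the fixed bandwidth are all assumptions made for illustration, and the sketch ignores the policy reweighting that the actual tests rely on.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian RBF Gram matrix between the rows of a and b."""
    a = np.asarray(a, dtype=float).reshape(len(a), -1)
    b = np.asarray(b, dtype=float).reshape(len(b), -1)
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def mmd2_biased(gram, n_first):
    """Biased squared MMD from a pooled Gram matrix (first n_first rows = sample 1)."""
    k_xx = gram[:n_first, :n_first]
    k_yy = gram[n_first:, n_first:]
    k_xy = gram[:n_first, n_first:]
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()


def permutation_mmd_test(x, y, n_permutations=200, bandwidth=1.0, seed=0):
    """Return the observed statistic and a permutation p-value (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([np.asarray(x, dtype=float), np.asarray(y, dtype=float)])
    gram = rbf_kernel(pooled, pooled, bandwidth)
    n_first = len(x)
    observed = mmd2_biased(gram, n_first)

    exceed = 0
    for _ in range(n_permutations):
        # Shuffling the pooled indices simulates the null of equal distributions.
        perm = rng.permutation(len(pooled))
        permuted = gram[np.ix_(perm, perm)]
        if mmd2_biased(permuted, n_first) >= observed:
            exceed += 1

    # The +1 correction keeps the p-value strictly positive and valid.
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(0.0, 1.0, size=100)
    y = rng.normal(3.0, 1.0, size=100)
    stat, p = permutation_mmd_test(x, y)
    print(f"MMD^2 = {stat:.4f}, p-value = {p:.4f}")
```

The Gram matrix is computed once and only re-indexed inside the loop, so each permutation costs a few matrix means rather than a fresh kernel evaluation.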

---

### Sampling

This track assesses the quality of counterfactual samples generated by kernel herding from plug-in and doubly robust CPME estimators.

| File                     | Description                                                                 |
|--------------------------|-----------------------------------------------------------------------------|
| `environment.py`         | Simulates logging/target policies and synthetic outcomes.                   |
| `embeddings.py`          | Contains plug-in and DR estimators of CPME.                                 |
| `experiment.py`          | Runs sampling experiments and saves results.                                |
| `analyze_results.py`     | Computes distance metrics (MMD, Wasserstein) between herded and true samples.|
| `plots.py`               | Visualizes the sampled outcome distributions.                               |
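
As a rough illustration of how kernel herding draws samples from an estimated embedding, the sketch below greedily selects points from a finite candidate set. It assumes the embedding takes the form of a weighted sum of kernel functions centred at observed outcomes; the function names, the RBF kernel, and the candidate grid are hypothetical and do not mirror the interfaces in `embeddings.py` or `experiment.py`.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian RBF Gram matrix between the rows of a and b."""
    a = np.asarray(a, dtype=float).reshape(len(a), -1)
    b = np.asarray(b, dtype=float).reshape(len(b), -1)
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def kernel_herding(support, weights, candidates, n_samples, bandwidth=1.0):
    """Greedy herding from the embedding sum_i weights[i] * k(support[i], .).

    Hypothetical helper: samples are restricted to the rows of `candidates`.
    """
    candidates = np.asarray(candidates, dtype=float).reshape(len(candidates), -1)
    weights = np.asarray(weights, dtype=float)

    # Embedding evaluated at every candidate point.
    mu = rbf_kernel(candidates, support, bandwidth) @ weights
    gram_cc = rbf_kernel(candidates, candidates, bandwidth)

    # Running sum of k(y_s, .) over the points selected so far.
    running = np.zeros(len(candidates))
    chosen = []
    for t in range(n_samples):
        # Herding objective: mu(y) - (1 / (t + 1)) * sum_{s <= t} k(y_s, y).
        scores = mu - running / (t + 1)
        idx = int(np.argmax(scores))
        chosen.append(idx)
        running += gram_cc[idx]
    return candidates[chosen]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    support = rng.normal(size=(200, 1))
    weights = np.full(200, 1.0 / 200)  # uniform weights, for illustration only
    grid = np.linspace(-4.0, 4.0, 201).reshape(-1, 1)
    samples = kernel_herding(support, weights, grid, n_samples=50)
    print(samples[:5].ravel())
```

The weights need not be uniform or even non-negative, so the same loop applies to any estimator that can be written as a weighted sum of kernel functions.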

---

### Policy Evaluation

The `policy_evaluation` folder reproduces off-policy evaluation experiments adapted from [1]. Five experimental scenarios are considered, varying:

- Number of observations ($n$)
- Number of recommended items ($K$)
- Number of users ($N$)
- Context dimension ($d$)
- Policy similarity ($\alpha$)

Scripts are organized per scenario (e.g., `OPE_n_observation_experiments_100.py`) and generate CSV results under the `Results` folder. The notebook `Plot_Simulation_Results_Known_Propensities.ipynb` visualizes the outputs, and final plots appear in the `Figures` folder.

| File                     | Description                                                                 |
|--------------------------|-----------------------------------------------------------------------------|
| `Estimator_CPME.py`      | Implements IPS, Direct method (NN), DR-NN, CPME, and DR-CPME.               |
| `Environment.py`         | Defines the synthetic reward environment.                                   |
| `Policy.py`              | Encodes logging and target policy logic.                                    |
| `ParameterSelector.py`   | Cross-validation routine to select optimal hyperparameters [1].             |
| `visualization_utils.py` | Utilities for producing the evaluation plots.                               |
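
To make two of the baselines concrete, here is a minimal sketch of the textbook inverse propensity scoring (IPS) and doubly robust (DR) value estimators with known propensities. It is not taken from `Estimator_CPME.py`: the function names and argument layout are assumptions, and the reward model predictions (`q_logged`, `q_target`) are assumed to come from some externally fitted regressor.

```python
import numpy as np


def ips_estimate(rewards, target_probs, logging_probs):
    """IPS: average of importance-weighted rewards."""
    weights = np.asarray(target_probs) / np.asarray(logging_probs)
    return float(np.mean(weights * np.asarray(rewards)))


def dr_estimate(rewards, target_probs, logging_probs, q_logged, q_target):
    """DR: model-based value plus an importance-weighted residual correction.

    q_logged: reward model prediction at the logged action.
    q_target: expected reward model prediction under the target policy.
    """
    weights = np.asarray(target_probs) / np.asarray(logging_probs)
    residuals = np.asarray(rewards) - np.asarray(q_logged)
    return float(np.mean(np.asarray(q_target) + weights * residuals))


if __name__ == "__main__":
    # Toy two-action bandit: uniform logging policy, reward equals the action.
    rng = np.random.default_rng(0)
    n = 20_000
    actions = rng.integers(0, 2, size=n)
    rewards = actions.astype(float)
    logging_probs = np.full(n, 0.5)
    target_probs = np.where(actions == 1, 0.8, 0.2)  # target prefers action 1

    print("IPS:", ips_estimate(rewards, target_probs, logging_probs))
    print("DR :", dr_estimate(rewards, target_probs, logging_probs,
                              q_logged=rewards, q_target=np.full(n, 0.8)))
```

In this toy setup the true target value is 0.8. With a perfect reward model the DR residuals vanish, which shows why DR can have lower variance than IPS when the model is accurate.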

---

## References

[1] Krikamol Muandet, Motonobu Kanagawa, Sorawit Saengkyongam, and Sanparith Marukatat.  
*Counterfactual Mean Embeddings*. Journal of Machine Learning Research, 22(162):1–71, 2021.  
https://jmlr.org/papers/volume22/20-1296/20-1296.pdf

[2] Diego Martinez Taboada, Aaditya Ramdas, and Edward Kennedy.  
*An Efficient Doubly-Robust Test for the Kernel Treatment Effect*.  
Advances in Neural Information Processing Systems (NeurIPS), 36:59924–59952, 2023.  
