# CausalProfiler Experiments

This repository contains the code to reproduce all experiments from the paper "CausalProfiler: Generating Synthetic Benchmarks for Rigorous and Transparent Evaluation of Causal ML".

## Repository Structure

```
├── spaces/                    # Spaces of interest in .yml format
├── eval_scripts/              # Scripts for running experiments
├── copied_experiment_results/ # Directory for copying experiment results and visualizing them
├── setup.sh                   # Environment setup script
├── DCM/                       # DCM implementation
├── NCM/                       # NCM implementation
├── VACA/                      # VACA implementation
├── CausalNF/                  # Causal Normalizing Flows implementation
├── DeCaFlow/                  # DeCaFlow implementation
```

## 1. Environment Setup

The `setup.sh` script sets up each causal inference method used in our paper (VACA, NCM, DCM, CausalNF, DeCaFlow). It creates a separate conda environment for each codebase and installs `causal-profiler` in each.

```bash
# Run the setup script
bash setup.sh
```

**Notes:**

- The setup script is a convenience utility that combines the setup instructions from all the repositories used in our experiments (NCM, DCM, VACA, CausalNF, and DeCaFlow). If you encounter setup issues, consult each project's own README (e.g., `DCM/README.md`, `NCM/README.md`) or follow its setup instructions directly.
- The script installs causal-profiler with `cd ../causal-profiler && pip install -e .`, so it expects the `causal-profiler` repository to be checked out next to this one.
- Mac users may need additional setup steps for VACA. Please refer to `VACA/README.md`.
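
After the script finishes, it can help to confirm that the per-method environments exist. The sketch below only lists whatever conda environments are present. The environment names are defined in `setup.sh` and are not assumed here.

```bash
# List conda environments so you can confirm setup.sh created one per method.
# The environment names come from setup.sh; none are assumed here.
list_envs() {
    if command -v conda >/dev/null 2>&1; then
        conda env list
    else
        echo "conda not found on PATH; install or initialize conda first"
    fi
}

list_envs
```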

## 2. Running Experiments

Each causal inference method has an `evaluate.py` script that can be used to run experiments:

- `DCM/evaluation/evaluate.py`
- `NCM/evaluation/evaluate.py`
- `VACA/evaluation/evaluate.py`
- `CausalNF/evaluation/evaluate.py`
- `DeCaFlow/evaluation/evaluate.py`

### Example Usage

```bash
python DCM/evaluation/evaluate.py --config ./spaces/more_general_spaces.yml --output_dir copied_experiment_results/dcm_more_general_100 --num_runs 100 --num_tries 1
```
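
To run the same space with every method, the example command can be generated in a loop. This is a sketch, not part of the repository: it only prints the commands (a dry run) and assumes that every `evaluate.py` accepts the same flags as the DCM example. The output-directory prefixes (`dcm`, `ncm`, `vaca`, `cnf`, `decaflow`) follow the result names used elsewhere in this README. Each printed command still has to be run inside the conda environment that `setup.sh` created for that method.

```bash
#!/usr/bin/env bash
# Dry run: print one evaluate.py command per method for a given space file.
# Assumption: every evaluate.py accepts the same flags as the DCM example.
CONFIG="./spaces/more_general_spaces.yml"
NUM_RUNS=100

print_eval_commands() {
    local pair method prefix
    for pair in DCM:dcm NCM:ncm VACA:vaca CausalNF:cnf DeCaFlow:decaflow; do
        method="${pair%%:*}"   # directory name, e.g. DCM
        prefix="${pair##*:}"   # result-name prefix, e.g. dcm
        echo "python ${method}/evaluation/evaluate.py --config ${CONFIG}" \
             "--output_dir copied_experiment_results/${prefix}_more_general_${NUM_RUNS}" \
             "--num_runs ${NUM_RUNS} --num_tries 1"
    done
}

print_eval_commands
```

Printing instead of executing keeps the sketch independent of how the environments are named. Once the commands look right, they can be pasted into the matching environment one at a time.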

### Running the Experiments of the Paper

#### Experiment 1: General Spaces

```bash
nohup bash eval_scripts/run_general.sh > logs/general.log 2>&1 &
```
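
The launch commands in this section redirect output into `logs/`. The repository structure above does not list that directory, and the shell fails the redirection before the script starts if it is missing, so it may need to be created first. A small sketch for preparing the directory and checking on a background run (the `tail` call is just one way to inspect the log):

```bash
# Create the log directory first; `> logs/general.log` fails if it does not exist.
mkdir -p logs

# After launching a run with nohup, peek at the most recent output.
# The file only exists once a run has been started, so guard the call.
if [ -f logs/general.log ]; then
    tail -n 20 logs/general.log
else
    echo "logs/general.log not found yet; has the run been started?"
fi
```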

#### Experiment 2: Discrete Spaces

```bash
nohup bash eval_scripts/run_discrete_random_ctfte.sh > logs/discrete_random_ctfte.log 2>&1 &
nohup bash eval_scripts/run_discrete_ctfte.sh > logs/discrete_ctfte.log 2>&1 &
```

#### Experiment 3: Hidden Confounders

```bash
nohup bash eval_scripts/run_hidden_confounders.sh > logs/hidden_confounders.log 2>&1 &
```

**Note:** For smaller experiments, you can create your own space definition files in the `spaces/` directory and pass them to `evaluate.py` with `--config`.

## 3. Visualizing Results

After running experiments, results are saved in the `evaluation/results` directories of each respective method. You can locate these with:

```bash
find . -type d -path "*/evaluation/results*"
```

Example output:

```
./CausalNF/evaluation/results/cnf_more_general_100
./DCM/evaluation/results/dcm_more_general_100
./NCM/evaluation/results/ncm_more_general_100
./VACA/evaluation/results/vaca_more_general_100
```
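
Before summarizing, it can save time to check that every results directory a command expects is actually present. The helper below is a hypothetical convenience function, not part of the repository. It reports each path as found or missing and returns a non-zero status if any is missing.

```bash
# Hypothetical helper: verify that result directories exist before summarizing.
check_results() {
    local missing=0 dir
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            echo "found:   $dir"
        else
            echo "missing: $dir"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]   # exit status: 0 only if nothing is missing
}

# Example: the four directories from the sample output above
# (paths are relative to the repository root).
check_results \
    CausalNF/evaluation/results/cnf_more_general_100 \
    DCM/evaluation/results/dcm_more_general_100 \
    NCM/evaluation/results/ncm_more_general_100 \
    VACA/evaluation/results/vaca_more_general_100 \
    || echo "Some result directories are missing; rerun the corresponding experiments."
```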

### Generating Paper Figures and Tables

Use the `summarize_results.py` script in the `copied_experiment_results/` directory. The commands below use paths relative to that directory, so run them from inside it:

#### Figure 1 (Top) & Table 1 (Top Half)

```bash
python summarize_results.py \
    ../DCM/evaluation/results/dcm_general_100 \
    ../NCM/evaluation/results/ncm_general_100 \
    ../VACA/evaluation/results/vaca_general_100 \
    ../CausalNF/evaluation/results/cnf_general_100 \
    --output_dir results/expe1 \
    --SoIs general3 general7 \
    --space_names "general3:Linear-Medium" "general7:NN-Medium"
```

#### Figure 1 (Bottom) & Table 1 (Bottom Half)

```bash
python summarize_results.py \
    ../DCM/evaluation/results/dcm_general_100 \
    ../NCM/evaluation/results/ncm_general_100 \
    ../VACA/evaluation/results/vaca_general_100 \
    ../CausalNF/evaluation/results/cnf_general_100 \
    --output_dir results/expe2 \
    --SoIs general8 general8_50data \
    --space_names "general8:NN-Large" "general8_50data:NN-Large-LowData" "general8_100data:DROP"
```

#### Table 2

```bash
python summarize_results.py \
    ../DCM/evaluation/results/dcm_discrete_random_100 \
    ../CausalNF/evaluation/results/cnf_discrete_random_100 \
    --output_dir results/discrete_spaces_random
```

#### Figure 8, Table 8, Table 11

```bash
python summarize_results.py \
    ../DCM/evaluation/results/dcm_general_100 \
    ../CausalNF/evaluation/results/cnf_general_100 \
    ../DCM/evaluation/results/dcm_hidden_100 \
    ../CausalNF/evaluation/results/cnf_hidden_100 \
    ../DeCaFlow/evaluation/results/decaflow_hidden1 \
    ../DeCaFlow/evaluation/results/decaflow_hidden2 \
    --SoIs linear_60_hidden general2 \
    --output_dir results/expe_hidden \
    --space_names "general2:Linear-No-Hidden" "linear_60_hidden:Linear-60-Hidden"
```
