
# Code for "Causal discovery in the presence of endogenous context variables"
Note: this code is for reviewing purposes only. An official version will be released upon acceptance.

## Description

This is the code for the simulation study of the paper, including all algorithms that were used in the study: PC-AR, PC-B, PC-P, PC-M, JCI-FCI and CD-NOD. For further information and references, please consult the main paper.

## Getting started

### Dependencies 

To install the dependencies required for this setup, use the ```requirements.txt``` file:

```pip install -r requirements.txt```

## Running the simulation study

### Getting experimental results
There are two main files for running the simulation study, both located in the folder ```endogeneous_context_simulation_study```:

1. ```compute_regime_graphs_timeseries.py``` runs the experiments for M-PCMCI, P-PCMCI, B-PCMCI, PAC-PCMCI, SAC-PCMCI.
2. ```compute_regime_graphs_timeseries_naive.py``` runs the experiments for R-PCMCI.

All other code files necessary for the experimental results can also be found there. 

We use YAML files to store the experimental setup configurations. Make sure you change ```save_folder``` in the YAML file according to your needs. The YAML files that have been used to generate the results in the paper can be found in the folder ```update_configs```.
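Changing ```save_folder``` for many configuration files can be scripted. The following is a minimal sketch, not part of the repository: it assumes that ```save_folder``` appears as a top-level key on its own line, and the key ```n_samples``` and both paths are made up for illustration.

```python
import re


def set_save_folder(yaml_text: str, new_folder: str) -> str:
    """Return yaml_text with the value of the top-level save_folder key replaced.

    Assumption: the key is written as `save_folder: ...` at the start of a line.
    """
    pattern = re.compile(r"^save_folder:.*$", flags=re.MULTILINE)
    if not pattern.search(yaml_text):
        raise KeyError("no top-level 'save_folder' key found")
    # A function replacement avoids backslash handling in Windows-style paths.
    return pattern.sub(lambda _match: f"save_folder: {new_folder}", yaml_text, count=1)


# Hypothetical configuration content, for illustration only.
example = "n_samples: 500\nsave_folder: /old/path\n"
updated = set_save_folder(example, "/scratch/my_results")
print(updated)
```

Editing the text directly keeps comments and key order in the file intact; if the configuration nests ```save_folder``` under another key, this pattern would not match and the function raises an error instead of silently doing nothing.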

As described in the main paper, we ran our experiments on a cluster. To run these experiments using Slurm, use the following scripts:
1. ```python submit_tasks.py PATH_TO_YAML``` for starting jobs with the experiments for M-PCMCI, P-PCMCI, B-PCMCI, PAC-PCMCI, SAC-PCMCI.
2. ```python submit_tasks_naive.py PATH_TO_YAML``` for starting jobs with the experiments for R-PCMCI.

All YAML configuration files used for generating results can be found in the ```timeseries_final_configs``` folder.

Before starting, make sure you change the account name and other parameters in the corresponding submit scripts! Also, make sure ```sequential = False``` is set in ```compute_regime_graphs_timeseries.py``` and ```compute_regime_graphs_timeseries_naive.py```.

If you would like to run the experiments in sequential mode, follow these three steps:
1. Set ```sequential = True```.
2. Uncomment the main part in ```compute_regime_graphs_timeseries.py``` and ```compute_regime_graphs_timeseries_naive.py```.
3. Run the scripts as follows:
	* ```python compute_regime_graphs_timeseries.py PATH_TO_YAML```
	* ```python compute_regime_graphs_timeseries_naive.py PATH_TO_YAML```
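When several configurations should be run in sequential mode, the two commands above can be wrapped in a small driver. This is a sketch only, not part of the repository: the helper names ```build_commands``` and ```run_all``` are hypothetical, and it assumes it is started from the folder that contains the two scripts, with ```sequential = True``` already set in both.

```python
import subprocess
import sys

# The two entry points named in this README.
SCRIPTS = [
    "compute_regime_graphs_timeseries.py",
    "compute_regime_graphs_timeseries_naive.py",
]


def build_commands(yaml_paths):
    """Return one command (as an argument list) per script and YAML file."""
    return [
        [sys.executable, script, str(path)]
        for path in yaml_paths
        for script in SCRIPTS
    ]


def run_all(yaml_paths, dry_run=True):
    """Print each command; execute it only if dry_run is False."""
    for cmd in build_commands(yaml_paths):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing experiment.
            subprocess.run(cmd, check=True)


# Dry run with the placeholder from above; replace it with real YAML paths.
run_all(["PATH_TO_YAML"])
```

The dry run is the default so that the command list can be inspected before any long-running experiment is started.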
 
### Obtaining the metrics for the experiments
The results of the experiments are evaluated using the ```metrics.py``` script. To generate the metrics, run the following command:
1. ```python metrics.py PATH_TO_YAML FOLDER_TO_SAVE_METRICS```
where ```FOLDER_TO_SAVE_METRICS``` is the folder where the metric files should be saved.

### Obtaining the plots 
The plots are generated using the ```plot.py``` script. To generate plots for a set of configurations, use the following command:
1. ```python plot.py --yaml_path PATH_TO_YAML --metrics_folder PATH_TO_METRICS_FOLDER --plot_known KNOWN_TYPE --plot_avg PLOT_AVG --metrics_list METRICS_LIST```
where, besides the already mentioned ```--yaml_path``` and ```--metrics_folder```, the following arguments must be set:
* ```--plot_avg``` sets whether the metric values are plotted averaged over contexts or for each context individually. For the plots in the paper, this is set to ```True```.
* ```--metrics_list``` sets which metrics are plotted. In the paper we use ```tpr,fpr``` for the TPR and FPR.
* ```--plot_known``` is a comma-separated list of settings that indicate whether the regime is known, i.e., whether all links are evaluated for the metrics (```False```), whether the links from the regime to its children are ignored (```True```), or whether the links to and from the regime indicator are ignored (```and_parents```). Example: ```--plot_known and_parents,True,False``` or ```--plot_known True,False```.
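To avoid typos in the longer ```plot.py``` call, the command line can be assembled programmatically. The sketch below is an illustration and not part of the repository: ```build_plot_command``` is a hypothetical helper, it uses only the flags documented above, and the check on ```--plot_known``` assumes that the three values named in this README are the only valid ones.

```python
import sys

# Values for --plot_known as described in this README (assumed to be exhaustive).
ALLOWED_KNOWN = {"True", "False", "and_parents"}


def build_plot_command(yaml_path, metrics_folder, plot_known,
                       plot_avg=True, metrics=("tpr", "fpr")):
    """Return the plot.py call as an argument list for subprocess.run."""
    unknown = [value for value in plot_known if value not in ALLOWED_KNOWN]
    if unknown:
        raise ValueError(f"unsupported --plot_known values: {unknown}")
    return [
        sys.executable, "plot.py",
        "--yaml_path", str(yaml_path),
        "--metrics_folder", str(metrics_folder),
        "--plot_known", ",".join(plot_known),
        "--plot_avg", str(plot_avg),
        "--metrics_list", ",".join(metrics),
    ]


# Placeholders as in the command above; replace them with real paths.
cmd = build_plot_command("PATH_TO_YAML", "PATH_TO_METRICS_FOLDER",
                         plot_known=["and_parents", "True", "False"])
print(" ".join(cmd))
```

The defaults mirror the settings stated for the paper (```--plot_avg True``` and ```--metrics_list tpr,fpr```); the returned list can be passed to ```subprocess.run(cmd, check=True)``` from the folder containing ```plot.py```.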

## Further details

We also submit our metric files, which can be found in the ```results``` folder. Not all experiment trials are successful. For each configuration, we record how many trials have failed in the ```failed.txt``` file in the ```results``` folder.

## License
Since we are building on Tigramite, we use the GNU General Public License as published by the Free Software Foundation; version 3 of the License or later.
