# Code
The `benchmark` module benchmarks different RDD settings
using data generated by the DGP described in the paper.

## Structure:
- benchmark: executable module for benchmarking different scenarios
  - \_\_main\_\_.py: main program entry point
  - benchmark.py: contains the estimator calls and manages the data collection:
    - wrapper for the DGP call
    - benchmark methods (dml, rdrobust, oracles) fitting the estimators
    - benchmark loop
  - methods.py: default ml/dml learner configuration
  - settings.py: default rdd scenarios
  - setting_\*.json: the RDD scenarios used
  - tuned_dgp_cfg_\*.json: the DGP configs used (different noise levels)
- run.sh: run script 
- requirements.txt: pip requirements file


## Result Format:
Translation of rdd\_result.csv columns to the notation used in the paper:

- method 
  - rdrobustnocovs: "RDD Without Covs"
  - rdrobust: "RDD Conventional Covs"
  - rdflex_local_linear: "RDFlex Lasso"
  - rdflex_stacking_lgbm: "RDFlex Stacking"
- setting
  - general format: {score}\_{effect}\_{nonevertakers/nevertakers}[\_{subrule}]
  - score: 
    - distance: $G = I_D$
    - yieldimprovement: $G = I_Y$
  - effect: 
    - fuzzy: "Fuzzy"
    - intent to treat: "Sharp"
  - nonevertakers/nevertakers: acknowledging operator ($T = D$) / cautious operator ($I_D \land I_Y = T \neq D = I_D \land I_Y \land I_E$)
  - subrule: 
    - if present, the subset complier effect of $G$ is estimated, dropping the nevertakers ($I_Y \leq 0$ / $I_D \leq 0$)
    - Translates to "Subset"
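
The translation above can be sketched as a small lookup helper. This is only an illustration and not part of the `benchmark` module: the names `METHOD_LABELS`, `OPERATOR_LABELS`, `Setting` and `parse_setting` are hypothetical, and the parser assumes that the score, effect and nevertaker components contain no underscore themselves. The effect and subrule tokens are returned unchanged, since their exact spelling in the CSV is not listed here.

```python
from typing import NamedTuple, Optional

# Labels copied from the "method" list above.
METHOD_LABELS = {
    "rdrobustnocovs": "RDD Without Covs",
    "rdrobust": "RDD Conventional Covs",
    "rdflex_local_linear": "RDFlex Lasso",
    "rdflex_stacking_lgbm": "RDFlex Stacking",
}

# Score and operator translations copied from the "setting" list above.
SCORE_LABELS = {"distance": "G = I_D", "yieldimprovement": "G = I_Y"}
OPERATOR_LABELS = {
    "nonevertakers": "acknowledging operator",
    "nevertakers": "cautious operator",
}


class Setting(NamedTuple):
    score: str
    effect: str
    operator: str
    subrule: Optional[str]  # None when the optional suffix is absent


def parse_setting(name: str) -> Setting:
    """Split {score}_{effect}_{nonevertakers/nevertakers}[_{subrule}].

    Assumption: the first three components contain no underscore, so
    everything after the third underscore is treated as the subrule.
    """
    parts = name.split("_", 3)
    if len(parts) < 3:
        raise ValueError(f"unexpected setting name: {name!r}")
    subrule = parts[3] if len(parts) == 4 else None
    return Setting(parts[0], parts[1], parts[2], subrule)


# Hypothetical setting name, built from the format for illustration only.
example = parse_setting("distance_fuzzy_nevertakers")
print(SCORE_LABELS[example.score], "|", OPERATOR_LABELS[example.operator])
```

A setting with a non-empty `subrule` field corresponds to the "Subset" variant in the paper.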
