# Reproducing the numerical experiments in our study

### Preparation

#### Python environment
Create the virtual environment and install all the requirements using the Anaconda Distribution:
```
conda env create --name nbwenv python=3.9 -f requirements.yaml
source activate nbwenv
```

#### R environment
The ```WeightIt``` library in R is used to calculate the entropy balancing weights.
This library must be installed to run the experiments related to Section 8.
To install it, execute the following command in the R console:

```
> install.packages("WeightIt")
```

---

### Run Experiments

#### Experiments related to Section 5 (described in Section D.1)
##### Run
Execute the ```Run_Experiments_in_Section_5.py``` script as follows:

```
python Run_Experiments_in_Section_5.py
```

##### Output
The results are written to the ```out/``` directory, which is created in the current directory.
The output is structured as follows.

```
[./out]/
├── Res_Section_5/[YYYYMMDD_HHMM_SS]/
        ├── logs/
        └── Ndata05000_dimension05_alphaXXXXXX.csv
```
##### The details of output
* [YYYYMMDD_HHMM_SS]/
  - A folder named after the date and time at which the script was executed.
* logs/
  - The PyTorch logs for each simulation.
* Ndata05000_dimension05_alphaXXXXXX.csv
  - The results of the experiments for each α (α=-3, -2, -1, 0.8, 0.5, 0.2, 2, 3, and 4) are written to a CSV file. Here, ```XXXXXX``` is the value of α. In this file, each column corresponds to the results of one simulation, and the "index" column corresponds to the number of steps.
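
As a minimal sketch of how such a file might be post-processed, the following averages the simulations at each step using only the Python standard library. It assumes that the CSV has a header row with a column literally named "index" and that every other cell is numeric. The function name `mean_per_step` is hypothetical and not part of this repository.

```python
import csv
from statistics import mean


def mean_per_step(csv_path):
    """Return {step: mean over the simulation columns} for one result file.

    Assumption: the header row contains an "index" column (the step number)
    and every other column holds the numeric result of one simulation.
    """
    averages = {}
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            # The "index" column is the step number, not a simulation result.
            step = int(float(row.pop("index")))
            values = [float(v) for v in row.values() if v not in ("", None)]
            averages[step] = mean(values)
    return averages
```

The function would be called with the path of one result file, for example `mean_per_step("out/Res_Section_5/[YYYYMMDD_HHMM_SS]/Ndata05000_dimension05_alphaXXXXXX.csv")` with the placeholders replaced by the actual folder and file names.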

#### Experiments related to Section 7 (described in Section D.2)
##### Run
Execute the ```Run_Experiments_in_Section_7.py``` script as follows:

```
python Run_Experiments_in_Section_7.py
```

##### Output
The results are written to the ```out/``` directory, which is created in the current directory.
The output is structured as follows.

```
[./out]/
├── Res_Section_7/[YYYYMMDD_HHMM_SS]/
        ├── logs/
        └── Ndata05000_dimensionXX_alpha0.500.csv
```
##### The details of output
* [YYYYMMDD_HHMM_SS]/
  - A folder named after the date and time at which the script was executed.
* logs/
  - The PyTorch logs for each simulation.
* Ndata05000_dimensionXX_alpha0.500.csv
  - The results of the experiments for each dimension d of the data (d=2, 3, 4, 5, 6, and 7) are written to a CSV file. Here, ```XX``` is the dimension d of the data. In this file, each column corresponds to the results of one simulation, and the "index" column corresponds to the number of steps.
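
Because each run is stored in its own timestamped folder and each setting in its own file, a small helper can locate the most recent run and recover the settings from a file name. The sketch below is one possible way to do this. It assumes the folder names follow the `YYYYMMDD_HHMM_SS` layout shown above and that file names follow the `Ndata..._dimension..._alpha....csv` pattern; the exact formatting of the α field is an assumption, and the names `parse_result_name` and `latest_run_dir` are hypothetical.

```python
import re
from datetime import datetime
from pathlib import Path

# Pattern inferred from the file names listed in this README; the exact
# formatting of the alpha field is an assumption.
RESULT_NAME = re.compile(r"Ndata(\d+)_dimension(\d+)_alpha(-?\d+(?:\.\d+)?)\.csv")


def parse_result_name(name):
    """Split a result file name into (n_data, dimension, alpha)."""
    m = RESULT_NAME.fullmatch(name)
    if m is None:
        raise ValueError(f"unexpected result file name: {name}")
    return int(m.group(1)), int(m.group(2)), float(m.group(3))


def latest_run_dir(section_dir):
    """Return the most recent YYYYMMDD_HHMM_SS run folder, or None."""
    runs = []
    for p in Path(section_dir).iterdir():
        if not p.is_dir():
            continue
        try:
            stamp = datetime.strptime(p.name, "%Y%m%d_%H%M_%S")
        except ValueError:
            continue  # skip folders that are not timestamped runs
        runs.append((stamp, p))
    return max(runs, key=lambda pair: pair[0])[1] if runs else None
```

For instance, `latest_run_dir("out/Res_Section_7")` would return the newest run folder, and `parse_result_name(f.name)` could then be applied to each CSV file inside it to group the results by dimension.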


#### Experiments related to Section 8 (described in Section D.3)
##### Preparation: Generating train and test data
Before conducting these experiments, training and test data must be created.

First, execute the ```Generate_train_data_in_Section_8.py``` and ```Generate_test_data_in_Section_8.py``` scripts as follows:
```
python Generate_train_data_in_Section_8.py
python Generate_test_data_in_Section_8.py
```

To conduct the experiments, the training data must include the entropy balancing weights.
In the R console, change the working directory to the directory in which ```Generate_train_data_in_Section_8.py``` was executed, then execute the ```Calculation_Entropy_Balancing_weights_in_Section_8.R``` script as follows:
```
> setwd("<the path of the directory in which Generate_train_data_in_Section_8.py was executed>")
> source("Calculation_Entropy_Balancing_weights_in_Section_8.R")
```

##### Run
Execute the ```Run_Experiments_in_Section_8_N1000.py```, ```Run_Experiments_in_Section_8_N10000.py```, and ```Run_Experiments_in_Section_8_N100000.py``` scripts as follows:
```
python Run_Experiments_in_Section_8_N1000.py
python Run_Experiments_in_Section_8_N10000.py
python Run_Experiments_in_Section_8_N100000.py
```

##### Output
The results are written to the ```out/``` directory, which is created in the current directory.
The output is structured as follows.

```
[./out]/
├── Res_Section_8/[data_size_in_the_experiments]/[the_experiment_name]_[YYYYMMDD_HHMM_SS]/
        ├── logs/
        └── [the_experiment_name]_[YYYYMMDD_HHMM_SS].csv
```
##### The details of output
* [data_size_in_the_experiments]/
  - A folder named after the size of the data used in the experiments. The name is one of the following three strings:
      - ```N1000```
      - ```N10000```
      - ```N100000```
* [the_experiment_name]_[YYYYMMDD_HHMM_SS]/
  - A folder named after the experiment and the date and time at which the script was executed. The value of [the_experiment_name] is one of the following two strings:
      - ```Experiment1```
      - ```Experiment2```
* logs/
  - The PyTorch logs for each simulation.
* [the_experiment_name]_[YYYYMMDD_HHMM_SS].csv
  - The results of the experiments are written to a CSV file. The file has the same name as the directory that contains it.
    In this file, each row corresponds to the result of one simulation, and the "index" column holds the simulation IDs. Each remaining column corresponds to the result of one combination of a balancing method (NBW, EB, or NoWeights) and a supervised learning algorithm (LR or GBT) used in the experiments.
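
To compare the method combinations, one might aggregate each column over the simulations. The following is a minimal sketch using only the Python standard library. It assumes a header row with a column named "index" and numeric values elsewhere; it makes no assumption about the exact column names, which are read from the header. The function name `summarize_methods` is hypothetical and not part of this repository.

```python
import csv
from statistics import mean, stdev


def summarize_methods(csv_path):
    """Return {column name: (mean, sample std)} over the simulations (rows)."""
    columns = {}
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            row.pop("index", None)  # simulation ID, not a result
            for name, value in row.items():
                columns.setdefault(name, []).append(float(value))
    return {
        # The sample standard deviation needs at least two simulations.
        name: (mean(vals), stdev(vals) if len(vals) > 1 else 0.0)
        for name, vals in columns.items()
    }
```

Calling `summarize_methods` on one result file returns one `(mean, std)` pair per column, which can then be printed or tabulated as needed.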

