# Requirements
All requirements are listed in requirements.txt and can be installed as follows:
```bash
pip install -r requirements.txt
```

We use Python 3.6.6 on Windows 10.
 
------------------------------------
# Create Dataset
The following commands create the three toy datasets.  
```bash
python gen_toy_mix3.py
python gen_toy_ramp3.py
python gen_toy_norm3.py
```
  
The following dataset files are then created. The details of each dataset are described in Appendix D.  

| Name         | Dataset      |
|:-------------|:-------------|
| mix3_pn.csv  | Mix dataset  |
| ramp3_pn.csv | Ramp dataset |
| norm3_pn.csv | Norm dataset |
  
In these files, the 17th column shows the data generation probability. These datasets are included in this zip.  
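
As a minimal sketch of how the generation probability could be pulled out of one of these files, the snippet below reads the 17th column with the standard `csv` module. The function name `read_generation_probability` is hypothetical, and the code assumes a comma-separated file without a header row; pass `has_header=True` if the files turn out to have one.

```python
import csv

def read_generation_probability(path, has_header=False):
    """Return the 17th column of a dataset CSV as a list of floats."""
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    if has_header:
        # Assumption: at most one header row to skip.
        rows = rows[1:]
    # Column 17 (1-indexed, as in this README) is index 16 in Python.
    return [float(row[16]) for row in rows if row]
```

For example, `read_generation_probability("mix3_pn.csv")` would return one probability per data row under these assumptions.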
  
------------------------------------
# Run VAE 
The main script is "vae_v30.py". Run the VAE as follows.  
```bash
python vae_v30.py --dataset "dataset file" --loss2 "coding loss" --lambda2 "lambda value"
options:
  --dataset: select the dataset
    "mix3_pn": Mix dataset / "ramp3_pn": Ramp dataset / "norm3_pn": Norm dataset
  --loss2: select the coding loss type
    "mse": squared-error coding loss / "sl1": downward-convex loss / "sl2": upward-convex loss
  --lambda2: lambda value, which corresponds to 16/beta in the conventional beta-VAE
  --batch-size: batch size (default 128)
  --epocs: number of epochs to train (default 500)
  --seed: random seed (default 1)
  --no-cuda: disable CUDA (CUDA is used by default)
```
The code still contains some options that no longer have any effect.  
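
As an illustration of the options above, the following sketch assembles the command line for one run from Python. The helper `build_vae_command` is hypothetical and only uses the flags documented in this README; it checks the dataset and coding loss names against the values listed above.

```python
import sys

DATASETS = ("mix3_pn", "ramp3_pn", "norm3_pn")
CODING_LOSSES = ("mse", "sl1", "sl2")

def build_vae_command(dataset, loss2, lambda2, python=sys.executable):
    """Build the argument list for one vae_v30.py run."""
    if dataset not in DATASETS:
        raise ValueError("unknown dataset: %r" % dataset)
    if loss2 not in CODING_LOSSES:
        raise ValueError("unknown coding loss: %r" % loss2)
    return [python, "vae_v30.py",
            "--dataset", dataset,
            "--loss2", loss2,
            "--lambda2", str(lambda2)]

# Mix dataset, downward-convex coding loss, lambda = 100.
cmd = build_vae_command("mix3_pn", "sl1", 100)
print(" ".join(cmd[1:]))
# prints: vae_v30.py --dataset mix3_pn --loss2 sl1 --lambda2 100
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to launch the run from the repository directory.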
  
The results are then written to the directory "results/VAE30_aa_bb_cc_dd_ee".  
The meaning of each field is as follows.

- aa: Dataset. "mix3_pn": Mix dataset / "ramp3_pn": Ramp dataset / "norm3_pn": Norm dataset  
- bb: Fixed as "3000.0000"; has no meaning in this version  
- cc: &lambda; used for training. This corresponds to 16/&beta; in the conventional &beta;-VAE  
- dd: Fixed as "mse"; has no meaning in this version  
- ee: Coding loss. "mse": squared-error coding loss / "sl1": downward-convex loss / "sl2": upward-convex loss  
  
For example, the directory "VAE30_mix3_pn_3000.0000_100.0000_mse_sl1" corresponds to the following conditions: 
- Dataset: Mix dataset
- &lambda;: 100
- Coding loss: downward-convex loss
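
The naming scheme can be decoded programmatically. The sketch below uses a hypothetical helper, `parse_result_dir`, and assumes the five-field layout seen in the example directory name. Because the dataset names themselves contain an underscore, the fields are split off from the right.

```python
def parse_result_dir(name):
    """Split a result directory name into the fields described above."""
    # "mix3_pn" contains an underscore, so peel the four trailing
    # fields off from the right before splitting the head.
    head, bb, cc, dd, ee = name.rsplit("_", 4)
    version, dataset = head.split("_", 1)
    return {
        "version": version,      # e.g. "VAE30"
        "dataset": dataset,      # aa
        "unused_bb": bb,         # fixed "3000.0000"
        "lambda": float(cc),     # cc
        "unused_dd": dd,         # fixed "mse"
        "coding_loss": ee,       # ee
    }

info = parse_result_dir("VAE30_mix3_pn_3000.0000_100.0000_mse_sl1")
print(info["dataset"], info["lambda"], info["coding_loss"])
# prints: mix3_pn 100.0 sl1
```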
  
In this directory, the following statistics CSV files and plots are generated.

- stats_xx_yy.csv: Statistics file. xx is the dataset and yy is the epoch number; the same convention applies to the plot files below.  
- z_plot_xx_yy_P(x)\_vs\_(sigma_P(mu)).png: Plots of p(x) and (sigma P(mu))  
- z_plot_xx_yy_P(x)\_vs\_(sqrt(A)_exp(ELBO)): Plots of p(x) and (sqrt(det(Gx)) exp(ELBO))  
- z_plot_xx_yy_P(x)\_vs\_(sqrt(A)_sigma_P(mu)): Plots of p(x) and (sqrt(Gx) sigma P(mu))  
- z_plot_xx_yy_P(x)\_vs\_exp(ELBO): Plots of p(x) and (exp(ELBO))  
- z_plot_xx_yy_P(x)\_vs\_P(mu): Plots of p(x) and the prior P(mu)  
  
In the "stats_xx_yy.csv" file, the meaning of each row is as follows.  
- row 1:  Experimental conditions. Version, dataset, N/A, coding loss, N/A, lambda  
- row 2:  The dimension of the latent variable  
- row 3:  The average of D'(z_j). D'(z_j) is the partial derivative dx/dz_j  
- row 4:  The standard deviation of D'(z_j)  
- row 5:  The average of the estimated norm  
- row 6:  The standard deviation of the estimated norm  
- row 7:  The average of {\sigma_j}^{-2}  
- row 8:  The standard deviation of {\sigma_j}^{-2}  
- row 9:  The average of {\sigma_j}^{-2}, normalized by the minimum value of row 7 among the three dimensional components. 
- row 10: The standard deviation of {\sigma_j}^{-2}, normalized by the minimum value of row 7 among the three dimensional components. 
- row 11: Has no meaning in this version. 
- row 12: Has no meaning in this version. 
- row 13: Correlation coefficients between p(x) and the estimated probabilities. The order is P(mu), (sqrt(Gx) sigma P(mu)), (sigma P(mu)), (exp(ELBO)), and (sqrt(det(Gx)) exp(ELBO))  
- row 14: Loss, KL Divergence, Transform Loss, Coding Loss, Reconstruction Loss
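
To show how the row layout above maps onto code, here is a rough sketch that reads the correlation coefficients from row 13 and labels them in the listed order. It assumes a comma-separated file with no blank lines and that row 13 starts with exactly the five coefficients; `read_correlations` and `ESTIMATE_NAMES` are hypothetical names, and the actual file may need small adjustments.

```python
import csv

# Order of the estimates in row 13, as listed above.
ESTIMATE_NAMES = [
    "P(mu)",
    "sqrt(Gx) sigma P(mu)",
    "sigma P(mu)",
    "exp(ELBO)",
    "sqrt(det(Gx)) exp(ELBO)",
]

def read_correlations(stats_path):
    """Return {estimate name: correlation with p(x)} from a stats file."""
    with open(stats_path, newline="") as f:
        rows = list(csv.reader(f))
    row13 = rows[12]  # row 13 (1-indexed) is index 12
    # Assumption: the first five cells are the coefficients, in order.
    values = [float(v) for v in row13[:5]]
    return dict(zip(ESTIMATE_NAMES, values))
```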

As a sample, the result of "VAE30_mix3_pn_3000.0000_100.0000_mse_sl1" is included in this zip.  
  
------------------------------------
# Run VAE for all combinations of 3 datasets, 3 coding losses, and 10 &lambda; parameters

Run the following batch file to execute all 90 combinations on Windows.
```bash
RunSimulation.bat
```

The results for all 90 combinations are then generated in the "results" directory.  
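
The 90 result directories can be enumerated with a short sketch like the one below. The helper `result_dirs` is hypothetical, and the lambda values in the example are placeholders for illustration only, not the ten values used in RunSimulation.bat. The four-decimal formatting of &lambda; is inferred from the sample directory name "VAE30_mix3_pn_3000.0000_100.0000_mse_sl1".

```python
from itertools import product

DATASETS = ["mix3_pn", "ramp3_pn", "norm3_pn"]
CODING_LOSSES = ["mse", "sl1", "sl2"]

def result_dirs(lambdas):
    """List the expected result directory for every combination."""
    return [
        f"results/VAE30_{ds}_3000.0000_{lam:.4f}_mse_{loss}"
        for ds, loss, lam in product(DATASETS, CODING_LOSSES, lambdas)
    ]

# Placeholder lambda values (assumption); replace them with the real ones.
example_lambdas = [float(i) for i in range(1, 11)]
dirs = result_dirs(example_lambdas)
print(len(dirs))
# prints: 90
```

A list like this may also be a convenient starting point when checking that every expected directory exists after the batch run.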
  
"MakeCSV.py" summarizes all 90 results at the 500th epoch in the "CollectResult.csv" file. Before running it, add the list of directories to summarize to the "PathList.csv" file.

```bash
python MakeCSV.py
```
The columns of "CollectResult.csv" are as follows.

- column 1: Version (fixed)  
- column 2: Dataset name  
- column 3: Has no meaning in this version  
- column 4: Coding loss type  
- column 5: Has no meaning in this version  
- column 6: &lambda; value  
- column 7-9: The average of D'(z_j) for z1, z2, and z3.  
- column 10-12: The standard deviation of D'(z_j)  
- column 13-15: The average of the estimated norm  
- column 16-18: The standard deviation of the estimated norm  
- column 19-21: The average of {\sigma_j}^{-2}  
- column 22-24: The standard deviation of {\sigma_j}^{-2}  
- column 25-27: The average of {\sigma_j}^{-2}, normalized by the minimum value among the three dimensional components in columns 19-21. 
- column 28-30: The standard deviation of {\sigma_j}^{-2}, normalized by the minimum value among the three dimensional components in columns 19-21.  
- column 31-36: Have no meaning in this version.
- column 37: Correlation coefficient between p(x) and P(mu)  
- column 38: Correlation coefficient between p(x) and (sqrt(Gx) sigma P(mu))  
- column 39: Correlation coefficient between p(x) and (sigma P(mu))  
- column 40: Correlation coefficient between p(x) and (exp(ELBO)) 
- column 41: Correlation coefficient between p(x) and (sqrt(det(Gx)) exp(ELBO))
- column 42: Loss
- column 43: KL Divergence
- column 44: Transform Loss
- column 45: Coding Loss
- column 46: Reconstruction Loss
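
The column layout above translates directly into slices. The following sketch groups one row of "CollectResult.csv" by meaning; the group names and the helpers `split_result_row` and `read_collect_result` are hypothetical, and the code assumes a comma-separated file without a header row.

```python
import csv

# 1-indexed, inclusive column ranges taken from the list above.
COLUMN_GROUPS = {
    "version": (1, 1),
    "dataset": (2, 2),
    "coding_loss": (4, 4),
    "lambda": (6, 6),
    "mean_D_prime": (7, 9),
    "std_D_prime": (10, 12),
    "mean_norm": (13, 15),
    "std_norm": (16, 18),
    "mean_inv_sigma2": (19, 21),
    "std_inv_sigma2": (22, 24),
    "norm_mean_inv_sigma2": (25, 27),
    "norm_std_inv_sigma2": (28, 30),
    "correlations": (37, 41),
    "losses": (42, 46),
}

def split_result_row(row):
    """Group the cells of one row by the column ranges above."""
    return {name: row[lo - 1:hi] for name, (lo, hi) in COLUMN_GROUPS.items()}

def read_collect_result(path="CollectResult.csv"):
    """Read every non-empty row of the summary file as grouped cells."""
    with open(path, newline="") as f:
        return [split_result_row(r) for r in csv.reader(f) if r]
```

Each group is returned as a list of strings, so numeric fields still need a `float(...)` conversion before analysis.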
