# README #
Code for LDP (local differential privacy) co-design.

### Setup ###

* export `LDP_CODESIGN_DIR` in your `~/.bashrc`, set to the path of this repo
    * example: `export LDP_CODESIGN_DIR=~/Desktop/ldp_codesign/`

* install `python3` (tested with version 3.7.4)

* install the required Python packages, including `torch`, `numpy`, and `pandas`, among others
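
* optionally, sanity-check the setup with a short Python script. The sketch below is not part of this repo: the function names are hypothetical, the package list only covers the packages named above, and treating Python 3.7 as a minimum is an assumption based on the tested version.

```python
import importlib.util
import os
import sys

# Packages named in the setup list; extend this tuple if your scripts need more.
REQUIRED_PACKAGES = ("torch", "numpy", "pandas")


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the package names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_setup(env=os.environ):
    """Collect human-readable setup problems; an empty list means every check passed."""
    problems = []
    repo_dir = env.get("LDP_CODESIGN_DIR")
    if not repo_dir:
        problems.append("LDP_CODESIGN_DIR is not set")
    elif not os.path.isdir(os.path.expanduser(repo_dir)):
        problems.append(f"LDP_CODESIGN_DIR is not a directory: {repo_dir}")
    # Assumption: anything at or above the tested 3.7 series is acceptable.
    if sys.version_info < (3, 7):
        problems.append("Python 3.7 or newer is assumed")
    problems.extend(f"missing package: {name}" for name in missing_packages())
    return problems


if __name__ == "__main__":
    for problem in check_setup():
        print("WARNING:", problem)
```

Running it prints one warning per problem and nothing when every check passes.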


### Models ###

There are 4 subdirectories for evaluation under `./codesign/`:

* `eval_theoretical` computes some theoretical results under Assumption 1;

* `eval_household`, `eval_valuation`, and `eval_cancer` correspond to the hourly household power consumption, real estate valuation, and breast cancer detection applications, respectively.

Specify the parameters in `get_model_specific_info.sh` before running any of the scripts below.

### Approaches ###

There are 3 approaches: `codesign`, `sep_design`, and `benchmark`. They correspond to the `task-aware`, `privacy-agnostic`, and `task-agnostic` approaches described in our paper, respectively.

### Theoretical Result ###
```
cd ./codesign/eval_theoretical
./laplace_theoretical_loss.sh # Compute the theoretical losses
./laplace_loss_privacy_plot.sh # Plot loss under different privacy budgets
```
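
For intuition about the loss-versus-privacy-budget plots, the sketch below uses the standard Laplace mechanism: for a query with L1 sensitivity `Δ` and privacy budget `ε`, the added noise is Laplace with scale `b = Δ/ε` and variance `2b²`, so a larger budget means less noise. This is a generic illustration, not the code behind `laplace_theoretical_loss.sh`. The function names and the unit sensitivity are assumptions made for the example.

```python
import random


def laplace_scale(sensitivity, epsilon):
    """Scale b = sensitivity / epsilon of the Laplace noise (standard mechanism)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon


def laplace_noise_variance(sensitivity, epsilon):
    """Variance of Laplace(0, b) noise, which equals 2 * b**2."""
    return 2.0 * laplace_scale(sensitivity, epsilon) ** 2


def sample_laplace(scale, rng=random):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    # expovariate takes a rate, so rate = 1 / scale gives mean = scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


if __name__ == "__main__":
    # Unit sensitivity is an assumption chosen only for illustration.
    for eps in (0.5, 1.0, 2.0):
        print(f"epsilon={eps}: noise variance={laplace_noise_variance(1.0, eps)}")
```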

### Simulation ###

* [Linear model and l2-loss] For the `household` application:
```
cd ./codesign/eval_household
./household_data_generation.sh # Read raw data
./household_codesign_computation.sh # Co-design
./household_sep_design_computation.sh # Sep-design
./household_benchmark_computation.sh # Benchmark
./household_loss_privacy_plot.sh # Plot loss under different privacy budgets
./household_mse_plot.sh # Plot MSE for each x_i
```

Caveat: because of randomness, running the algorithms twice may produce slightly different results.
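
If repeatable runs matter, one common way to reduce this variation is to seed the random number generators at the top of the Python entry points. The sketch below is an assumption about how that could be done: the repo may not expose a seed option, `seed_everything` is a hypothetical helper, and some GPU operations can remain nondeterministic even with fixed seeds.

```python
import random


def seed_everything(seed=0):
    """Seed Python's RNG, plus NumPy and PyTorch when they are installed."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)  # legacy global NumPy RNG
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)  # seeds the default PyTorch generator
    except ImportError:
        pass


if __name__ == "__main__":
    seed_everything(0)
    first = random.random()
    seed_everything(0)
    print("repeatable:", first == random.random())
```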

* [General settings, regression] For the `valuation` application:
```
cd ./codesign/eval_valuation
./valuation_data_generation.sh # Read raw data
./valuation_train_regressor.sh # Train regressor first
./valuation_codesign_train_encoder.sh # Co-design
./valuation_sep_design_train_encoder.sh # Sep-design
./valuation_benchmark_train_encoder.sh # Benchmark
./valuation_loss_privacy_plot.sh # Plot loss under different privacy budgets
```

* [General settings, classification] For the `cancer` application:
```
cd ./codesign/eval_cancer
./cancer_data_generation.sh # Read raw data
./cancer_train_classifier.sh # Train classifier first
./cancer_codesign_train_encoder.sh # Co-design
./cancer_sep_design_train_encoder.sh # Sep-design
./cancer_benchmark_train_encoder.sh # Benchmark
./cancer_loss_privacy_plot.sh # Plot loss under different privacy budgets
```
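
The three pipelines above share the same order (data, optional task model, the three approaches, then plots), so they can be driven from one small script. The sketch below is not part of this repo: `PIPELINES`, `pipeline_commands`, and `run_pipeline` are hypothetical names, the script lists are copied from the listings above, and it assumes `LDP_CODESIGN_DIR` is set and the scripts are executable.

```python
import os
import subprocess
from pathlib import Path

# Script order copied from the command listings in this README.
PIPELINES = {
    "household": [
        "household_data_generation.sh",
        "household_codesign_computation.sh",
        "household_sep_design_computation.sh",
        "household_benchmark_computation.sh",
        "household_loss_privacy_plot.sh",
        "household_mse_plot.sh",
    ],
    "valuation": [
        "valuation_data_generation.sh",
        "valuation_train_regressor.sh",
        "valuation_codesign_train_encoder.sh",
        "valuation_sep_design_train_encoder.sh",
        "valuation_benchmark_train_encoder.sh",
        "valuation_loss_privacy_plot.sh",
    ],
    "cancer": [
        "cancer_data_generation.sh",
        "cancer_train_classifier.sh",
        "cancer_codesign_train_encoder.sh",
        "cancer_sep_design_train_encoder.sh",
        "cancer_benchmark_train_encoder.sh",
        "cancer_loss_privacy_plot.sh",
    ],
}


def pipeline_commands(app, repo_dir):
    """Return (working_dir, command) pairs for one application, in run order."""
    workdir = Path(repo_dir).expanduser() / "codesign" / f"eval_{app}"
    return [(workdir, f"./{script}") for script in PIPELINES[app]]


def run_pipeline(app, repo_dir=None, dry_run=True):
    """Print each step; when dry_run is False, also execute it and stop on failure."""
    repo_dir = repo_dir or os.environ["LDP_CODESIGN_DIR"]
    for workdir, command in pipeline_commands(app, repo_dir):
        print(f"[{app}] {command}")
        if not dry_run:
            subprocess.run([command], cwd=workdir, check=True)
```

Calling `run_pipeline("cancer")` only prints the steps; pass `dry_run=False` to execute them. The dry run is the default so that a typo in the application name or repo path shows up before any long training job starts.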

We also provide a few scripts that visualize the data and results in other ways (taking `cancer` as an example):
```
./cancer_data_plot.sh # The distribution of data for each x_i and each label y (only for classification)
./cancer_train_losses_plot.sh # Train loss evolution (for all the applications)
```

### Tuning Hyperparameters and Models ###

The number of back-propagation steps in each epoch can be specified in `./codesign/ML_functions/LDP_encoder_train.py`.

The encoder/decoder, classifier, and regressor models can be further customized in the corresponding files under `codesign/models`. (To reproduce the experiment given in the appendix of our paper, the activation functions of the encoder/decoder model must be changed manually.)
