# README of Label Leakage and Protection

## Requirements

* Python 3.x
* TensorFlow 2.x

## Dataset

* For demonstration, we provide a small portion (0.5%) of the Criteo dataset in ```dataset/criteo```.
* Interested readers can use this subset to test our code.
* To run on the actual datasets, download them from Kaggle:
    * Criteo: https://www.kaggle.com/c/criteo-display-ad-challenge/data
    * Avazu: https://www.kaggle.com/c/avazu-ctr-prediction/data
    * ISIC: https://www.kaggle.com/c/siim-isic-melanoma-classification/data
* Then use ```preprocess_criteo_subset.py```, ```preprocess_avazu.py```, and ```preprocess_ISIC.ipynb``` to preprocess the corresponding datasets.

## Run

We provide a script for each dataset to test our protection methods. 

* ```run_script_criteo.py``` for Criteo
* ```run_script_avazu.py``` for Avazu
* ```run_script_isic.py``` for ISIC

* In each script, we provide configurations to run ```Marvell```, ```max_norm```, ```iso```, and ```no_noise```.
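
The launch step can be sketched as a small helper. This is only an illustration: it assumes each run script is started from the repository root with the current Python interpreter and takes no command-line arguments, which this README does not specify. The names `RUN_SCRIPTS`, `build_command`, and `run` are hypothetical and not part of the repository.

```python
import subprocess
import sys
from pathlib import Path

# Script names come from the list above; the mapping itself is hypothetical.
RUN_SCRIPTS = {
    "criteo": "run_script_criteo.py",
    "avazu": "run_script_avazu.py",
    "isic": "run_script_isic.py",
}


def build_command(dataset, repo_root="."):
    """Return the command list that would launch the run script for `dataset`."""
    key = dataset.lower()
    if key not in RUN_SCRIPTS:
        raise ValueError(
            f"unknown dataset {dataset!r}; expected one of {sorted(RUN_SCRIPTS)}"
        )
    # Resolve to an absolute path so the command works from any working directory.
    script = Path(repo_root).resolve() / RUN_SCRIPTS[key]
    return [sys.executable, str(script)]


def run(dataset, repo_root="."):
    """Run the script for `dataset`, failing early if the file is missing.

    Assumption: the script needs no command-line arguments.
    """
    cmd = build_command(dataset, repo_root)
    if not Path(cmd[1]).is_file():
        raise FileNotFoundError(cmd[1])
    return subprocess.run(cmd, check=True, cwd=Path(repo_root).resolve())
```

For example, `run("criteo")` would start ```run_script_criteo.py``` from the current directory. Which protection method is run still depends on the configuration selected inside that script.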

## Visualization 

* We provide ```diff_methods_tradeoff_viz_save_memory-*.ipynb``` to visualize the tradeoff results stored in the TensorBoard logs.

