# Code for "Collaborative Compressors in Distributed Mean Estimation with Limited Communication Budget"

## Requirements
- numpy, scipy, sklearn, pathos, tqdm, tensorboard, pandas, torch

## Config structure
- `configs/datasets/DATASET_NAME.yaml`: Config for a given dataset. Contains the number of clients, the dimension, etc.
- `configs/compressors.yaml`: Default values for each compressor.
- `configs/tasks/TASK_NAME/base.yaml`: Base parameters for a given task.
- `configs/tasks/TASK_NAME/DATASET_NAME.yaml`: Additional parameters for a given task + dataset + compressors.


Each config should specify a task, a dataset, and a compressor.
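
The layering described above (per-task base parameters plus per-dataset additions) could be combined roughly as in the sketch below. This is only an illustration under assumptions: the helper `merge_configs`, the keys `num_rounds`, `name`, and `bits`, and the rule that dataset-specific values override base values are all hypothetical, and the repository's actual loading logic may differ. Plain dictionaries stand in for the parsed YAML files.

```python
from copy import deepcopy


def merge_configs(base, override):
    """Recursively merge `override` into a copy of `base`; `override` wins on conflicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are nested sections: merge them key by key.
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical contents standing in for configs/tasks/TASK_NAME/base.yaml
base = {
    "task": "kmeans",
    "num_rounds": 10,
    "compressor": {"name": "placeholder", "bits": 8},
}

# Hypothetical contents standing in for configs/tasks/TASK_NAME/DATASET_NAME.yaml
dataset_specific = {
    "dataset": "mnist",
    "compressor": {"bits": 4},
}

config = merge_configs(base, dataset_specific)

# Check that the merged config names a task, a dataset, and a compressor.
missing = [key for key in ("task", "dataset", "compressor") if key not in config]
print(config)
print("missing keys:", missing)
```

The recursive merge keeps untouched base values (here `num_rounds` and the compressor `name`) while letting the dataset file change only what it needs to.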
 
## Running Instructions
Tasks: distributed mean estimation, distributed power estimation, distributed k-means, and distributed linear regression.

To run a task on a dataset for a given seed, run:
```
python main.py --task TASK --compressor COMPRESSOR --dataset DATASET --seed SEED
```
To run a task on a dataset for all compressors and seeds, run:
```
sh all_compressors.sh -d DATASET -t TASK
```
- TASK can be `linreg`, `kmeans`, or `power_iter`
- DATASET can be `synthetic`, `ujindoorloc`, `femnist`, or `mnist`
- COMPRESSOR can be one of the compressors in `src/compressors.py`
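
If only a subset of compressors or seeds is needed, the `main.py` invocation above can be generated programmatically. The following is a minimal sketch, not part of the repository: `build_commands` is a hypothetical helper, and `COMPRESSOR_A` / `COMPRESSOR_B` are placeholders for names defined in `src/compressors.py`. It only prints the commands; passing each list to `subprocess.run` from the repository root would execute them.

```python
import itertools
import shlex


def build_commands(task, dataset, compressors, seeds):
    """Return one `python main.py ...` argument list per (compressor, seed) pair."""
    commands = []
    for compressor, seed in itertools.product(compressors, seeds):
        commands.append([
            "python", "main.py",
            "--task", task,
            "--compressor", compressor,
            "--dataset", dataset,
            "--seed", str(seed),
        ])
    return commands


# Placeholder compressor names; replace with names from src/compressors.py.
commands = build_commands(
    task="kmeans",
    dataset="mnist",
    compressors=["COMPRESSOR_A", "COMPRESSOR_B"],
    seeds=[0, 1],
)

for cmd in commands:
    # Dry run: show the command line instead of executing it.
    print(shlex.join(cmd))
```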

To run the distributed mean estimation experiments, use:
```
python mean_est_l_infty.py
python mean_est_l2.py
python mean_est_unit_norm.py
```

