### Introduction
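We reuse the dataset preprocessing code released with FedProx (https://arxiv.org/abs/1812.06127). Run the script below to generate the synthetic dataset; the statistics of the resulting data follow.
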
```
python generate_synthetic.py
```

DATASET: synthetic_0_0 train.json

- 30 users
- 4305 samples (total)
- 143.50 samples per user (mean)
- num_samples (std): 238.84
- num_samples (std/mean): 1.66
- num_samples (skewness): 3.51

| num_sam | num_users |
|---------|-----------|
| 0       | 0         |
| 20      | 0         |
| 40      | 14        |
| 60      | 2         |
| 80      | 3         |
| 100     | 5         |
| 120     | 3         |
| 140     | 0         |
| 160     | 0         |
| 180     | 0         |

DATASET: synthetic_0_0 train.json and test.json

- 60 users
- 4801 samples (total)
- 80.02 samples per user (mean)
- num_samples (std): 181.39
- num_samples (std/mean): 2.27
- num_samples (skewness): 4.91

| num_sam | num_users |
|---------|-----------|
| 0       | 27        |
| 20      | 1         |
| 40      | 14        |
| 60      | 2         |
| 80      | 4         |
| 100     | 5         |
| 120     | 4         |
| 140     | 0         |
| 160     | 0         |
| 180     | 0         |
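
The summary figures above (user count, total, mean, std, std/mean, skewness, and the binned user counts) can be recomputed from the per-user sample counts. The following is a minimal sketch, not the script that produced the output above. It assumes that the generated JSON stores one count per user under a `num_samples` key, that the std is the population standard deviation, that the skewness is the biased moment ratio, and that the bins are 20 samples wide starting at 0 with counts of 200 or more left out. The function names `load_num_samples`, `summarize`, and `histogram` are hypothetical.

```python
import json
import math
from pathlib import Path


def load_num_samples(path):
    """Read per-user sample counts from a JSON file.

    Assumption: the file has a top-level "num_samples" list,
    one entry per user.
    """
    with Path(path).open() as f:
        data = json.load(f)
    return list(data["num_samples"])


def summarize(num_samples):
    """Return the summary statistics for a non-empty list of counts."""
    n = len(num_samples)
    total = sum(num_samples)
    mean = total / n
    # Second and third central moments (population form, divide by n).
    m2 = sum((x - mean) ** 2 for x in num_samples) / n
    m3 = sum((x - mean) ** 3 for x in num_samples) / n
    std = math.sqrt(m2)
    return {
        "users": n,
        "total": total,
        "mean": mean,
        "std": std,
        "std_over_mean": std / mean,
        # Biased skewness; defined as 0.0 when all counts are equal.
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
    }


def histogram(num_samples, width=20, upper=200):
    """Count users per bin [edge, edge + width); counts >= upper are skipped."""
    edges = list(range(0, upper, width))
    counts = [0] * len(edges)
    for x in num_samples:
        if 0 <= x < upper:
            counts[int(x // width)] += 1
    return list(zip(edges, counts))


if __name__ == "__main__":
    # Made-up counts for illustration only, not taken from the dataset.
    example = [10, 30, 50, 250]
    for key, value in summarize(example).items():
        print(key, value)
    for edge, count in histogram(example):
        print(edge, count)
```

Under these assumptions, users with 200 or more samples fall outside every bin, which would be consistent with the `num_users` column above summing to fewer users than the reported totals.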
