# Reddit Dataset (Preprocessed for Oort)

## Organization

The [dataset](https://fedscale.eecs.umich.edu/dataset/reddit.tar.gz) is split into training and testing sets. Each client was assigned a random ID, and the client ID is encoded in the file name.
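
To check the layout after downloading, a minimal sketch like the one below counts the files under each top-level directory of the archive. It assumes the tarball has already been saved locally. The helper name `summarize_archive` is hypothetical and not part of Oort, and the sketch makes no assumption about how client IDs are encoded in the file names.

```python
import tarfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_archive(archive_path):
    """Count regular files under each top-level entry of a tar archive.

    Hypothetical helper for inspecting the downloaded tarball; the actual
    directory names inside the archive are not assumed here.
    """
    counts = Counter()
    # "r:*" lets tarfile detect the compression (gzip for .tar.gz).
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links, and other entries
            parts = PurePosixPath(member.name).parts
            if parts:
                counts[parts[0]] += 1
    return dict(counts)
```

Calling `summarize_archive("reddit.tar.gz")` on a local copy returns a dictionary mapping each top-level entry to its file count, which is a quick way to confirm the training and testing split before loading any data.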

## References

This dataset is covered in more detail at [https://github.com/TalwalkarLab/leaf/tree/master/data/reddit](https://github.com/TalwalkarLab/leaf/tree/master/data/reddit), and its original location is
[https://files.pushshift.io/reddit/](https://files.pushshift.io/reddit/).