#### A bit about the code
We decided to structure the project around experiments. Each experiment lives in its own package and builds on the shared underlying infrastructure. We decided against reusing networks across experiments, because a network sometimes needs small changes to fit the dataset (for example, the first convolution of a ResNet needs a kernel size of 3 for CIFAR). Such changes would cascade into the save, load, and initialization code and could introduce a lot of bugs. If the experiments expand in the future, we welcome the idea of network reuse through shared network variations, but right now it would do more harm than good.
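
To illustrate why the first convolution changes per dataset, here is a minimal sketch that computes the feature-map size left after the ResNet stem on a 32x32 CIFAR image. The helper `conv_output_size` is a hypothetical name, not part of this project. The ImageNet-style stem values are the standard ones (7x7 convolution, stride 2, padding 3, followed by a 3x3 max pool with stride 2), while the CIFAR-style stem (3x3 convolution, stride 1, padding 1, no max pool) is an assumption about a common variant and may differ from what the experiments here use.

```python
def conv_output_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial size after a convolution or pooling layer (standard floor formula)."""
    return (size + 2 * padding - kernel) // stride + 1


CIFAR_SIZE = 32  # CIFAR images are 32x32 pixels

# ImageNet-style stem: 7x7 conv (stride 2, padding 3), then 3x3 max pool (stride 2, padding 1).
after_conv = conv_output_size(CIFAR_SIZE, kernel=7, stride=2, padding=3)
imagenet_stem = conv_output_size(after_conv, kernel=3, stride=2, padding=1)

# CIFAR-style stem (assumed variant): 3x3 conv, stride 1, padding 1, no max pool.
cifar_stem = conv_output_size(CIFAR_SIZE, kernel=3, stride=1, padding=1)

print(imagenet_stem, cifar_stem)
```

With the ImageNet-style stem, a CIFAR image shrinks to a fraction of its size before the first residual block, whereas the 3x3 stem keeps the full resolution. This is the kind of dataset-specific tweak that makes a single shared network definition awkward.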