# Detecting Modularity in Deep Neural Networks

## Coding Environment Setup

Requirements: Python 3.7 (not tested with earlier versions)

The project runs inside a Python virtual environment managed with `pipenv`. To set it up:

1. Clone this repository

2. Install `graphviz`
   1. Ubuntu/Debian: `apt install graphviz`
   2. macOS: `brew install graphviz`

3. Install with `pipenv install --dev`

4. On macOS **only**, you will need to install `pygraphviz` separately:
   `pipenv run pip install pygraphviz --install-option="--include-path=/usr/local/Cellar/`

5. To build the remaining dependencies, run `cd nn_clustering` and then `./build.sh`

6. To activate the virtual environment, run `pipenv shell`

7. Install packages with `pipenv install --system`. Then install two additional packages: `pipenv install image-classifiers==1.0.0 tensorflow-datasets`

One final note: several parts of the repository hard-code absolute paths that start with `project/nn_clustering/`. These paths work when running inside a Docker container, but instructions for the Docker container are not included here for the purposes of review. If needed, we recommend a project-wide find and replace wherever these absolute paths are used.
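
The find and replace can be done in any editor. As an alternative, here is a minimal Python sketch; it is not part of the repository. The function name `rewrite_prefix`, the list of file suffixes, and the replacement prefix are all assumptions made for illustration. It also assumes the prefix appears exactly as quoted above, so check whether the paths in your checkout carry a leading `/` and pass a different `old_prefix` if they do.

```python
import os

# Prefix quoted in the note above. Adjust it if your checkout differs.
OLD_PREFIX = "project/nn_clustering/"


def rewrite_prefix(root, new_prefix, old_prefix=OLD_PREFIX,
                   suffixes=(".py", ".sh", ".json", ".yaml", ".yml"),
                   dry_run=True):
    """Replace old_prefix with new_prefix in text files under root.

    Returns the sorted list of files that contain old_prefix.
    With dry_run=True (the default) no file is modified.
    The suffix list is a guess at which file types hold paths.
    """
    matched = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune hidden directories such as .git in place.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in filenames:
            if not name.endswith(suffixes) and name != "Makefile":
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as fh:
                    text = fh.read()
            except (UnicodeDecodeError, OSError):
                continue  # not readable as UTF-8 text; leave untouched
            if old_prefix not in text:
                continue
            matched.append(path)
            if not dry_run:
                with open(path, "w", encoding="utf-8") as fh:
                    fh.write(text.replace(old_prefix, new_prefix))
    return sorted(matched)
```

Calling `rewrite_prefix(".", "/path/to/your/checkout/")` from the repository root lists the affected files without changing them. Passing `dry_run=False` applies the replacement, so review the list (or commit your work) first.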

## Instructions

The experiments require the ImageNet 2012 validation dataset. Register at [http://www.image-net.org/download-images](http://www.image-net.org/download-images) to obtain a download link for the dataset, then run the following:
```bash
# -p creates both directories and does not fail if they already exist
mkdir -p datasets/imagenet2012
cd datasets/imagenet2012
wget [YOUR LINK HERE]
```
That's all! When you run experiments with ImageNet models, TFRecords are created automatically from the `.tar` file.

See `shells/prepare_all.sh` for commands to make datasets, train networks, cluster, and perform experiments. We use `make` with a `Makefile` to streamline these steps. Rather than running `shells/prepare_all.sh` in full, you may wish to run only parts of it: the whole script takes a long time, even with a GPU.
