# Setup:

## Installation and Startup

```bash
pip install -r requirements.txt
jupyter-lab
```

## DB Setup Helpers:

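Start the db cluster:

`sudo pg_ctlcluster 13 main start`, or `sudo service postgresql start` (likewise `stop` or `restart`)

Log in as the admin user:

`sudo -u postgres psql`

Add a user (replace `<username>` with the name of the new role):

`sudo -u postgres createuser <username>`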
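
Once the cluster is running, the notebooks and scripts need connection details. The following is a minimal sketch, in line with the TODO item about ENV variables, of reading them from the environment. The variable names are the standard libpq ones (`PGHOST`, `PGPORT`, `PGDATABASE`, `PGUSER`, `PGPASSWORD`); the function name `db_config_from_env` and all default values are assumptions for illustration, not something this project defines.

```python
import os


def db_config_from_env(env=os.environ):
    """Collect PostgreSQL connection settings from environment variables.

    The variable names follow libpq's conventions. The fallback values
    are assumptions for a local setup, not values taken from this project.
    """
    return {
        "host": env.get("PGHOST", "localhost"),
        "port": int(env.get("PGPORT", "5432")),
        "dbname": env.get("PGDATABASE", "postgres"),
        "user": env.get("PGUSER", "postgres"),
        "password": env.get("PGPASSWORD", ""),
    }


if __name__ == "__main__":
    config = db_config_from_env()
    # Avoid echoing the password when checking the settings.
    print({key: value for key, value in config.items() if key != "password"})
```

The resulting dict uses keyword names that a driver such as psycopg2 accepts (`psycopg2.connect(**config)`), assuming that is the driver in use.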

# Workflow:

- Create a folder for each experiment group, which will have a consistent set of parameters (i.e. `runs` columns).
- Create the db tables that hold results data with `prepare_database.ipynb`.
- Populate the runs table with a `sweep_definitions/` notebook.
- Run new runs with `execute_incomplete_runs.py`, which populates `run_notebooks/`.
- Analyze runs with a `run_analyses/` notebook.

# TODO:

- Find a way for run definition files to be idempotent, so that running the same code twice won't double up runs, and so that more runs can be added just by adding more entries to the list. Maybe Andy's combinatorial JSON frontend is pretty reasonable after all.

- Write a Compute Canada runner which receives a set of run ids. It probably replaces the `Pool` and just runs those run ids in a loop.

- Rather than creating notebooks for every run, make a papermill utility to create a notebook from a particular run. It's all deterministic anyways, so why carry around heavy folders?

- Use ENV variables or config files for db connection details.

- Use ENV vars or config files for `num_threads`.

- Use a `sandboxes/` directory for the hand-tweaked notebooks we use to rough out the edges of a systematic sweep.

- Try BigQuery: it's cheap, and we're dropping entire tables at a time anyways.
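
One possible shape for the idempotent run definitions mentioned in the first item is sketched below. It is not the project's implementation: the table layout (`run_key`, `params`), the function names, and the example sweep are all hypothetical, and an in-memory SQLite database stands in for PostgreSQL so the sketch is self-contained. The idea is to expand a combinatorial definition into parameter dicts, derive a deterministic key from each one, and let a uniqueness constraint reject duplicates. In PostgreSQL the equivalent of `INSERT OR IGNORE` would be `INSERT ... ON CONFLICT DO NOTHING`.

```python
import hashlib
import itertools
import json
import sqlite3


def expand_sweep(sweep):
    """Yield one parameter dict per combination of the listed values."""
    names = sorted(sweep)
    for values in itertools.product(*(sweep[name] for name in names)):
        yield dict(zip(names, values))


def run_key(params):
    """Deterministic key: identical parameters always hash to the same key."""
    canonical = json.dumps(params, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def count_runs(conn):
    return conn.execute("SELECT COUNT(*) FROM runs").fetchone()[0]


def add_runs(conn, sweep):
    """Insert runs that are not in the table yet; return how many were new."""
    before = count_runs(conn)
    for params in expand_sweep(sweep):
        # The primary key on run_key makes repeated inserts a no-op.
        conn.execute(
            "INSERT OR IGNORE INTO runs (run_key, params) VALUES (?, ?)",
            (run_key(params), json.dumps(params, sort_keys=True)),
        )
    conn.commit()
    return count_runs(conn) - before


# Hypothetical schema and sweep, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (run_key TEXT PRIMARY KEY, params TEXT NOT NULL)")
sweep = {"learning_rate": [0.1, 0.01], "seed": [0, 1, 2]}

first = add_runs(conn, sweep)    # every combination is new
second = add_runs(conn, sweep)   # same definition again: nothing is added
sweep["seed"].append(3)          # extend the list in place
third = add_runs(conn, sweep)    # only the new combinations are added
print(first, second, third)
```

Here the first call adds 6 runs, the repeated call adds 0, and extending `seed` adds 2, so re-running a definition file and growing its lists are both safe under these assumptions.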