# Implementation of Selective Ensembles for Consistent Predictions

## Dependencies
- tensorflow=2.3.0
- datalib-dev
- pandas=1.1.1
- trulens

To create a conda environment on Linux that can run the training code from the paper, run:

```shell
conda env create -f training_env.yml
```

## Training

### Prepare data

1. Download datasets
2. Preprocess and split into train and test
3. Store the resulting splits in the files outlined in `get_data.py`
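
As a rough illustration of steps 2 and 3, the sketch below shuffles a list of rows, splits it into train and test portions, and writes each portion to a CSV file. It is not taken from this repository: the helper names, the 80/20 split, and the output file names are hypothetical, and the actual file locations and formats are whatever `get_data.py` expects.

```python
import csv
import random


def train_test_split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and return (train_rows, test_rows)."""
    rng = random.Random(seed)
    shuffled = list(rows)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def write_csv(path, header, rows):
    """Write a header line followed by the given rows to a CSV file."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


if __name__ == "__main__":
    # Toy data: one feature column and one binary label column (assumed layout).
    rows = [[i, i % 2] for i in range(10)]
    train, test = train_test_split_rows(rows)
    # Hypothetical file names; use the paths listed in get_data.py instead.
    write_csv("example_train.csv", ["feature", "label"], train)
    write_csv("example_test.csv", ["feature", "label"], test)
```

Fixing the seed keeps the split identical across runs, which matters when several models are later compared on the same test set.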

### Train models

1. In an environment with all dependencies installed, run `example_model_training.py`.

2. To train models for the Colon dataset, run `colon_models.py`. This is a separate script because the pretrained ResNet models cannot be created with the automatic model creator, `string_to_architecture`, used in the main training script.

Note: this script generates models without the final Sigmoid or SoftMax layer, so the saved models output raw logits. It may be beneficial to modify the code to add these layers when saving, or to *always* use `from_script=True` in the evaluation steps to reflect the fact that these models do not have their final layers.
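
To make the missing final layer concrete, the sketch below shows what a Sigmoid or SoftMax layer would compute from a model's raw outputs (logits). It is a standalone illustration in plain Python and does not use the repository's code or TensorFlow; the function names are chosen here for illustration only.

```python
import math


def sigmoid(logit):
    """Map a single binary-classification logit to a probability in (0, 1)."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


def softmax(logits):
    """Map a list of multiclass logits to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    print(sigmoid(0.0))
    print(softmax([1.0, 2.0, 3.0]))
```

Both functions are monotonic in the logits, so the predicted class is unchanged; only thresholds and anything that consumes probabilities (rather than logits) are affected by whether the final layer is present.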

## Evaluations

### Modify the model paths

1. Update `model_strings.py` to point to the pretrained models.

### Get predictions from aggregate models

1. Run `get_predict_abstain.py` (if binary) or `get_predict_abstain_multiclass.py` (if multiclass) to generate predictions for selective ensemble models.
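
For intuition about what a prediction with abstention looks like, here is a minimal sketch of one way an ensemble can abstain: take the majority vote over the individual models' predicted labels, and abstain unless a two-sided binomial test separates the top class from the runner-up at level `alpha`. This is an assumption-laden illustration, not the code in `get_predict_abstain.py`; the function names, the `ABSTAIN` sentinel, and the default `alpha` are hypothetical, and the scripts may use a different test or threshold.

```python
from collections import Counter
from math import comb

ABSTAIN = -1  # hypothetical sentinel value for "no prediction"


def binom_two_sided_p(k, n):
    """Two-sided p-value for observing k heads in n fair coin flips."""
    k = max(k, n - k)  # the fair-coin test is symmetric, so use the larger tail
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


def selective_vote(labels, alpha=0.05):
    """Return the majority label, or ABSTAIN if the vote is not significant.

    `labels` is a non-empty list with one predicted label per ensemble member.
    """
    counts = Counter(labels).most_common(2)
    top_label, n_top = counts[0]
    n_runner_up = counts[1][1] if len(counts) > 1 else 0
    p_value = binom_two_sided_p(n_top, n_top + n_runner_up)
    return top_label if p_value <= alpha else ABSTAIN


if __name__ == "__main__":
    print(selective_vote([1] * 10))           # unanimous vote
    print(selective_vote([1] * 6 + [0] * 4))  # close vote
```

With ten models, a unanimous vote is returned as a prediction, while a 6-to-4 split is not significant at `alpha=0.05` and yields `ABSTAIN`.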

### Get saliency maps for all models

1. *After* running the above script to get the predictions, run `get_agg_saliency.py` to generate feature attributions and similarity metrics for all models.

2. *After* `get_agg_saliency.py` has run, run `get_saliency_selective.py` to get feature attribution metrics for selective ensembles. Run `get_saliency_comp.py` to get a baseline comparison for these metrics; it generates similarity metrics between multiple different points within one model for each dataset.





