Directions on how to set up the code.



Step 1 (Download Datasets):

Waterbirds Dataset:
    -- Backgrounds (We used the augmented backgrounds linked in the repo): https://github.com/mymakar/causally_motivated_shortcut_removal/tree/master/waterbirds
    -- Birds and Segmentations: https://www.vision.caltech.edu/datasets/cub_200_2011/
KOA Dataset: https://nda.nih.gov/oai
Food Review: https://www.kaggle.com/datasets/snap/amazon-fine-food-reviews



Step 2 (Fill in constants to link dataset files to experiment scripts):

Once the dataset files have been downloaded, create a directory for each dataset
to store its raw files. Then fill in the appropriate constants in the const.py file
with the directory locations. The constants for the dataset files downloaded
from the internet have "RAW_DATA_DIR" in the name. In addition, create a second
directory for each dataset to store the generated datasets.
The constants for these directories have "DATASET_DIR" in the name.
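
As a rough sketch of the directory setup described above, the helper below creates
one raw-data directory and one generated-dataset directory per dataset and returns
name/path pairs that could be copied into const.py. The constant names it produces
(for example WATERBIRDS_RAW_DATA_DIR), the dataset keys, and the "raw"/"generated"
subdirectory layout are assumptions for illustration only; use the names that
actually appear in const.py.

```python
import tempfile
from pathlib import Path


def make_dataset_dirs(root, names=("waterbirds", "koa", "food_review")):
    """Create a raw and a generated directory for each dataset under root.

    Returns a dict mapping hypothetical constant names to directory paths.
    The names are illustrative; const.py defines the real ones.
    """
    constants = {}
    for name in names:
        # "RAW_DATA_DIR" holds the downloaded files, "DATASET_DIR" the generated ones.
        for suffix, sub in (("RAW_DATA_DIR", "raw"), ("DATASET_DIR", "generated")):
            path = Path(root) / name / sub
            path.mkdir(parents=True, exist_ok=True)
            constants[f"{name.upper()}_{suffix}"] = str(path)
    return constants


if __name__ == "__main__":
    # Demo in a throwaway directory; replace with a real data root when setting up.
    with tempfile.TemporaryDirectory() as tmp:
        for key, value in make_dataset_dirs(tmp).items():
            print(f'{key} = "{value}"')
```

Calling make_dataset_dirs with a permanent data root prints lines that can serve as a
starting point when filling in const.py by hand.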



Step 3 (Process datasets):

Once the constants have been filled in, process the raw data.
For the KOA data, first run the extract_zip_data.py and dicom_to_png.py scripts.
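
A minimal sketch of running the two KOA scripts one after the other is shown below.
It assumes that both scripts are run from the repository root, take no command-line
arguments, and should run in the order listed above; none of this is confirmed here,
so check each script before relying on it. The helper names build_commands and
run_in_order are hypothetical.

```python
import subprocess
import sys

# Script names come from the step above; the order is an assumption.
KOA_SCRIPTS = ["extract_zip_data.py", "dicom_to_png.py"]


def build_commands(scripts=KOA_SCRIPTS, python=sys.executable):
    """Return one command (as an argument list) per script, in order."""
    return [[python, script] for script in scripts]


def run_in_order(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it.

    check=True stops at the first script that exits with an error.
    """
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: only prints the commands that would be executed.
    run_in_order(build_commands())
```

Passing dry_run=False executes the scripts for real.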



Step 4 (Create the datasets):

Run create_datasets.py to create all of the datasets.



Step 5 (Run experiments):

teacher_cross_val.py: Performs cross validation for the teacher models used in TIPMI.
mm_cross_val.py: Performs cross validation for the mediator models used in MBM.
cross_val.py: Performs cross validation for each model.
evaluate.py: Generates the final results for each model.
