
# README

This folder contains the supplementary material for the submission _Representational Geometry Collapse in ANNs Limits Semantic Generalization: A Dimensionality Mismatch Between Brains and ANNs_. The supplementary material text is named `supplementary_material.pdf`.

## Installation

To replicate the figures, first install the environment `submission` defined in `environment.yml` using `conda`:

```bash
conda env create -f environment.yml
```

To activate it, run:
```bash
conda activate submission
```

Then install the `src` package:
```bash
pip install -e src
```


## Preprocess Data

The first step to replicate the analysis and figures is to preprocess the two datasets used in the submission, `THINGS-fMRI` and `BOLD5000`. To download the datasets, please refer to the papers cited in the main text of the submission. We also need the subjects processed with `freesurfer` (these can also be downloaded from the dataset repositories). The surfaces must then be registered to the HCP atlas (download it from [2016 Glasser MMP1.0 Cortical Atlases](https://figshare.com/articles/dataset/2016_Glasser_MMP1_0_Cortical_Atlases/24431146)). The script `registration_HCP.sh` in the `preprocessing` folder registers each subject to the HCP atlas. Next, edit the resulting `HCP-MMP1.annot` files to remove all regions that do not belong to the visual cortex, save the new file as `visual.annot`, and register the subjects again, as shown in `registration_visual.HCP`.

The two notebooks in `preprocessing` copy and preprocess some files from the raw downloaded datasets and organize them in a common structure, which makes it easier to replicate the analysis and figures across datasets.

Before running the notebooks, first set the paths to the original files in `preprocessing/config.yaml`. The notebooks should take between 30 and 60 minutes per subject.

We flatten the meshes for visualization with `pycortex`, using `flatten.py` in the `preprocessing` folder.

## Generate Results
The scripts in the `script` folder contain the code needed to generate the results used for analysis. Some of the scripts must be run sequentially because they depend on previous results. In order:

- `neighbors.py`: Calculates the nearest neighbors for each vertex of the surface mesh.

- `centroids.py`: Calculates the centroids for each one of the 26 regions in the visual areas (see _tableA2_). Saves files in the `freesurfer` folder of each subject.

- `distances.py`: Calculates the distance from V1 to the rest of the centroids using geodesic distance over the surface mesh.
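As a rough illustration of the distance step, the sketch below approximates geodesic distance by running Dijkstra's algorithm along the edges of a triangle mesh. It is not the code in `distances.py`: the function names `edge_graph` and `geodesic_distances` and the toy mesh are assumptions made for this example, and edge-path distances overestimate the true surface geodesic.

```python
import heapq
import math


def edge_graph(vertices, faces):
    """Build an adjacency map {vertex: {neighbor: edge_length}} from triangles."""
    graph = {i: {} for i in range(len(vertices))}
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            length = math.dist(vertices[u], vertices[v])
            graph[u][v] = length
            graph[v][u] = length
    return graph


def geodesic_distances(graph, source):
    """Shortest edge-path distance from `source` to every vertex (Dijkstra)."""
    dist = {v: math.inf for v in graph}
    dist[source] = 0.0
    queue = [(0.0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if d > dist[u]:
            continue  # stale queue entry, a shorter path was already found
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(queue, (d + w, v))
    return dist


# Toy mesh (hypothetical): a unit square split into two triangles.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
faces = [(0, 1, 2), (0, 2, 3)]
graph = edge_graph(vertices, faces)
dist = geodesic_distances(graph, source=1)
```

In this toy mesh, vertices 1 and 3 share no edge, so the path between them must pass through vertex 0 or 2. The adjacency map also gives the immediate mesh neighbors of each vertex, which is one simple notion of "nearest neighbors" on a surface.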

The remaining scripts can be run in any order. The following scripts run the analyses on the surface mesh using brain data:

- `decode_stimuli.py`: Decoding of stimuli and categories using leave-one-session-out cross-validation. It generates the results for stimuli (for `split=test`) and categories (for `split=train`).

- `decode_concepts.py`: Decoding of concepts (heavy, moves, natural, size) using leave-one-session-out cross-validation.

- `decode_components.py`: Decoding of principal components (first, second, third, fourth) using leave-one-session-out cross-validation.

- `dimensionality.py`: Calculates the effective and intrinsic dimensionality per vertex.
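The decoding scripts above share a leave-one-session-out cross-validation scheme: each session is held out once while the decoder is fit on the remaining sessions. The sketch below shows that scheme on synthetic data. The nearest-centroid decoder, the function name `leave_one_session_out`, and the array shapes are assumptions chosen to keep the example self-contained, not the decoder used in the submission.

```python
import numpy as np


def leave_one_session_out(X, y, sessions):
    """Yield (held_out_session, accuracy) for a nearest-centroid decoder."""
    for held_out in np.unique(sessions):
        test = sessions == held_out
        train = ~test
        classes = np.unique(y[train])
        # One centroid per class, estimated from the training sessions only.
        centroids = np.stack([X[train & (y == c)].mean(axis=0) for c in classes])
        # Assign each held-out sample to its closest centroid.
        d = np.linalg.norm(X[test][:, None, :] - centroids[None, :, :], axis=2)
        pred = classes[d.argmin(axis=1)]
        yield int(held_out), float((pred == y[test]).mean())


# Synthetic data (hypothetical): 3 sessions, 2 well-separated classes, 5 features.
rng = np.random.default_rng(0)
sessions = np.repeat([0, 1, 2], 20)
y = np.tile([0, 1], 30)
X = rng.normal(size=(60, 5)) + 5.0 * y[:, None]
scores = dict(leave_one_session_out(X, y, sessions))
```

Holding out whole sessions keeps samples from the same scanning session out of both the training and test sets, so session-specific noise cannot inflate the accuracy.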

The following scripts create the model results, which are saved in the `models` folder:

- `activations_models_THINGS.py`: Generates global pooled features for the different models.

- `activations_models_THINGS_untrained.py`: Generates global pooled features for the different models with model weights reset.

- `dimensionality_models_gap_subsample.py`: Calculates the dimensionality of the different models by performing global average pooling and subsampling features.

- `dimensionality_models_random_projections.py`: Calculates the dimensionality of the different models by performing random projections.

- `dimensionality_models_units_subsample.py`: Calculates the dimensionality of the different models by subsampling single-unit features.
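To make the dimensionality scripts above more concrete, here is a minimal sketch of two of the feature-reduction strategies (global average pooling with subsampling, and random projections), each followed by a dimensionality estimate. It assumes the participation ratio as the effective-dimensionality measure, which may differ from the definition used in the submission. The activation shape, subsample size, and projection size are hypothetical.

```python
import numpy as np


def participation_ratio(features):
    """Effective dimensionality: (sum of eigenvalues)^2 / sum of squared eigenvalues."""
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (features.shape[0] - 1)
    # Clip tiny negative eigenvalues caused by floating-point error.
    eig = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    return float(eig.sum() ** 2 / (eig ** 2).sum())


rng = np.random.default_rng(0)
# Hypothetical activations with shape (stimuli, channels, height, width).
activations = rng.normal(size=(200, 64, 7, 7))

# Global average pooling over the spatial axes, then a random channel subsample.
pooled = activations.mean(axis=(2, 3))  # shape (200, 64)
subset = rng.choice(pooled.shape[1], size=32, replace=False)
pr_gap = participation_ratio(pooled[:, subset])

# Random Gaussian projection of the flattened units down to 32 dimensions.
flat = activations.reshape(len(activations), -1)  # shape (200, 3136)
projection = rng.normal(size=(flat.shape[1], 32)) / np.sqrt(32)
pr_proj = participation_ratio(flat @ projection)
```

Reducing every model to the same number of features before estimating dimensionality (32 here) keeps the estimates comparable across layers with very different numbers of units. The participation ratio is bounded between 1 and the number of features.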


## Reproduce Figures
Once the results are generated, the figures from the main text can be reproduced by running the notebooks in the `figures` folder.