# Pool Me Wisely: On the Effect of Pooling in Transformer-Based Models

This is the official implementation of the submitted paper "Pool Me Wisely: On the Effect of Pooling in Transformer-Based Models". 


The implementation is based on PyTorch.
We recommend installing PyTorch separately, following the instructions for your system on the [PyTorch website](https://pytorch.org/get-started/locally/).
The following packages are required to run the code:
- numpy
- pandas
- PyTorch
- torchvision


## Setup
The repository should contain the following subfolders:
- `data`: contains the datasets
- `src`: contains the implementation of the different models (classification, inpainting and segmentation)
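
The layout above can be checked before launching an experiment. The snippet below is a minimal sketch, not part of the released code: the helper name `missing_subfolders` is hypothetical, and it only assumes that `data` and `src` sit directly under the repository root.

```python
from pathlib import Path
from typing import List

# Subfolder names taken from the list above.
EXPECTED_SUBFOLDERS = ("data", "src")


def missing_subfolders(root: Path) -> List[str]:
    """Return the expected subfolders that do not exist under `root`."""
    return [name for name in EXPECTED_SUBFOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Hypothetical usage: run from the repository root.
    missing = missing_subfolders(Path.cwd())
    if missing:
        print("Missing subfolders:", ", ".join(missing))
    else:
        print("Folder layout looks complete.")
```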

## Usage
To run a classification model, for instance, specify:
- The dataset
- The pooling strategy to use
- The model type (Base ViT or Small ViT)
- Any other hyperparameters, if desired

For example, to run classification with the default hyperparameters:

```bash
python classification.py --dataset CIFAR10 --pooling sum
```
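
For readers unfamiliar with the `--pooling` option, the idea is to reduce the per-token embeddings produced by the transformer to a single vector per input. The following is an illustrative sketch only, written in NumPy for brevity rather than taken from the repository: the function name `pool_tokens` is hypothetical, `sum` is the only strategy name that appears in this README, and `mean` and `max` are assumed here purely for comparison.

```python
import numpy as np


def pool_tokens(tokens: np.ndarray, strategy: str = "sum") -> np.ndarray:
    """Reduce token embeddings of shape (batch, num_tokens, dim) to (batch, dim)."""
    if tokens.ndim != 3:
        raise ValueError("expected an array of shape (batch, num_tokens, dim)")
    if strategy == "sum":    # strategy name used in the example command
        return tokens.sum(axis=1)
    if strategy == "mean":   # assumed name, for illustration only
        return tokens.mean(axis=1)
    if strategy == "max":    # assumed name, for illustration only
        return tokens.max(axis=1)
    raise ValueError(f"unknown pooling strategy: {strategy!r}")


# One input with 4 tokens of dimension 3.
tokens = np.arange(12, dtype=np.float32).reshape(1, 4, 3)
print(pool_tokens(tokens, "sum"))   # sums each feature over the 4 tokens
print(pool_tokens(tokens, "mean"))  # averages each feature over the 4 tokens
```

The strategies actually supported by the scripts, and their exact names, are defined in the code under `src`.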

To run segmentation with the default hyperparameters:
```bash
python segmentation.py --pooling sum
```

To run inpainting:
```bash
python segmentation.py --dataset_name OxfordIIITPet --pooling sum
```
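
To compare several pooling strategies on the classification task, the command line shown earlier can be generated in a loop. The sketch below is an assumption-laden helper, not part of the repository: `build_commands` is a hypothetical name, only the `--dataset` and `--pooling` flags and the values `CIFAR10` and `sum` come from this README, and the other pooling names are placeholders to be replaced with the ones the script accepts.

```python
import shlex
import sys
from itertools import product

DATASETS = ["CIFAR10"]             # value from the example command
POOLINGS = ["sum", "mean", "max"]  # "sum" is from this README; the rest are assumed


def build_commands(script, datasets, poolings):
    """Build one argument list per (dataset, pooling) combination."""
    return [
        [sys.executable, script, "--dataset", dataset, "--pooling", pooling]
        for dataset, pooling in product(datasets, poolings)
    ]


for cmd in build_commands("classification.py", DATASETS, POOLINGS):
    # Printing keeps this a dry run; use subprocess.run(cmd, check=True) to launch.
    print(shlex.join(cmd))
```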

## Dataset
- For classification, the loader downloads the datasets automatically, except for ImageNet, which the user must download manually and then specify its path.
- For segmentation, the user must download the Pascal dataset.
- For inpainting, the user must download the CelebA dataset; the other datasets are downloaded automatically by the loader.

For additional details, please refer to our paper.