# Patch-level Kernel Alignment for Self-Supervised Dense Representation Learning

## Table of Contents

- [Environment Setup](#environment-setup)
- [Training](#training-with-paka)
- [Evaluation](#evaluation)
- [Dataset Preparation](#datasets)
- [Citation](#citation)


## Environment Setup
We use conda for dependency management.
Create the environment from the provided environment file by running:
```bash
conda env create -f environment.yaml
```
Alternatively, follow the step-by-step process in the [Installation Guide](INSTALLATION.md).

#### Pythonpath
Add the repository to `PYTHONPATH` so that its modules can be imported, where `$PATH_TO_REPO` is the path to your local clone:
```bash
export PYTHONPATH="${PYTHONPATH}:$PATH_TO_REPO"
```
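
If you work from a notebook or a script where exporting an environment variable is inconvenient, the same effect can be approximated from Python. The following is a minimal sketch, not part of the repository: `add_repo_to_path` is a hypothetical helper, and `"."` stands in for the path to your clone.

```python
import sys
from pathlib import Path


def add_repo_to_path(repo_root: str) -> str:
    """Put `repo_root` at the front of sys.path (once) and return the resolved path."""
    resolved = str(Path(repo_root).expanduser().resolve())
    if resolved not in sys.path:
        # Front of the list, so the repository's modules take precedence.
        sys.path.insert(0, resolved)
    return resolved


# Hypothetical location; replace "." with the path to your clone.
repo = add_repo_to_path(".")
```

Unlike the `export` above, this only affects the current Python process.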

#### PaKA on DINOv2
```python
import torch
# change to dinov2_vitb14 for base as described in:
#    https://github.com/facebookresearch/dinov2
model = torch.hub.load('facebookresearch/dinov2', 'dinov2_vits14')
path_to_checkpoint = "<your path to downloaded ckpt>"
# map_location='cpu' lets a checkpoint saved on GPU load on a CPU-only machine
state_dict = torch.load(path_to_checkpoint, map_location='cpu')
model.load_state_dict(state_dict, strict=False)
```

#### PaKA on DINOv2 with Registers
```python
import torch
# change to dinov2_vitb14_reg for base as described in:
#    https://github.com/facebookresearch/dinov2
model = torch.hub.load('facebookresearch/dinov2', 'dinov2_vits14_reg')
path_to_checkpoint = "<your path to downloaded ckpt>"
# map_location='cpu' lets a checkpoint saved on GPU load on a CPU-only machine
state_dict = torch.load(path_to_checkpoint, map_location='cpu')
model.load_state_dict(state_dict, strict=False)
```

#### timm vit-small and vit-base architectures
```python
import torch
from timm.models.vision_transformer import vit_small_patch16_224, vit_base_patch16_224
# Change to vit_base_patch16_224() if you want to use our larger model
model = vit_small_patch16_224()  
path_to_checkpoint = "<your path to downloaded ckpt>"
state_dict = torch.load(path_to_checkpoint, map_location='cpu')
model.load_state_dict(state_dict, strict=False)
```
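
For dense tasks it helps to know how many patch tokens each backbone produces. The sketch below is illustrative only: `patch_grid` is a hypothetical helper, and the 224x224 input size is an assumption taken from the model names above. It counts patch tokens only, excluding the class token and any register tokens.

```python
from typing import Tuple


def patch_grid(height: int, width: int, patch_size: int) -> Tuple[int, int]:
    """Return (rows, cols) of the patch-token grid for an image of the given size."""
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be a multiple of the patch size")
    return height // patch_size, width // patch_size


# Patch size 14 (as in dinov2_vits14) on an assumed 224x224 input.
rows, cols = patch_grid(224, 224, 14)
num_tokens = rows * cols

# Patch size 16 (as in vit_small_patch16_224) on the same input size.
rows16, cols16 = patch_grid(224, 224, 16)
```

With these assumptions, the patch-14 models yield a 16x16 grid (256 tokens) and the patch-16 models a 14x14 grid (196 tokens).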

**Note:** To load the model weights directly from a Hugging Face URL, run:
```python
import torch
state_dict = torch.hub.load_state_dict_from_url("<url to the hugging face checkpoint>")
```
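
Because the snippets above call `load_state_dict` with `strict=False`, keys that do not match are skipped silently. It can be worth checking the key overlap before trusting a loaded model. The following is a minimal sketch on plain dictionaries: `strip_prefix` and `diff_keys` are hypothetical helpers, the key names are toy stand-ins, and the `module.` prefix (which PyTorch's `DataParallel` wrappers add) is only an example. Whether the released checkpoints carry any prefix is not stated here.

```python
from typing import Dict, Iterable, List, Tuple


def strip_prefix(state_dict: Dict[str, object], prefix: str) -> Dict[str, object]:
    """Remove `prefix` from every key that starts with it; other keys are kept as-is."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


def diff_keys(
    model_keys: Iterable[str], ckpt_keys: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected): keys only the model has, and keys only the checkpoint has."""
    model_set, ckpt_set = set(model_keys), set(ckpt_keys)
    return sorted(model_set - ckpt_set), sorted(ckpt_set - model_set)


# Toy stand-ins for model.state_dict().keys() and a loaded checkpoint.
model_keys = ["patch_embed.proj.weight", "cls_token", "head.weight"]
checkpoint = {"module.patch_embed.proj.weight": 0, "module.cls_token": 0}

missing, unexpected = diff_keys(model_keys, strip_prefix(checkpoint, "module."))
print("missing:", missing)
print("unexpected:", unexpected)
```

In PyTorch itself, `model.load_state_dict(state_dict, strict=False)` returns an object with `missing_keys` and `unexpected_keys` attributes, which gives the same information after loading.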

### Repository Structure

- `src/`: Model, method, and transform definitions
- `experiments/`: Scripts for setting up and running experiments
- `data/`: Data modules for ImageNet, COCO, Pascal VOC, and ADE20k

### Training with PaKA

- Use configs in `experiments/configs/` to reproduce our experiments
- Modify paths in config files to match your dataset and checkpoint directories
- For new datasets:
  1. Change the data path in the config
  2. Add a new data module
  3. Initialize the new data module in `experiments/train_with_paka.py`

For instance, to start training on COCO:

```bash
python experiments/train_with_paka.py --config_path experiments/configs/paka_224x224.yml
```

## Evaluation

We provide several evaluation scripts for different tasks. For detailed instructions and examples, please refer to the [Evaluation README](evaluation_README.md). Here's a summary of the evaluation methods:

1. **Linear Segmentation**: 
   - Use `linear_finetune.py` for fine-tuning.
   - Use `eval_linear.py` for evaluating on the validation dataset.

2. **Overclustering**:
   - Use `eval_overcluster.py` to evaluate overclustering performance.
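
To make the overclustering idea concrete, here is a small pure-Python sketch. It is not the code in `eval_overcluster.py`: it assumes a common protocol in which more clusters than classes are predicted, each cluster is mapped to the ground-truth class it overlaps most (many-to-one), and the result is scored with mean IoU. The function names and toy labels are illustrative.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def majority_mapping(clusters: List[int], labels: List[int]) -> Dict[int, int]:
    """Map each cluster id to the ground-truth class it co-occurs with most often."""
    counts = defaultdict(Counter)
    for c, y in zip(clusters, labels):
        counts[c][y] += 1
    return {c: ctr.most_common(1)[0][0] for c, ctr in counts.items()}


def mean_iou(preds: List[int], labels: List[int]) -> float:
    """Mean intersection-over-union over the classes present in `labels`."""
    ious = []
    for cls in sorted(set(labels)):
        inter = sum(p == cls and y == cls for p, y in zip(preds, labels))
        union = sum(p == cls or y == cls for p, y in zip(preds, labels))
        ious.append(inter / union)
    return sum(ious) / len(ious)


# Toy example: 4 clusters over 2 classes, one entry per pixel (or patch).
clusters = [0, 0, 1, 1, 1, 2, 2, 3]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

mapping = majority_mapping(clusters, labels)
preds = [mapping[c] for c in clusters]
score = mean_iou(preds, labels)
print(mapping, score)
```

In the toy example, cluster 1 straddles both classes, so one entry is misassigned after the mapping and the score falls below 1.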


## Datasets

We use PyTorch Lightning data modules for our datasets. Supported datasets include ImageNet100k, COCO, Pascal VOC, ADE20k, and Cityscapes. Each dataset requires a specific folder structure for proper functioning.

Data modules are located in the `data/` directory and handle loading, preprocessing, and augmentation. When using these datasets, ensure you update the paths in your configuration files to match your local setup.

For detailed information on dataset preparation, download instructions, and specific folder structures, please refer to the [Dataset README](DATASET.md).



**Note:** This repository adopts and adapts multiple parts of [NeCo](https://github.com/vpariza/NeCo), as well as parts of other works such as DINOv2, DINO, and R-CNN, among others.





