# CoT-PL: Visual Chain-of-Thought Reasoning Meets Pseudo-Labeling for Open-Vocabulary Object Detection

## Introduction

This is an unofficial release of the code for the paper **CoT-PL: Visual Chain-of-Thought Reasoning Meets Pseudo-Labeling for Open-Vocabulary Object Detection**, prepared for submission to ICLR 2026.

**IMPORTANT:** In accordance with the code submission requirements, the provided skeleton code is intended solely to demonstrate the methodological flow and is not executable in its current form.

A working version will be released upon publication.

## Installation

This project is based on MMDetection 3.x.

It requires the following OpenMMLab packages:

- MMEngine >= 0.6.0
- MMCV >= v2.0.0rc4
- MMDetection >= v3.0.0rc6

```bash
pip install openmim mmengine
mim install "mmcv>=2.0.0rc4"
mim install "mmdet>=3.0.0rc6"
```
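After installation, it can be useful to confirm which versions were actually installed. The following is a minimal sketch, not part of the released code: the `REQUIRED` table and the `installed_versions` helper are hypothetical names introduced here, and the script only reports versions for manual comparison against the minimums listed above.

```python
try:
    from importlib.metadata import PackageNotFoundError, version  # Python >= 3.8
except ImportError:  # Python 3.7 needs the importlib_metadata backport
    from importlib_metadata import PackageNotFoundError, version

# Minimum versions copied from the requirements list (hypothetical table name).
REQUIRED = {"mmengine": "0.6.0", "mmcv": "2.0.0rc4", "mmdet": "3.0.0rc6"}


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Print a short report; compare the versions against the minimums by eye.
for name, installed in installed_versions(REQUIRED).items():
    status = installed if installed is not None else "not installed"
    print(f"{name}: {status} (required >= {REQUIRED[name]})")
```

The script deliberately avoids comparing version strings itself, since pre-release tags such as `rc4` need a PEP 440-aware parser to order correctly.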

To ensure compatibility, copy the provided `data/anchor_head.py` into your Docker container, overwriting the copy that ships with the installed MMDetection package:

```bash
docker cp data/anchor_head.py <YOUR_DOCKER_CONTAINER>:/home/<USER_NAME>/.local/lib/python3.7/site-packages/mmdet/models/dense_heads/anchor_head.py
```

## License

This project is released under the [NTU S-Lab License 1.0](LICENSE).


## Usage
### Obtain CLIP Checkpoints
We use CLIP's ViT-B/16 model to implement our method.

Install CLIP from the `CLIP/` directory:
```bash
pip install CLIP/.
```

Then save the model's `state_dict` to the `checkpoints/` directory:
```python
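import os
import clip
import torch

# Load the pretrained CLIP ViT-B/16 model and save only its weights.
model, _ = clip.load("ViT-B/16")
os.makedirs("checkpoints", exist_ok=True)  # torch.save does not create directories
torch.save(model.state_dict(), "checkpoints/clip_vitb16.pth")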
```
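To sanity-check the exported file, one option is to load it back and count the entries under each top-level prefix. This is a rough sketch under stated assumptions: `summarize_keys` is a hypothetical helper, not part of this repository, and the script only inspects the checkpoint if it already exists at the path used above.

```python
import os
from collections import Counter


def summarize_keys(keys):
    """Count entries per top-level prefix (the text before the first dot)."""
    return Counter(key.split(".", 1)[0] for key in keys)


ckpt_path = "checkpoints/clip_vitb16.pth"
if os.path.exists(ckpt_path):
    import torch  # imported lazily so the helper above works without PyTorch

    # Load on CPU so the check does not require a GPU.
    state_dict = torch.load(ckpt_path, map_location="cpu")
    for prefix, count in sorted(summarize_keys(state_dict).items()):
        print(f"{prefix}: {count} entries")
else:
    print(f"{ckpt_path} not found; run the export step first")
```

An empty or very short summary would suggest the export did not complete as expected.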

### Pseudo-Label Generation Process
Pseudo-label generation is currently supported; see [Pseudo-Label-Generation](pseudo-labels/README.md) for details.

### Training and Testing

Training and testing are currently supported on [OV-COCO](configs/baron/ov_coco/README.md).