# Link for the UniD3 checkpoint based on CUB-200

[Google Drive Link for CUB model](https://drive.google.com/file/d/1dVRp3lPrWS0EWFViYG3Bj_tHmD3riVZP/view?usp=sharing)

# Usage

1. Install the required packages according to ``requirements.txt``, e.g. ``pip install -r requirements.txt``
2. Download the provided CUB checkpoint (~7 GB) from [Google Drive Link for CUB model](https://drive.google.com/file/d/1dVRp3lPrWS0EWFViYG3Bj_tHmD3riVZP/view?usp=sharing) and save it anywhere. Note the path, since the sampling command needs it.
3. Download the released VQ-GAN model [GumbelVQGAN on OpenImages](https://drive.google.com/file/d/1mava_wCbeEtD4voWgeNaVVwJ8yrIQFgs/view?usp=sharing) and put it under ``./misc/taming_dvae/``. The file name should be ``taming_f8_8192_openimages_last.pth``. You can also download it from the official VQ-GAN or VQ-Diffusion site.
4. Run ``python ./UniDiff/dist_eval_sample.py --model CKPT_PATH --condition unconditional --log pair_samples --batch_size 1``, replacing ``CKPT_PATH`` with the path to the downloaded checkpoint.
5. Check the generated vision-language pairs in ``./pair_samples``.
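
Before launching the sampler, it can help to confirm that the downloaded files are where the steps above expect them. The following is a minimal sketch, not part of the repository: the helper names `missing_files` and `build_command` are hypothetical, and the only paths and flags used are the ones quoted in steps 3 and 4.

```python
from pathlib import Path

# File name and directory taken from step 3 of this README.
VQGAN_WEIGHTS = Path("misc/taming_dvae/taming_f8_8192_openimages_last.pth")
# Script path taken from step 4 of this README.
SAMPLER = Path("UniDiff/dist_eval_sample.py")


def missing_files(ckpt_path, repo_root="."):
    """Return the required files that do not exist (hypothetical helper)."""
    root = Path(repo_root)
    required = [Path(ckpt_path), root / VQGAN_WEIGHTS, root / SAMPLER]
    return [path for path in required if not path.is_file()]


def build_command(ckpt_path, log_dir="pair_samples", batch_size=1):
    """Build the sampling command from step 4 as an argument list."""
    return [
        "python", str(SAMPLER),
        "--model", str(ckpt_path),
        "--condition", "unconditional",
        "--log", log_dir,
        "--batch_size", str(batch_size),
    ]
```

Run from the repository root, `missing_files("CKPT_PATH")` returns an empty list once everything is in place, and the list from `build_command("CKPT_PATH")` can be passed to `subprocess.run`.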

A GPU with at least 6 GB of VRAM is recommended.

Have Fun.
