This repository contains the main source code for our paper. It builds on the BLIP-2 implementation in the LAVIS repository: https://github.com/salesforce/LAVIS.
The blip2_qformer.py and Qformer.py files contain our adapted version of the BLIP-2 model for unsupervised prompt learning. The modules folder contains the image augmentation and loss functions used in our implementation.
To train the model, see the train.py file.