# Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images (submitted to ICLR 2024, ID: 569)

## Requirements

The environment I used is:

- python==3.9.2

- pytorch==1.13.1

- torchvision==0.14.1

- cuda version==11.6
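
To compare a local setup against the versions listed above, a small helper along these lines may be useful. It is only a sketch: the function name `print_env_versions` is hypothetical and not part of this repository, and it simply reports what is installed without enforcing any version.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): print the installed
# versions of Python, PyTorch, torchvision, and the CUDA build of PyTorch.
# Usage: print_env_versions [python-executable]   (defaults to "python")
print_env_versions() {
    local py="${1:-python}"
    "$py" - <<'EOF'
import sys

print("python", sys.version.split()[0])
for name in ("torch", "torchvision"):
    try:
        module = __import__(name)
    except Exception as exc:  # missing or broken install
        print(name, "unavailable:", exc)
        continue
    print(name, module.__version__)
    if name == "torch":
        # torch.version.cuda is None for CPU-only builds
        print("cuda", module.version.cuda)
EOF
}
```

Calling `print_env_versions` (or `print_env_versions python3`) prints one line per component, which can then be checked by eye against the list above.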

## Verbose images

Download the LAVIS library [1]. 

Move src/blip2_opt.py into lavis/models/blip2_models/ (relative to the root of the LAVIS directory).

Move src/main.py into tests/models/ (relative to the root of the LAVIS directory).

- src/blip2_opt.py: Implements the loss objectives used to generate our verbose images.

- src/main.py: Implements the core algorithm for generating our verbose images.

- src/run.sh: The script that runs the generation of our verbose images.
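
The file placement described above can be sketched as a small shell helper. This is a minimal sketch under stated assumptions: the function name `install_verbose_sources` and its two directory arguments are hypothetical, LAVIS is assumed to have been downloaded already (for example from its public GitHub repository, https://github.com/salesforce/LAVIS), and the files are copied rather than moved so the originals in src/ stay in place.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): copy the two source
# files into an existing LAVIS directory.
# Usage: install_verbose_sources <path-to-src> <path-to-LAVIS>
install_verbose_sources() {
    local src_dir="$1" lavis_dir="$2"
    local model_dir="$lavis_dir/lavis/models/blip2_models"
    local f

    # Fail early if an input file or the target model directory is missing.
    for f in blip2_opt.py main.py; do
        if [ ! -f "$src_dir/$f" ]; then
            echo "missing file: $src_dir/$f" >&2
            return 1
        fi
    done
    if [ ! -d "$model_dir" ]; then
        echo "missing directory: $model_dir" >&2
        return 1
    fi

    # Create tests/models/ if the LAVIS directory does not have it yet.
    mkdir -p "$lavis_dir/tests/models" || return 1

    # Note: this overwrites any blip2_opt.py already present in the target.
    cp "$src_dir/blip2_opt.py" "$model_dir/blip2_opt.py" || return 1
    cp "$src_dir/main.py" "$lavis_dir/tests/models/main.py" || return 1
}
```

With LAVIS downloaded next to this repository's src/ directory, the call would look like `install_verbose_sources src LAVIS`. Because the copy overwrites any existing `blip2_opt.py` in the target directory, keeping a backup of that file first may be worthwhile.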

## Run experiments

```
bash run.sh
```

[1] Dongxu Li, Junnan Li, Hung Le, Guangsen Wang, Silvio Savarese, and Steven C.H. Hoi. LAVIS: A one-stop library for language-vision intelligence. In ACL, 2023.
