# M^{3}T2IBench

## Environment Setup
See ```environment.yml``` for the required dependencies.

## Code Organization
Code is located under ```m3t2ibench_code/``` and data under ```m3t2ibench_data/```. Helper scripts for running the code are provided under ```scripts/```.

## Code Usage

### Benchmark Generation
To generate a new set of prompts, run:

```./scripts/generate_data.sh```

The prompts used in our experiments are provided under ```m3t2ibench_data/```.

### Model Inference

Our model inference code is based on *Diffusers*. The main script is ```m3t2ibench_code/generate.py```.

An example invocation is:

```python3 m3t2ibench_code/generate.py --model_id stabilityai/stable-diffusion-3-medium-diffusers --save_name sd3 --data_path m3t2ibench_data/ --output_dir outputs/ --end_index 10 --seed 3407 --device cuda ```

The main arguments are:

```--model_id```: The pretrained checkpoint to load with *Diffusers*.

```--save_name```: The name under which the results are saved.

```--data_path```: The path to the prompts directory.

```--output_dir```: The path to the output directory.

```--end_index```: The number of prompts used for generation (details can be found in ```m3t2ibench_code/generate.py```).

A simple script for running inference is provided:

```./scripts/model_inference.sh```
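
To run inference for several checkpoints without editing the shell script, the documented flags can also be assembled from Python. The following is a minimal sketch, not part of the repository: the helper names `build_generate_command` and `run_generate` are hypothetical, and the default values simply mirror the example command above. It uses only the flags that appear in that example.

```python
import shlex
import subprocess

GENERATE_SCRIPT = "m3t2ibench_code/generate.py"


def build_generate_command(model_id, save_name, data_path="m3t2ibench_data/",
                           output_dir="outputs/", end_index=10, seed=3407,
                           device="cuda"):
    """Return the argv list for one inference run (hypothetical helper)."""
    return [
        "python3", GENERATE_SCRIPT,
        "--model_id", model_id,
        "--save_name", save_name,
        "--data_path", data_path,
        "--output_dir", output_dir,
        "--end_index", str(end_index),
        "--seed", str(seed),
        "--device", device,
    ]


def run_generate(**kwargs):
    """Run one inference job; raises CalledProcessError if generate.py fails."""
    subprocess.run(build_generate_command(**kwargs), check=True)


if __name__ == "__main__":
    cmd = build_generate_command(
        model_id="stabilityai/stable-diffusion-3-medium-diffusers",
        save_name="sd3",
    )
    # Print the command instead of running it, so the sketch is safe to try.
    print(shlex.join(cmd))
```

Calling `run_generate(model_id=..., save_name=...)` from the repository root would launch the actual job. Passing the arguments as a list rather than a single shell string avoids quoting problems in paths.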

### Evaluation 

The evaluation code consists of ```m3t2ibench_code/clip_for_color_detect.py```, ```m3t2ibench_code/evaluate_benchmark.py```, and ```m3t2ibench_code/agg_metrics.py```. The main script is ```m3t2ibench_code/evaluate_benchmark.py```.

An example of running the evaluation can be found in ```scripts/eval_generate.sh```. Most arguments are self-explanatory; a few are described below:

```--root_path```: The path where the generated results were saved.

```--det_path```: The path to the Mask2Former checkpoint.

```--clip_path```: The path to the CLIP checkpoint.

### Revise-then-Enforce Generation
Providing Revise-then-Enforce (R&E) code for every model would be complicated, since it has to be implemented separately for each model. We therefore provide an example implementation for the Stable Diffusion 3 series. Generation takes two steps: first construct the prompts, then generate with the constructed prompts. The corresponding code is in ```m3t2ibench_code/construct_paired_prompt.py``` and ```m3t2ibench_code/re_generate.py```.

A script for running this code is provided:

```./scripts/re_inference.sh```

Note that this step must be run after the original generation and evaluation have completed, since it relies on the same noise prior and on the evaluation results of that generation.
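
The ordering constraint above can be enforced with a small driver. This is a sketch under assumptions: the name `run_pipeline` is hypothetical, the three scripts are assumed to be configured already and to be directly executable from the repository root, and a non-zero exit code is assumed to signal failure.

```python
import subprocess

# Script paths are taken from this README; the order encodes the note above.
PIPELINE = [
    "./scripts/model_inference.sh",  # 1. original generation
    "./scripts/eval_generate.sh",    # 2. evaluation of the generated results
    "./scripts/re_inference.sh",     # 3. Revise-then-Enforce generation
]


def run_pipeline(steps=PIPELINE, runner=subprocess.run):
    """Run each step in order and return the steps that completed."""
    completed = []
    for step in steps:
        # check=True raises CalledProcessError on a non-zero exit code,
        # so R&E never starts unless generation and evaluation succeeded.
        runner([step], check=True)
        completed.append(step)
    return completed
```

Calling `run_pipeline()` from the repository root runs all three stages. The `runner` parameter exists only so the ordering can be checked with a stand-in function before launching the real, GPU-heavy jobs.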

## Acknowledgement

Our code is greatly inspired by https://github.com/djghosh13/geneval .
