# DirectInversion


This repository contains the implementation of the paper "Direct Inversion: Boosting Diffusion-based Editing with 3 Lines of Code".

Keywords: Diffusion Model, Image Inversion, Image Editing




**📖 Table of Contents**

- [DirectInversion](#directinversion)
  - [🚀 Getting Started](#-getting-started)
  - [🏃🏼 Running Scripts](#-running-scripts)
    - [Inference 📜](#inference-)
    - [Evaluation 📐](#evaluation-)


## 🚀 Getting Started
<span id="getting-started"></span>


Since different models have different Python environment requirements (e.g., the diffusers version), we list the environments in the `environment` folder, as follows:

- p2p_requirements.txt: for models in `run_editing_p2p.py`, `run_editing_blended_latent_diffusion.py`, `run_editing_stylediffusion.py`, and `run_editing_edit_friendly_p2p.py`
- instructdiffusion_requirements.txt: for models in `run_editing_instructdiffusion.py` and `run_editing_instructpix2pix.py`
- masactrl_requirements.txt: for models in `run_editing_masactrl.py`
- pnp_requirements.txt: for models in `run_editing_pnp.py`
- pix2pix_zero_requirements.txt: for models in `run_editing_pix2pix_zero.py`
- edict_requirements.txt: for models in `run_editing_edict.py`

For example, if you want to use the models in `run_editing_p2p.py`, you need to install the environment as follows:

```shell
conda create -n p2p python=3.9 -y
conda activate p2p
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch
pip install -r environment/p2p_requirements.txt
```
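
The script-to-requirements mapping listed above can be kept in a small lookup so the matching file is easy to find for any script. This is only a convenience sketch: the dictionary mirrors the list in this section, while the helper name `install_command` is hypothetical and not part of the repository.

```python
# Script -> requirements file, copied from the list in this section.
REQUIREMENTS = {
    "run_editing_p2p.py": "p2p_requirements.txt",
    "run_editing_blended_latent_diffusion.py": "p2p_requirements.txt",
    "run_editing_stylediffusion.py": "p2p_requirements.txt",
    "run_editing_edit_friendly_p2p.py": "p2p_requirements.txt",
    "run_editing_instructdiffusion.py": "instructdiffusion_requirements.txt",
    "run_editing_instructpix2pix.py": "instructdiffusion_requirements.txt",
    "run_editing_masactrl.py": "masactrl_requirements.txt",
    "run_editing_pnp.py": "pnp_requirements.txt",
    "run_editing_pix2pix_zero.py": "pix2pix_zero_requirements.txt",
    "run_editing_edict.py": "edict_requirements.txt",
}


def install_command(script):
    """Return the pip command for the environment a script needs (hypothetical helper)."""
    return f"pip install -r environment/{REQUIREMENTS[script]}"


if __name__ == "__main__":
    print(install_command("run_editing_masactrl.py"))
```

The conda and PyTorch versions for environments other than `p2p` are not covered by this sketch.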

## 🏃🏼 Running Scripts
<span id="running-scripts"></span>

### Inference 📜
<span id="inference"></span>

**Run the Benchmark**

You can produce the editing results for the whole benchmark with `run_editing_p2p.py`, `run_editing_edit_friendly_p2p.py`, `run_editing_masactrl.py`, `run_editing_pnp.py`, `run_editing_edict.py`, `run_editing_pix2pix_zero.py`, `run_editing_instructdiffusion.py`, `run_editing_blended_latent_diffusion.py`, `run_editing_stylediffusion.py`, and `run_editing_instructpix2pix.py`. Each Python file contains the following models (please unfold):

<details> <summary> run_editing_p2p.py </summary>

| Inversion Method | Editing Method | Index | Explanation
| :-----: | :----: | :----: | :----: |
| DDIM | Prompt-to-Prompt |  ddim+p2p |  |
| Null-text Inversion | Prompt-to-Prompt | null-text-inversion+p2p |  |
| Negative-prompt Inversion | Prompt-to-Prompt | negative-prompt-inversion+p2p |  |
| DirectInversion(Ours) | Prompt-to-Prompt | directinversion+p2p |  |
| DirectInversion(Ours) (ablation: with various guidance scale) | Prompt-to-Prompt (ablation: with various guidance scale) | directinversion+p2p_guidance_{i}_{f} | For ablation study. {i} is the inverse guidance scale and {f} is the forward guidance scale. {i} can be chosen from \[0,1,25,5,75\] and {f} from \[1,25,5,75\]. For example, directinversion+p2p_guidance_1_75 means inversion with guidance scale 1.0 and forward with guidance scale 7.5. |
| Null-text Inversion | Proximal Guidance | null-text-inversion+proximal-guidance |  |
| Negative-prompt Inversion | Proximal Guidance | negative-prompt-inversion+proximal-guidance |  |
| Null-latent Inversion | Prompt-to-Prompt | ablation_null-latent-inversion+p2p | For ablation study. Changes Null-text Inversion to Null-latent Inversion. |
| Null-text Inversion (ablation: single branch) | Prompt-to-Prompt | ablation_null-text-inversion_single_branch+p2p | For ablation study. Changes Null-text Inversion to exchange the null embedding only in the source branch. |
| DirectInversion(Ours) (ablation: add with scale) | Prompt-to-Prompt (ablation: add with scale) | ablation_directinversion_{s}+p2p | For ablation study. {s} is the added scale and can be chosen from \[04,08\]. For example, ablation_directinversion_04+p2p means add with scale=0.4. |
| DirectInversion(Ours) (ablation: skip step) | Prompt-to-Prompt (ablation: skip step) | ablation_directinversion_interval_{s}+p2p | For ablation study. {s} means the skip step. {s} could be chosen from \[2,5,10,24,49\]. For example, ablation_directinversion_interval_2+p2p means skip every 2 steps. |
| DirectInversion(Ours) (ablation: add source offset for target latent) | Prompt-to-Prompt (ablation: add source offset for target latent) | ablation_directinversion_add-source+p2p |  |
| DirectInversion(Ours) (ablation: add target offset for target latent) | Prompt-to-Prompt (ablation: add target offset for target latent) | ablation_directinversion_add-target+p2p |  |
</details>
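
The `directinversion+p2p_guidance_{i}_{f}` pattern in the table above can be expanded programmatically. The sketch below is only an illustration: the helper names are hypothetical, the decoded scale values extrapolate from the single documented example (`1_75` means 1.0 and 7.5), and whether every combination is registered in `run_editing_p2p.py` is not confirmed here.

```python
from itertools import product

# Codes as listed in the table. The decoded values follow the documented
# example "1_75 -> inverse 1.0, forward 7.5"; the rest are an assumption.
SCALE_CODES = {"0": 0.0, "1": 1.0, "25": 2.5, "5": 5.0, "75": 7.5}
INVERSE_CODES = ["0", "1", "25", "5", "75"]
FORWARD_CODES = ["1", "25", "5", "75"]


def guidance_index(inverse_code, forward_code):
    """Build the method index for one inverse/forward guidance pair."""
    if inverse_code not in INVERSE_CODES:
        raise ValueError(f"unknown inverse guidance code: {inverse_code}")
    if forward_code not in FORWARD_CODES:
        raise ValueError(f"unknown forward guidance code: {forward_code}")
    return f"directinversion+p2p_guidance_{inverse_code}_{forward_code}"


def all_guidance_indices():
    """List every index that the documented code lists allow."""
    return [guidance_index(i, f) for i, f in product(INVERSE_CODES, FORWARD_CODES)]


if __name__ == "__main__":
    for name in all_guidance_indices():
        print(name)
```

The resulting names can be passed to `--edit_method_list` in the same way as any other index.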

<details> <summary> run_editing_stylediffusion.py </summary>

| Inversion Method | Editing Method | Index | Explanation
| :-----: | :----: | :----: | :----: |
| StyleDiffusion | Prompt-to-Prompt |  stylediffusion+p2p |  |
</details>

<details> <summary> run_editing_edit_friendly_p2p.py </summary>

| Inversion Method | Editing Method | Index | Explanation
| :-----: | :----: | :----: | :----: |
| Edit Friendly Inversion | Prompt-to-Prompt |  edit-friendly-inversion+p2p |  |
</details>


<details> <summary> run_editing_masactrl.py </summary>

| Inversion Method | Editing Method | Index | Explanation
| :-----: | :----: | :----: | :----: |
| DDIM | MasaCtrl |  ddim+masactrl |  |
| DirectInversion(Ours) | MasaCtrl |  directinversion+masactrl |  |
</details>


<details> <summary> run_editing_pnp.py </summary>

| Inversion Method | Editing Method | Index | Explanation
| :-----: | :----: | :----: | :----: |
| DDIM | Plug-and-Play |  ddim+pnp |  |
| DirectInversion(Ours) | Plug-and-Play |  directinversion+pnp |  |
</details>


<details> <summary> run_editing_pix2pix_zero.py </summary>

| Inversion Method | Editing Method | Index | Explanation
| :-----: | :----: | :----: | :----: |
| DDIM | Pix2Pix-Zero |  ddim+pix2pix-zero |  |
| DirectInversion(Ours) | Pix2Pix-Zero |  directinversion+pix2pix-zero |  |
</details>


<details> <summary> run_editing_edict.py </summary>

| Inversion Method | Editing Method | Index | Explanation
| :-----: | :----: | :----: | :----: |
| EDICT | |  edict+direct_forward |  |

</details>



<details> <summary> run_editing_instructdiffusion.py </summary>

| Inversion Method | Editing Method | Index | Explanation
| :-----: | :----: | :----: | :----: |
|  | InstructDiffusion |  instruct-diffusion |  |

</details>


<details> <summary> run_editing_instructpix2pix.py </summary>

| Inversion Method | Editing Method | Index | Explanation
| :-----: | :----: | :----: | :----: |
|  | Instruct-Pix2Pix |  instruct-pix2pix |  |

</details>


<details> <summary> run_editing_blended_latent_diffusion.py </summary>

| Inversion Method | Editing Method | Index | Explanation
| :-----: | :----: | :----: | :----: |
|  | Blended Latent Diffusion |  blended-latent-diffusion |  |

</details>



For example, to run DirectInversion(Ours) + Prompt-to-Prompt, look up its index in `run_editing_p2p.py`, which is `directinversion+p2p`. You can then run editing type 0 with this method through:

```shell
python run_editing_p2p.py --output_path output --edit_category_list 0 --edit_method_list directinversion+p2p
```

You can also run multiple editing methods and multiple editing types with:

```shell
python run_editing_p2p.py --edit_category_list 0 1 2 3 4 5 6 7 8 9 --edit_method_list directinversion+p2p null-text-inversion+p2p
```

Use `--rerun_exist_images` to choose whether to regenerate images that already exist, and `--data_path` and `--output_path` to set the image path and the output path.
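
When launching many runs, it can help to assemble the command from Python instead of typing it by hand. The following is a minimal sketch that only uses flags shown in the commands above; the function name `build_editing_command` is hypothetical and not part of this repository.

```python
import shlex


def build_editing_command(script, categories, methods, output_path="output"):
    """Assemble the argument list for one of the run_editing_*.py scripts."""
    cmd = ["python", script, "--output_path", output_path]
    cmd += ["--edit_category_list", *[str(c) for c in categories]]
    cmd += ["--edit_method_list", *methods]
    return cmd


if __name__ == "__main__":
    cmd = build_editing_command(
        "run_editing_p2p.py",
        categories=range(10),
        methods=["directinversion+p2p", "null-text-inversion+p2p"],
    )
    # Print a shell-ready string; the list itself can go to subprocess.run.
    print(shlex.join(cmd))
```

Keeping the arguments as a list avoids quoting problems if a path contains spaces.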


**Run Any Image**

You can convert your own images and editing prompts to the same format as our benchmark to run a large number of images. You can also adapt the given Python files to a single image of your own. As an example, we provide `run_editing_p2p_one_image.py`, an adapted version of `run_editing_p2p.py`. You can edit one image through:

```shell
python -u run_editing_p2p_one_image.py --image_path scripts/example_cake.jpg --original_prompt "a round cake with orange frosting on a wooden plate" --editing_prompt "a square cake with orange frosting on a wooden plate" --blended_word "cake cake" --output_path "directinversion+p2p.jpg" "ddim+p2p.jpg" --edit_method_list "directinversion+p2p" "ddim+p2p"
```

We also provide a Jupyter notebook demo, `run_editing_p2p_one_image.ipynb`.

Note that we use default parameters in our code, which are not optimal for all images. You may adjust them based on your inputs.

### Evaluation 📐
<span id="evaluation"></span>

You can run evaluation through:

```shell
python evaluation/evaluate.py --metrics "structure_distance" "psnr_unedit_part" "lpips_unedit_part" "mse_unedit_part" "ssim_unedit_part" "clip_similarity_source_image" "clip_similarity_target_image" "clip_similarity_target_image_edit_part" --result_path evaluation_result.csv --edit_category_list 0 1 2 3 4 5 6 7 8 9 --tgt_methods 1_ddim+p2p 1_directinversion+p2p
```

The available choices for `--tgt_methods` are listed in the dict `all_tgt_image_folders` in `evaluation/evaluate.py`. The tree below shows how editing results are organized under `output`:

```
output
  |-- ddim+p2p
    |-- annotation_images
      |-- ...
  |-- directinversion+p2p
    |-- annotation_images
      |-- ...
...    
```
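
Before starting a long evaluation, you may want to confirm that the result folders are in place. The sketch below checks only the layout shown in the tree above (`output/<method>/annotation_images`); the helper name `missing_result_folders` is hypothetical, and the check is an assumption about what is useful, not something `evaluation/evaluate.py` is known to do.

```python
from pathlib import Path


def missing_result_folders(output_root, methods):
    """Return the method names whose annotation_images folder is absent."""
    root = Path(output_root)
    return [m for m in methods if not (root / m / "annotation_images").is_dir()]


if __name__ == "__main__":
    missing = missing_result_folders("output", ["ddim+p2p", "directinversion+p2p"])
    if missing:
        print("No results found for:", ", ".join(missing))
    else:
        print("All result folders are present.")
```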


To evaluate all results of the table shown in our paper, you can run:


```shell
python evaluation/evaluate.py --metrics "structure_distance" "psnr_unedit_part" "lpips_unedit_part" "mse_unedit_part" "ssim_unedit_part" "clip_similarity_source_image" "clip_similarity_target_image" "clip_similarity_target_image_edit_part" --result_path evaluation_result.csv --edit_category_list 0 1 2 3 4 5 6 7 8 9 --tgt_methods 1 --evaluate_whole_table
```

Then, all results in Table 1 will be written to `evaluation_result.csv`.
