Forgedit: Text Guided Image Editing via Learning and Forgetting

20 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Text guided image editing, Diffusion Models, Image manipulation
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: A novel text guided image editing framework that enables Diffusion Models to perform complex rigid and non-rigid image editing while preserving the characteristics of the original image.
Abstract: Text guided image editing is a recently popular but challenging task. It requires an editing model to estimate by itself which part of the image should be edited, and then to perform complicated non-rigid editing while preserving the characteristics of the original image. Previous fine-tuning based approaches are often time-consuming and vulnerable to overfitting, which catastrophically limits their editing capabilities. To tackle these issues, we design a novel text guided image editing method, named Forgedit. First, we propose a novel fine-tuning framework able to reconstruct a given image efficiently by jointly learning vision and language information. Then we introduce vector subtraction and projection mechanisms to explore accurate text embeddings for editing. We also find a general property of UNet structures in Diffusion Models, which inspires novel forgetting strategies that mitigate the fatal overfitting issue and significantly boost the editing abilities of Diffusion Models. Our method, Forgedit, built on Stable Diffusion, achieves new state-of-the-art results on the challenging text guided image editing benchmark TEdBench, surpassing previous SOTA methods such as Imagic (even when built on the stronger Imagen) in terms of both CLIP score and LPIPS score.
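As a minimal illustrative sketch of the vector subtraction and projection mechanisms mentioned in the abstract (not the authors' exact implementation — the function names, the interpolation coefficient `gamma`, and the toy vectors are all hypothetical), the two operations on source and target text embeddings could look like:

```python
import numpy as np

def edit_by_subtraction(e_src, e_tgt, gamma):
    # Vector subtraction: move from the source-prompt embedding toward
    # the target-prompt embedding along their difference direction,
    # scaled by an interpolation coefficient gamma (assumed parameter).
    return e_src + gamma * (e_tgt - e_src)

def edit_by_projection(e_src, e_tgt, gamma):
    # Vector projection: split the target embedding into a component
    # parallel to the source embedding and an orthogonal remainder,
    # then add only the scaled orthogonal (editing) component, so the
    # identity-carrying parallel part of the source is kept intact.
    parallel = (np.dot(e_tgt, e_src) / np.dot(e_src, e_src)) * e_src
    orthogonal = e_tgt - parallel
    return e_src + gamma * orthogonal

# Toy 2-D example (real text embeddings would be high-dimensional).
e_src = np.array([1.0, 0.0])
e_tgt = np.array([0.5, 0.5])
print(edit_by_subtraction(e_src, e_tgt, 1.0))  # [0.5 0.5]
print(edit_by_projection(e_src, e_tgt, 1.0))   # [1.  0.5]
```

With `gamma = 1` the subtraction variant reproduces the target embedding exactly, while the projection variant keeps the full source component and adds only the orthogonal editing direction; sweeping `gamma` trades off edit strength against fidelity to the original image.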
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2491