DreamEdit: Subject-driven Image Editing

Published: 21 Dec 2023, Last Modified: 21 Dec 2023, Accepted by TMLR
Abstract: Subject-driven image generation aims to generate images containing customized subjects and has recently drawn considerable attention from the research community. However, previous works cannot precisely control the background and position of the target subject. In this work, we aim to fill this gap in existing subject-driven generation tasks. To this end, we propose two novel subject-driven editing sub-tasks: Subject Replacement and Subject Addition. The new tasks are challenging in multiple respects: replacing a subject with a customized one can completely change its shape, texture, and color, while adding a target subject at a designated position in a provided scene requires a plausible, context-aware posture for the subject. To tackle these two tasks, we first manually curate a new dataset called DreamEditBench, containing 22 different types of subjects and 440 source images, which cover diverse scenarios with different difficulty levels. We plan to host DreamEditBench as a platform and hire trained evaluators for standardized human evaluation. We also devise a method, DreamEditor, that addresses these tasks by performing iterative generation, which enables a smooth adaptation to the customized subject. We conduct automatic and human evaluations to understand the performance of DreamEditor and baselines on DreamEditBench. We find that the new tasks are challenging for existing models. For Subject Replacement, existing models are particularly sensitive to the shape and color of the original subject: when the original subject and the customized subject differ substantially, the failure rate increases dramatically. For Subject Addition, existing models cannot easily blend the customized subject into the background smoothly, which causes noticeable artifacts in the generated image.
We hope that DreamEditBench can become a standardized platform that enables future investigations into more controllable subject-driven image editing. Our project and benchmark homepage is https://dreameditbenchteam.github.io/
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission:
1. Add two more baselines to make the experiments more comprehensive: a) PhotoSwap (Photoswap: Personalized Subject Swapping in Images); b) CopyHarmonize (the “Inpainting + Copy-Paste + Harmonization” baseline mentioned by one of the reviewers).
2. Modify the proposed automatic evaluation metrics (i.e., use segmentation tools to separate the subject from the background), so that the subject-oriented and background-oriented evaluation metrics are independent of each other.
3. Add a demonstration and analysis of failure cases in the Appendix.
4. Add a broader impact statement discussing deepfakes.
5. Clarify how model names are expressed in the tables.
6. Add related work for the Subject Addition task.
7. Add a comparison and analysis of DreamEditor and the Inpainting + Copy-Paste + Harmonization baseline in the Appendix to show our advantages on hard cases.
Code: https://dreameditbenchteam.github.io/
Assigned Action Editor: ~Jia-Bin_Huang1
Submission Number: 1314