Task-Oriented Diffusion Inversion for High-Fidelity Text-based Editing

Yangyang Xu; Wenqi Shao; Yong Du; Haiming Zhu; Yang Zhou; Jiayuan Xie; Ping Luo; Shengfeng He

18 Sept 2024 (modified: 12 Nov 2024) | ICLR 2025 Conference Withdrawn Submission | CC BY 4.0

Keywords: Diffusion Models, Inversion, Image Edit

TL;DR: We introduce a novel framework for high-fidelity and task-specific real image editing by optimizing prompt embeddings across U-Net layers and time steps, achieving superior performance in structure, appearance, and global edits.

Abstract: Recent advancements in text-guided diffusion models have unlocked powerful image manipulation capabilities, yet balancing reconstruction fidelity and editability for real images remains a significant challenge. In this work, we introduce Task-Oriented Diffusion Inversion (TODInv), a novel framework that inverts and edits real images in a manner tailored to specific editing tasks by optimizing prompt embeddings within the extended P* space. By leveraging distinct embeddings across different U-Net layers and time steps, TODInv seamlessly integrates inversion and editing through reciprocal optimization, ensuring both high fidelity and precise editability. This hierarchical editing mechanism categorizes tasks into structure, appearance, and global edits, optimizing only those embeddings unaffected by the current editing task. Extensive experiments on a benchmark dataset demonstrate TODInv's superior performance over existing methods, with both quantitative and qualitative improvements, and show its versatility with a few-step diffusion model.
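
The selection rule described in the abstract (one prompt embedding per U-Net layer and time step, with only the embeddings unaffected by the edit being optimized) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `embeddings_to_optimize`, the layer and step counts, and the task-to-layer assignment in `HYPOTHETICAL_AFFECTED_LAYERS` are all assumptions made for illustration, since the abstract does not say which layers each task category affects.

```python
from itertools import product


def embeddings_to_optimize(num_layers, num_steps, affected_layers, affected_steps=None):
    """Return the (layer, step) embedding slots left trainable during inversion.

    A slot is excluded when the editing task affects it, so that the edit
    prompt, not the optimized embedding, controls that slot.
    """
    if affected_steps is None:
        # Assumption: an affected layer is affected at every time step.
        affected_steps = range(num_steps)
    affected = set(product(affected_layers, affected_steps))
    all_slots = product(range(num_layers), range(num_steps))
    return [slot for slot in all_slots if slot not in affected]


# Hypothetical assignment of task categories to layer indices.
# These values are placeholders, NOT taken from the paper.
HYPOTHETICAL_AFFECTED_LAYERS = {
    "structure": {0, 1},
    "appearance": {2, 3},
}

slots = embeddings_to_optimize(
    num_layers=4,
    num_steps=3,
    affected_layers=HYPOTHETICAL_AFFECTED_LAYERS["structure"],
)
print(len(slots))  # 6 of the 12 slots remain trainable
```

With these placeholder values, a structure edit leaves only the embeddings of layers 2 and 3 trainable, at every time step. How global edits are handled is not stated in the abstract, so the sketch leaves that case to the caller through `affected_layers` and `affected_steps`.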

Primary Area: generative models

Submission Number: 1672
