Keywords: Text-based image editing
Abstract: Text-based image editing aims to generate an image that corresponds to the given text prompt, but with the structure of the original source image. Existing methods often rely on attention maps in diffusion models (DMs) for structure preservation, as these features are considered to play a primary role in determining the spatial layout. However, we find that these methods struggle to preserve the spatial layout when applied to few-step DMs (e.g., SDXL-Turbo), limiting their use cases to the slower multi-step DMs (e.g., Stable Diffusion). In this work, we investigate the limitations of these approaches in terms of intermediate feature representations. Our findings indicate that for few-step DMs, the attention layers have less influence in determining the structure. To tackle this, we localize layers within the network that better control spatial layout and inject these features during the editing process. Additionally, we disentangle structural information from other features to avoid conflicts between the injected features and the text prompt. This ensures that the edited image faithfully follows the prompt while preserving the source structure. Our method outperforms existing text-based editing baselines.
Supplementary Material: zip
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7531
Loading