Keywords: diffusion models, image editing, scene-centric editing, feedback, Pareto optimization
TL;DR: A feedback-driven diffusion framework that adaptively adjusts conditioning layer by layer to balance structural preservation and semantic alignment.
Abstract: Although text-guided image editing (TIE) has advanced rapidly, most prior works remain object-centric and rely on attention maps or masks to localize and modify specific objects. In this paper, we propose Editing via Dynamic Interactive Tuning (EDIF), a method that adaptively trades off source-image structure and instruction fidelity in challenging scene-centric editing settings. Unlike object editing, scene-centric editing is difficult because the target cannot be clearly localized and edits must preserve global structure. Whereas existing TIE systems typically apply a unified conditioning signal and ignore block-wise variation in the model's internal behavior, we show that, inside the model, the source-image condition and the text-prompt embedding act with layer-dependent directions and strengths. We also demonstrate, both empirically and theoretically, that the editing state can be diagnosed from the source-image signal-to-noise ratio and VLM logits, which indicate whether the edited image faithfully reflects the intended editing prompt. By constructing a Pareto line between these two objectives, EDIF adaptively modulates the source-image and editing-text conditions, guiding each denoising step to stay close to this line for balanced optimization. Extensive experiments on ImgEdit, EmuEdit-Bench, and Places365 show that EDIF achieves state-of-the-art performance across diverse scene-editing scenarios, including indoor and outdoor environments.
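The abstract describes a per-step feedback loop: read out a structure-preservation signal and a semantic-alignment signal, then re-weight the source-image and editing-text conditions before the next denoising step. The sketch below illustrates that control loop only; it is not the authors' implementation, and every argument (the denoise callable, the two score functions, the weights and learning rate) is a hypothetical placeholder for the components named in the abstract.

```python
# Minimal conceptual sketch of a feedback-driven denoising loop that
# re-balances source-image vs. editing-text conditioning at each step.
# All callables passed in are hypothetical placeholders.

def edit_with_feedback(x, timesteps, denoise, structure_score, align_score,
                       w_img=1.0, w_txt=1.0, lr=0.1, target_ratio=1.0):
    """Denoise `x` while nudging the two conditioning weights so the
    structure/alignment trade-off stays near a chosen balance point.

    denoise(x, t, w_img, w_txt) -> next latent under the current weights
    structure_score(x)          -> e.g. an SNR-style preservation proxy
    align_score(x)              -> e.g. a VLM logit for the edit prompt
    """
    for t in timesteps:
        x = denoise(x, t, w_img, w_txt)

        # Feedback: how much source structure survives relative to how
        # well the current estimate matches the editing instruction.
        imbalance = structure_score(x) / (align_score(x) + 1e-8) - target_ratio

        # Too structure-heavy -> relax the image condition and boost the
        # text condition; the reverse if the edit drifts from the source.
        w_img -= lr * imbalance
        w_txt += lr * imbalance
    return x
```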
Supplementary Material: pdf
Primary Area: generative models
Submission Number: 2331