Keywords: image editing
Abstract: While recent flow-based image editing models demonstrate general-purpose capabilities across diverse tasks, they often struggle in challenging scenarios, particularly those involving large-scale shape transformations.
When performing such structural edits, these methods either fail to achieve the intended shape change or inadvertently alter non-target regions, resulting in degraded background quality.
We propose $\textbf{Follow-Your-Shape}$, a training- and mask-free framework that supports precise and controllable editing of object shapes while strictly preserving non-target content.
Motivated by the divergence between inversion and editing trajectories, we compute a $\textbf{Trajectory Divergence Map (TDM)}$ by comparing token-wise velocity differences between the inversion and denoising paths.
The TDM enables precise localization of editable regions and guides a $\textbf{Scheduled KV Injection}$ mechanism that ensures stable and faithful editing.
To facilitate rigorous evaluation, we introduce $\textit{\textbf{ReShapeBench}}$, a benchmark comprising 120 new images and enriched prompt pairs curated specifically for shape-aware editing.
Experiments demonstrate that our method achieves superior editability and visual fidelity, particularly in tasks requiring large-scale shape replacement.
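As a rough illustration of the token-wise comparison described in the abstract, the following NumPy sketch computes a divergence map from two sets of velocities and uses it to choose between source and edited key/value features. It is not the authors' implementation: the function names, the array shapes, the L2 norm, the mean over timesteps, the min-max normalization, and the fixed threshold are all assumptions made for illustration, and the scheduling of the injection over timesteps is not modeled.

```python
import numpy as np


def trajectory_divergence_map(v_inv, v_edit, eps=1e-8):
    """Hypothetical TDM sketch.

    v_inv, v_edit: arrays of shape (steps, tokens, dim) holding the
    velocities on the inversion and denoising (editing) paths.
    Returns an array of shape (tokens,) scaled to [0, 1].
    """
    v_inv = np.asarray(v_inv, dtype=np.float64)
    v_edit = np.asarray(v_edit, dtype=np.float64)
    if v_inv.shape != v_edit.shape:
        raise ValueError("velocity arrays must have the same shape")

    # Token-wise magnitude of the velocity difference at each step.
    per_step = np.linalg.norm(v_edit - v_inv, axis=-1)  # (steps, tokens)

    # Assumption: aggregate over timesteps with a plain mean.
    tdm = per_step.mean(axis=0)  # (tokens,)

    # Assumption: min-max normalize so a single threshold can be applied.
    lo, hi = tdm.min(), tdm.max()
    return (tdm - lo) / (hi - lo + eps)


def blend_kv(kv_source, kv_edit, tdm, threshold=0.5):
    """Hypothetical KV selection guided by the map.

    Tokens with low divergence keep the source features (preserving
    non-target content); tokens with high divergence take the edited ones.
    kv_source, kv_edit: arrays of shape (tokens, dim).
    """
    keep_source = (np.asarray(tdm) < threshold)[:, None]  # (tokens, 1)
    return np.where(keep_source, kv_source, kv_edit)
```

In this sketch, a token whose velocities differ between the two paths receives a high map value and is treated as editable, while the remaining tokens reuse the source features unchanged.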
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 6089