Keywords: Consistency Models, Image Editing, Diffusion Models
TL;DR: A novel four-step approach for high-fidelity image editing.
Abstract: Recent advances in diffusion-based image editing have achieved impressive results, offering fine-grained control over the generation process. However, these methods are computationally expensive due to their iterative nature. While distilled diffusion models enable faster inference, their editing capabilities remain limited, primarily because of poor inversion quality. High-fidelity inversion and reconstruction are essential for precise image editing, as they preserve the structural and semantic integrity of the source image. In this work, we propose a simple, general framework that optimizes the diffusion model over the entire inversion and generation trajectory and is compatible with arbitrary accelerated diffusion backbones, enabling high-quality editing in under one second. We achieve state-of-the-art performance across various image editing tasks, accelerated diffusion models, and datasets, demonstrating that our method matches or surpasses full-step diffusion models while being substantially more efficient.
Primary Area: generative models
Submission Number: 19392