TL;DR: A fast and high-quality drag-based image editing approach on general images.
Abstract: Accuracy and speed are critical in image editing tasks. Pan et al. introduced a drag-based framework using Generative Adversarial Networks, and subsequent studies have leveraged large-scale diffusion models. However, these methods often require over a minute per edit and exhibit low success rates. We present LightningDrag, which achieves high-quality drag-based editing in about one second on general images. By redefining drag-based editing as a conditional generation task, we eliminate the need for time-consuming latent optimization or gradient-based guidance. Our model is trained on large-scale paired video frames, capturing diverse motion (object translations, pose shifts, zooming, etc.) to significantly improve accuracy and consistency. Despite being trained only on videos, our model generalizes to local deformations beyond the training data (e.g., lengthening hair, twisting rainbows). Extensive evaluations confirm the superiority of our approach, and we will release both code and model.
Lay Summary: Editing images by “dragging” parts of them (like stretching a smile or moving an object) is powerful but often slow or limited to narrow domains. Our work introduces LightningDrag, a new AI tool that enables fast and precise drag-based editing on a wide variety of images. Instead of relying on manually labeled data, our method learns from videos, using them to understand how parts of objects move and deform. We also design dedicated inference strategies to improve both the realism of the results and the accuracy of the edits. LightningDrag runs in under a second and works on general images, from faces to abstract scenes, without being confined to a single type of object.
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Link To Code: https://github.com/magic-research/LightningDrag
Primary Area: Applications->Computer Vision
Keywords: drag-based image editing, diffusion model
Submission Number: 9847