TGDrag: Adding Semantic Control into Point-based Image Editing via Text Guidance

Published: 2025, Last Modified: 28 May 2026, ICASSP 2025, CC BY-SA 4.0
Abstract: Controllable image generation has emerged as a cutting-edge subject of interest. Current interactive point-based image editing frameworks, such as DragGAN, achieve impressive results in fine-grained and controllable image editing. However, relying solely on point-based manipulation can lead to unintended outcomes, because points alone do not convey the user's semantic intent. To address this issue, we introduce Text-Guided Drag (TGDrag), a novel approach that adds semantic control to point-based image editing by using text prompts to guide the manipulation of handle and target points. Specifically, we design a channel correlation calculator that adaptively selects channels for the text and for the points, mitigating the potential influence of semantic control on point control. Furthermore, we introduce a text loss function that minimizes the discrepancy between the generated images and the text prompts. Experimental results demonstrate that TGDrag provides the intended semantic control while remaining effective at point control.
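The abstract does not give the form of the text loss, so the following is only an illustrative sketch of one common way such a term could be written: a cosine distance between an image embedding and a text embedding, added to a point-based loss. The function names, the `text_weight` parameter, and the assumption that both embeddings come from a joint image-text encoder are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def text_loss(image_embedding, text_embedding):
    """Hypothetical text loss: 0 when the embeddings align, up to 2 when opposed.

    Assumes both embeddings live in a shared image-text space (an assumption,
    not a detail stated in the abstract).
    """
    return 1.0 - cosine_similarity(image_embedding, text_embedding)


def total_loss(point_loss, image_embedding, text_embedding, text_weight=0.5):
    """Hypothetical combined objective: point term plus weighted text term."""
    return point_loss + text_weight * text_loss(image_embedding, text_embedding)
```

In this sketch, a larger `text_weight` would pull the edit more strongly toward the prompt at the possible cost of point accuracy, which is the trade-off the channel correlation calculator is described as mitigating.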