Auto DragGAN: Editing the Generative Image Manifold in an Autoregressive Manner

Published: 20 Jul 2024, Last Modified: 21 Jul 2024
Venue: MM2024 Poster
License: CC BY 4.0
Abstract:

Pixel-level fine-grained image editing remains an open challenge. Previous works fail to achieve an ideal trade-off between control granularity and inference speed: they either lack pixel-level fine-grained control, or their inference speed leaves room for improvement. To address this, this paper is the first to employ a regression-based network to learn the variation patterns of StyleGAN latent codes during the image dragging process. This method enables pixel-level precision in drag editing at little time cost. Users can specify handle points and target points on any GAN-generated image, and our method moves each handle point to its corresponding target point. To achieve this, we decompose the entire movement process into multiple sub-processes. Specifically, we develop an encoder-decoder based network named 'Latent Predictor' to predict the latent code motion trajectories from handle points to target points in an autoregressive manner. Moreover, to enhance prediction stability, we introduce a component named 'Latent Regularizer', which constrains the latent code motion to stay within the distribution of natural images. Extensive experiments demonstrate that our method achieves state-of-the-art (SOTA) inference speed and image editing performance at pixel-level granularity.
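
The decomposition into sub-processes and the autoregressive prediction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `max_step` parameter, the straight-line interpolation of sub-targets, and the toy predictor and regularizer are all assumptions made for illustration. In the paper, the predictor is a learned encoder-decoder network and the latent code belongs to StyleGAN.

```python
import numpy as np

def split_into_substeps(handle, target, max_step=2.0):
    """Split one handle-to-target drag into short, evenly spaced sub-targets.

    Assumption: sub-targets lie on the straight line between the two points.
    """
    handle = np.asarray(handle, dtype=float)
    target = np.asarray(target, dtype=float)
    distance = np.linalg.norm(target - handle)
    n_steps = max(1, int(np.ceil(distance / max_step)))
    # Fractions 1/n, 2/n, ..., 1 along the line towards the target.
    fractions = np.arange(1, n_steps + 1) / n_steps
    return handle + fractions[:, None] * (target - handle)

def autoregressive_drag(latent, handle, target, predictor, regularizer, max_step=2.0):
    """Feed each predicted latent code back in as input to the next sub-step."""
    current = np.asarray(handle, dtype=float)
    trajectory = [latent]
    for sub_target in split_into_substeps(handle, target, max_step):
        latent = predictor(latent, current, sub_target)  # hypothetical 'Latent Predictor'
        latent = regularizer(latent)                     # hypothetical 'Latent Regularizer'
        trajectory.append(latent)
        current = sub_target
    return trajectory

# Toy stand-ins (assumptions) so the sketch runs without a trained network.
def toy_predictor(latent, handle, sub_target):
    out = latent.copy()
    out[:2] += sub_target - handle  # pretend the first two dims track the point
    return out

def toy_regularizer(latent, radius=10.0):
    # Keep the latent code inside a ball, a crude proxy for "staying in distribution".
    norm = np.linalg.norm(latent)
    return latent if norm <= radius else latent * (radius / norm)

trajectory = autoregressive_drag(
    np.zeros(4), (0, 0), (3, 4), toy_predictor, toy_regularizer
)
```

Here a drag of length 5 with `max_step=2.0` is split into three sub-steps, so `trajectory` holds the initial latent code plus three predicted ones. A real system would replace the toy callables with trained models and decode each latent code with the generator.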

Primary Subject Area: [Generation] Generative Multimedia
Secondary Subject Area: [Experience] Interactions and Quality of Experience, [Content] Media Interpretation, [Experience] Multimedia Applications
Relevance To Conference: We explore the application of interactive generative multimedia, allowing users to edit images by dragging pixels. Extensive experiments demonstrate that our method achieves state-of-the-art (SOTA) inference speed and image editing performance at pixel-level granularity.
Supplementary Material: zip
Submission Number: 471