Keywords: text-guided image editing, latent space manipulations, fluid dynamics, Navier–Stokes equations
TL;DR: A Disentangled Text-Guided Image Editing Method with Fluid Dynamics-Based Editing Path Optimization
Abstract: Text-guided image editing with generative models has recently achieved remarkable progress, yet the underlying dynamics of latent space manipulations remain insufficiently explored. In this work, we propose a perspective that models the latent space of generative models as a high-dimensional Gaussian fluid. Specifically, each latent dimension is regarded as a directional axis of the fluid, and the movement of data points within this space is governed by three interacting forces: a driving force that enforces semantic editing objectives, a resistance force that preserves data consistency, and a central constraint that maintains generation quality. We formalize this process through the Navier–Stokes equations, enabling a principled formulation of latent space dynamics as fluid motion under Gaussian density fields. This fluid-inspired framework provides a unified view for balancing editing directionality, structural coherence, and fidelity. We instantiate our approach on StyleGAN2 for text-guided image editing tasks, where preliminary experiments demonstrate its effectiveness in producing semantically accurate, high-quality, and consistent edits compared to conventional latent manipulation methods. Our results suggest that fluid dynamics offers a powerful new paradigm for understanding and guiding latent space transformations in generative models.
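The abstract describes latent points moving under three interacting forces: a driving force toward the semantic editing objective, a resistance force preserving consistency with the original data, and a central constraint maintaining generation quality. As a rough illustration only (the paper's actual formulation is via the Navier–Stokes equations, which is not reproduced here), the three-force balance can be sketched as a simple gradient-style latent update; all function names, coefficients, and the linear form of each force below are hypothetical choices for exposition, not the authors' method:

```python
import numpy as np

def latent_edit_step(z, z0, edit_dir, alpha=0.1, beta=0.05, gamma=0.01, lr=0.1):
    """One illustrative update of a latent point z (hypothetical sketch).

    z        -- current latent vector
    z0       -- original (source) latent vector
    edit_dir -- unit direction encoding the semantic editing objective
    """
    # Driving force: pushes the latent along the semantic editing direction.
    f_drive = alpha * edit_dir
    # Resistance force: pulls back toward the original latent to preserve
    # data consistency (structural coherence of the edit).
    f_resist = beta * (z0 - z)
    # Central constraint: pulls toward the mode of the Gaussian density
    # field (the origin) to keep the latent in a high-quality region.
    f_center = gamma * (-z)
    return z + lr * (f_drive + f_resist + f_center)

# Example: one step from the source latent moves slightly along edit_dir.
z0 = np.zeros(4)
edit_dir = np.array([1.0, 0.0, 0.0, 0.0])
z1 = latent_edit_step(z0, z0, edit_dir)  # -> [0.01, 0, 0, 0]
```

In this toy picture, iterating the update traces an editing path whose endpoint balances edit strength (alpha), content preservation (beta), and prior adherence (gamma); the fluid-dynamics view replaces these ad hoc linear pulls with a principled velocity field.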
Primary Area: generative models
Submission Number: 16052