Keywords: text-guided image editing, latent space manipulations, fluid dynamics, Navier–Stokes equations
TL;DR: A Disentangled Text-Guided Image Editing Method with Fluid Dynamics-Based Editing Path Optimization
Abstract: Text-guided image editing with generative models has recently achieved remarkable progress, yet the underlying dynamics of latent space manipulations remain insufficiently explored. In this work, we propose a perspective that models the latent space of generative models as a high-dimensional Gaussian fluid. Specifically, each latent dimension is regarded as a directional axis of the fluid, and the movement of data points within this space is governed by three interacting forces: a driving force that enforces semantic editing objectives, a resistance force that preserves data consistency, and a central constraint that maintains generation quality. We formalize this process through the Navier–Stokes equations, enabling a principled formulation of latent space dynamics as fluid motion under Gaussian density fields. This fluid-inspired framework provides a unified view for balancing editing directionality, structural coherence, and fidelity. We instantiate our approach on StyleGAN2 for text-guided image editing tasks, where preliminary experiments demonstrate its effectiveness in producing semantically accurate, high-quality, and consistent edits compared to conventional latent manipulation methods. Our results suggest that fluid dynamics offers a powerful new paradigm for understanding and guiding latent space transformations in generative models.
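The abstract describes latent points moving under three interacting forces: a driving force toward the semantic editing objective, a resistance force preserving consistency with the original data, and a central constraint maintaining generation quality. As a rough illustration only (the paper's actual formulation is via the Navier–Stokes equations, which is not reproduced here), the three-force balance can be sketched as a simple gradient-style latent update; all function names, coefficients, and the linear form of each force below are hypothetical choices for exposition, not the authors' method:

```python
import numpy as np

def latent_edit_step(z, z0, edit_dir, alpha=0.1, beta=0.05, gamma=0.01, lr=0.1):
    """One illustrative update of a latent point z (hypothetical sketch).

    z        -- current latent vector
    z0       -- original (source) latent vector
    edit_dir -- unit direction encoding the semantic editing objective
    """
    # Driving force: pushes the latent along the semantic editing direction.
    f_drive = alpha * edit_dir
    # Resistance force: pulls back toward the original latent to preserve
    # data consistency (structural coherence of the edit).
    f_resist = beta * (z0 - z)
    # Central constraint: pulls toward the mode of the Gaussian density
    # field (the origin) to keep the latent in a high-quality region.
    f_center = gamma * (-z)
    return z + lr * (f_drive + f_resist + f_center)

# Example: one step from the source latent moves slightly along edit_dir.
z0 = np.zeros(4)
edit_dir = np.array([1.0, 0.0, 0.0, 0.0])
z1 = latent_edit_step(z0, z0, edit_dir)  # -> [0.01, 0, 0, 0]
```

In this toy picture, iterating the update traces an editing path whose endpoint balances edit strength (alpha), content preservation (beta), and prior adherence (gamma); the fluid-dynamics view replaces these ad hoc linear pulls with a principled velocity field.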
Primary Area: generative models
Submission Number: 16052