TL;DR: We study the geometry of distributions in diffusion models and propose a fine-grained definition of generation rate that is highly correlated with visual saliency and offers a unified framework for various image manipulation tasks.
Abstract: Building on the manifold hypothesis, which suggests that generative models learn data distributions residing on low-dimensional manifolds, this paper investigates the time-varying manifold sequence induced by the generation process in diffusion models through the lens of differential equations. Our primary contribution is the introduction of the \textit{generation rate}, a novel metric that quantifies local manifold scaling over time. For image data, we show that the accumulated generation rate, referred to as the \textit{generation curve}, strongly correlates with intuitive visual properties, such as the saliency of image components. By modifying the generation curves, we propose a unified framework for a range of image manipulation tasks, including semantic transfer, object removal, saliency adjustment, and image blending. Comprehensive evaluations, supported by both qualitative and quantitative results, highlight the effectiveness of our framework across these diverse tasks.
Primary Area: Deep Learning->Generative Models and Autoencoders
Keywords: Manifold analysis; Diffusion model; Image manipulation
Submission Number: 929