TL;DR: We analyze the connection between log-density and image detail in flow models and provide tools for detail-aware sampling.
Abstract: Diffusion models have emerged as a powerful class of generative models, capable of producing high-quality images by mapping noise to a data distribution. However, recent findings suggest that image likelihood does not align with perceptual quality: high-likelihood samples tend to be smooth, while lower-likelihood ones are more detailed. Controlling sample density is thus crucial for balancing realism and detail. In this paper, we analyze an existing technique, Prior Guidance, which scales the latent code to influence image detail. We introduce score alignment, a condition that explains why this method works, and show that it can be tractably checked for any continuous normalizing flow model. We then propose Density Guidance, a principled modification of the generative ODE that enables exact log-density control during sampling. Finally, we extend Density Guidance to stochastic sampling, ensuring precise log-density control while allowing controlled variation in structure or fine details. Our experiments demonstrate that these techniques provide fine-grained control over image detail without compromising sample quality. Code is available at https://github.com/Aalto-QuML/density-guidance.
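To make the Prior Guidance baseline concrete: in a continuous normalizing flow / flow-matching model, sampling draws a Gaussian latent and integrates the learned velocity ODE, and Prior Guidance simply rescales that latent, which shifts the log-density (and hence the level of detail) of the resulting sample. Below is a minimal sketch assuming a trained velocity field exposed as `velocity_model(x, t)` and a plain Euler solver; both the function name and the integrator are illustrative assumptions, not the released repository's API.

```python
import torch

def sample_with_scaled_prior(velocity_model, shape, scale=1.0, n_steps=100):
    # Prior-Guidance-style control: rescale the Gaussian latent before
    # integrating the generative ODE. Smaller `scale` tends toward
    # higher-log-density, smoother samples; larger `scale` toward
    # lower-log-density, more detailed ones.
    x = scale * torch.randn(shape)
    ts = torch.linspace(0.0, 1.0, n_steps + 1)
    for t0, t1 in zip(ts[:-1], ts[1:]):
        # Euler step along the learned velocity field v_theta(x, t)
        x = x + (t1 - t0) * velocity_model(x, t0)
    return x
```

Density Guidance, by contrast, modifies the ODE itself so that the sample's log-density follows a prescribed schedule during integration rather than being set only through the initial latent; see the paper and repository for the exact formulation.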
Lay Summary: Modern AI systems can generate incredibly realistic images, but controlling the level of detail in these images is surprisingly tricky. Sometimes the images look too smooth or blurry, while other times they're overly complex or even nonsensical. Why does this happen?
Our research explores how image detail is tied to something called "likelihood" — a measure of how probable an image is, according to the AI model. We found that images the model considers very likely are often too smooth, while the more detailed and realistic ones tend to be less likely.
To fix this, we developed a method called Density Guidance, which gives users precise control over how "likely" or "detailed" the images should be. It works by adjusting the model's internal sampling process in a mathematically principled way. We also extended this method to support randomness, enabling variation in the image’s shape or fine texture — without losing control over quality.
This makes it easier for researchers and creators to generate images with just the right amount of detail, opening the door to more reliable and tunable AI-generated content.
Link To Code: https://github.com/Aalto-QuML/density-guidance
Primary Area: Deep Learning->Generative Models and Autoencoders
Keywords: Diffusion models, likelihood, flow matching
Submission Number: 7065