Abstract: This paper presents a novel method for analyzing the latent space geometry of generative models, including statistical physics models and diffusion models, by reconstructing the Fisher information metric. The method approximates the posterior distribution of latent variables given generated samples and uses it to learn the log-partition function, whose Hessian defines the Fisher metric for exponential families. Theoretical convergence guarantees are provided, and the method is validated on the Ising and TASEP models, where it outperforms existing baselines in reconstructing thermodynamic quantities. Applied to diffusion models, the method reveals a fractal structure of phase transitions in the latent space, characterized by abrupt changes in the Fisher metric. We demonstrate that while geodesic interpolations are approximately linear within individual phases, this linearity breaks down at phase boundaries, where the diffusion model exhibits a divergent Lipschitz constant with respect to its latent coordinates. These findings provide new insight into the complex structure of diffusion-model latent spaces and its connection to phenomena such as phase transitions.
Our source code is available at \url{https://github.com/alobashev/hessian-geometry-of-diffusion-models}.
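To make the abstract's central identity concrete: for an exponential family $p(x\mid\theta) \propto \exp(\theta \cdot T(x))$, the Fisher metric is the Hessian of the log-partition function, $g_{ij}(\theta) = \partial_{\theta_i}\partial_{\theta_j}\log Z(\theta)$. The snippet below is a minimal illustrative sketch, not the authors' implementation: `log_Z` is a hypothetical stand-in (a quadratic toy corresponding to a Gaussian family) for a learned log-partition network, and the metric is recovered by automatic differentiation.

```python
# Minimal sketch: Fisher metric as the Hessian of the log-partition function.
# `log_Z` is an illustrative toy, not the learned network from the paper.
import torch

def log_Z(theta: torch.Tensor) -> torch.Tensor:
    # Toy Gaussian family: log Z(theta) = 0.5 * ||theta||^2,
    # so its Hessian (the Fisher metric) is the identity matrix.
    return 0.5 * (theta ** 2).sum()

theta = torch.tensor([0.3, -1.2])
g = torch.autograd.functional.hessian(log_Z, theta)
print(g)  # tensor([[1., 0.], [0., 1.]]) -- the Fisher metric at theta
```

Substituting a learned scalar network for `log_Z` would yield the reconstructed metric at any latent point, which is one way abrupt metric changes at phase boundaries could be probed.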
Lay Summary: Diffusion and other generative models learn to create complex data—like images—by moving through an internal “latent” space. However, this latent space is not intuitive and hides surprising geometric features, making it difficult to understand why models sometimes jump between very different outputs. We built a mathematical “map” of that landscape. Borrowing ideas from physics and information geometry, we learn the Fisher metric—a kind of built-in ruler—directly from the images a model produces. Our method works for classic physics simulations (the Ising and TASEP models) and for cutting-edge diffusion generators like Stable Diffusion. When we apply our technique to modern diffusion models, we uncover a striking, fractal-like pattern of sharp “phase transitions” in their latent space. Within each phase, straightforward paths between points work well, but at the boundaries, tiny changes can trigger huge jumps—explaining why these models sometimes behave unpredictably. Our findings pave the way for more reliable and interpretable generative AI.
Link To Code: https://github.com/alobashev/hessian-geometry-of-diffusion-models
Primary Area: Deep Learning->Generative Models and Autoencoders
Keywords: Fisher metric, Hessian metric, diffusion models, generative models
Submission Number: 12408