Keywords: Generative Modeling, Flow Matching, Image Generation, Optimal Transport, Riemannian Flow Matching, Manifold Learning
Abstract: Recent advances in image generation, including diffusion models and flow matching, have achieved remarkable success by building on rigorous mathematical foundations. Moreover, when the underlying data manifold is known, geometry-aware generative models that leverage differential-geometric tools have demonstrated superior performance by exploiting the intrinsic geometric structure. Natural images, however, lack explicit geometric priors, so existing methods operate solely in high-dimensional Euclidean space even though the data may obey geometric constraints. In this work, we investigate the underlying geometric structure of natural images. Through directional decomposition analysis, we observe that the majority of semantic information in an image is encoded in its directional component, while the scalar component can be effectively approximated by the global dataset mean with minimal impact on quality. This property holds not only in RGB space but also in various latent spaces, indicating that natural images can generally be modeled as points on a hypersphere. Building on this insight, we introduce two geometry-aware image flow matching methods: Spherical Optimal Transport Flow Matching (SOT-CFM), which leverages angular distance metrics, and Spherical Riemannian Flow Matching (S-RFM), which constrains the dynamics directly to the hypersphere. Experiments on CIFAR-10 and ImageNet confirm that our spherical methods outperform their Euclidean counterparts, paving the way for future advances in geometry-aware image generative modeling.
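As an illustration of the directional decomposition described in the abstract, the following is a minimal NumPy sketch, not the authors' implementation. It assumes each image is flattened into a vector, that the "scalar component" is the Euclidean norm, and that the "directional component" is the unit vector obtained by dividing by that norm. The function names (decompose, reconstruct_with_mean_norm, angular_distance, slerp) are hypothetical, and the random array is only a stand-in for real image data. The slerp helper shows one standard way to move along a great-circle geodesic on the hypersphere; whether S-RFM uses exactly this interpolant is an assumption.

```python
import numpy as np

def decompose(x, eps=1e-12):
    """Split samples of shape (n, d) into norms (n,) and unit directions (n, d)."""
    norms = np.linalg.norm(x, axis=1)
    directions = x / np.maximum(norms, eps)[:, None]
    return norms, directions

def reconstruct_with_mean_norm(directions, mean_norm):
    """Rebuild samples using one shared scalar (e.g. the dataset-mean norm)."""
    return directions * mean_norm

def angular_distance(u, v):
    """Great-circle distance (radians) between unit vectors along the last axis."""
    cos = np.clip(np.sum(u * v, axis=-1), -1.0, 1.0)
    return np.arccos(cos)

def slerp(u, v, t, eps=1e-7):
    """Point at fraction t along the hypersphere geodesic from u to v."""
    theta = np.asarray(angular_distance(u, v))[..., None]
    sin_theta = np.sin(theta)
    safe = sin_theta > eps
    denom = np.where(safe, sin_theta, 1.0)
    # Fall back to linear weights when u and v are nearly parallel.
    a = np.where(safe, np.sin((1.0 - t) * theta) / denom, 1.0 - t)
    b = np.where(safe, np.sin(t * theta) / denom, t)
    out = a * u + b * v
    # Renormalize to guard against floating-point drift off the sphere.
    return out / np.linalg.norm(out, axis=-1, keepdims=True)

# Stand-in for flattened images (assumption: NOT real data).
rng = np.random.default_rng(0)
x = rng.random((8, 3 * 32 * 32))

norms, dirs = decompose(x)
x_hat = reconstruct_with_mean_norm(dirs, norms.mean())
mid = slerp(dirs[0], dirs[1], 0.5)
```

Here x_hat keeps every sample's direction but gives all samples the same norm, which is the approximation the abstract reports as having minimal impact on quality; mid lies on the unit hypersphere halfway between the two directions in angular distance.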
Primary Area: generative models
Submission Number: 2946