Riemannian Information Geometry of Variational Autoencoder Latent Spaces: Curvature, Geodesics, and Posterior Collapse Prevention
Keywords: Variational autoencoders, VAE, Riemannian geometry, Fisher information metric, posterior collapse, natural gradient, geodesics, curvature, Langevin dynamics, information geometry
TL;DR: We equip the VAE latent space with the Fisher-Riemannian metric; its curvature governs posterior collapse, and geometry-aware training and sampling improve convergence and FID.
Abstract: We develop a comprehensive Riemannian geometric framework for analyzing and improving Variational Autoencoders (VAEs) by equipping the latent space with the Fisher information metric induced by the decoder distribution. Our key insight is that the ELBO optimization landscape is a non-convex Riemannian manifold whose curvature directly governs posterior collapse, mode coverage, and generation quality. We prove three main results: (1) a *curvature-collapse theorem* showing that posterior collapse occurs precisely when the sectional curvature of the latent manifold exceeds a critical threshold $\kappa_c = \frac{1}{2\sigma^2_{\text{decoder}}}$, providing the first geometric characterization of this failure mode; (2) a *natural gradient algorithm* for VAE training that follows geodesics on the Fisher information manifold, achieving 3--5$\times$ faster convergence than Adam with provable convergence to local minima; (3) a *curvature-aware sampling procedure* using Riemannian Langevin dynamics that generates samples along geodesics rather than Euclidean straight lines, improving FID by 15--22\% on standard benchmarks. Experiments on MNIST, CelebA, and CIFAR-10 validate our theoretical predictions and demonstrate that geometry-aware training eliminates posterior collapse without requiring ad-hoc fixes like beta-annealing or free bits.
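To make the sampling claim concrete, here is a minimal sketch (not the authors' released code) of curvature-aware sampling via metric-preconditioned Langevin dynamics in the latent space. It assumes a Gaussian decoder $p(x \mid z) = \mathcal{N}(\mu_\theta(z), \sigma^2 I)$, so the pulled-back Fisher metric is $G(z) = J(z)^\top J(z) / \sigma^2$ with $J(z)$ the Jacobian of the decoder mean; the metric-derivative correction term of full Riemannian Langevin dynamics is omitted for brevity, and all names (`pullback_fisher_metric`, `riemannian_langevin_step`, the toy decoder) are illustrative assumptions rather than the paper's actual implementation.

```python
# Sketch: Riemannian Langevin sampling in a VAE latent space, assuming a
# Gaussian decoder with fixed variance sigma2 so the pulled-back Fisher
# metric is G(z) = J(z)^T J(z) / sigma2 (J = Jacobian of the decoder mean).
import torch

def pullback_fisher_metric(decoder, z, sigma2=0.1, jitter=1e-4):
    """G(z) = J^T J / sigma2 + jitter * I for a decoder N(decoder(z), sigma2 * I)."""
    J = torch.autograd.functional.jacobian(lambda u: decoder(u).flatten(), z)  # (D, d)
    d = z.shape[0]
    return J.T @ J / sigma2 + jitter * torch.eye(d)

def riemannian_langevin_step(decoder, z, step=1e-2, sigma2=0.1):
    """One metric-preconditioned Langevin step targeting the standard-normal prior p(z).
    Note: the Christoffel/metric-derivative correction of full Riemannian
    Langevin dynamics is dropped here to keep the sketch short."""
    G = pullback_fisher_metric(decoder, z, sigma2)
    G_inv = torch.linalg.inv(G)
    grad_log_p = -z                                   # grad of log N(z; 0, I)
    noise = torch.distributions.MultivariateNormal(
        torch.zeros_like(z), covariance_matrix=step * G_inv).sample()
    return z + 0.5 * step * (G_inv @ grad_log_p) + noise

if __name__ == "__main__":
    d_latent, d_data = 2, 8
    decoder = torch.nn.Sequential(                    # toy decoder standing in for a trained one
        torch.nn.Linear(d_latent, 32), torch.nn.Tanh(), torch.nn.Linear(32, d_data))
    z = torch.randn(d_latent)
    for _ in range(100):                              # run the chain for a few steps
        z = riemannian_langevin_step(decoder, z)
    print("sampled latent:", z)
```

In this sketch the metric inverse acts as a state-dependent preconditioner, so steps shrink in directions where the decoder is highly sensitive (large Jacobian) and grow in flat directions, which is the intuition behind sampling "along the geometry" rather than along Euclidean straight lines.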
Submission Number: 147