Keywords: Generative Models, Stochastic Interpolants, Flow Models
TL;DR: A novel formulation for joint training with stochastic interpolants in a latent space, enabling simultaneous learning of the encoder, decoder, and latent-space generative model.
Abstract: Stochastic Interpolants (SI) are a powerful framework for generative modeling, capable of flexibly transforming between two probability distributions. However, their use in jointly optimized latent variable models remains unexplored, as they require direct access to samples from both distributions. This work presents Latent Stochastic Interpolants (LSI), which enable joint learning in a latent space with an end-to-end optimized encoder, decoder, and latent SI model. We achieve this by developing a principled Evidence Lower Bound (ELBO) objective derived directly in continuous time.
The joint optimization allows LSI to learn effective latent representations along with a generative process that transforms an arbitrary prior distribution into the encoder-defined aggregated posterior.
LSI sidesteps the fixed simple priors of standard diffusion models and mitigates the computational cost of applying SI directly in high-dimensional observation spaces, while preserving the generative flexibility of the SI framework.
We demonstrate the efficacy of LSI through comprehensive experiments on the standard large-scale ImageNet generation benchmark.
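For concreteness, below is a minimal PyTorch-style sketch of one joint training step in the spirit of the abstract: a reparameterized encoder, a reconstruction term, and an SI matching loss in latent space. All names (encoder, decoder, velocity, lsi_step) are hypothetical, the linear interpolant is just one member of the SI family, and the equal loss weighting is a simplification; the paper's actual continuous-time ELBO defines the principled objective.

```python
# Hypothetical sketch of one LSI-style training step; not the paper's method verbatim.
import torch
import torch.nn.functional as F

def lsi_step(encoder, decoder, velocity, x, opt):
    # Encode the observation to a latent sample via a reparameterized
    # Gaussian posterior (assumed encoder output: mean and log-variance).
    mu, logvar = encoder(x)
    z1 = mu + torch.randn_like(mu) * (0.5 * logvar).exp()

    # Reconstruction term of the ELBO (MSE used here for simplicity).
    recon = F.mse_loss(decoder(z1), x)

    # Stochastic interpolant between a prior sample z0 ~ N(0, I) and the
    # posterior sample z1: z_t = (1 - t) z0 + t z1, a linear interpolant
    # whose target velocity is z1 - z0.
    z0 = torch.randn_like(z1)
    t = torch.rand(z1.shape[0], *([1] * (z1.dim() - 1)), device=z1.device)
    zt = (1 - t) * z0 + t * z1
    si = F.mse_loss(velocity(zt, t), z1 - z0)

    # Equal weights are a placeholder; the paper's ELBO sets the weighting.
    loss = recon + si
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```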
Primary Area: generative models
Submission Number: 19694