Generative Model via Quantile Assignment

ICLR 2026 Conference Submission 25359 Authors

20 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: generative models, quantile assignment, optimal transportation, latent representation learning, synthetic data generation
Abstract: Deep generative models (DGMs) play two central roles in modern machine learning: (i) producing new information (e.g., image synthesis, data augmentation, and creative content generation) and (ii) reducing dimensionality (by deriving low-dimensional latent representations). Yet DGMs' versatility comes at the cost of training difficulty. Both information generation and dimension reduction with DGMs require learning the data distribution. While deep neural networks (DNNs) are a natural choice for parameterizing generators, there is no universally reliable method for learning compact latent representations. As a compromise, current approaches introduce an additional DNN: (i) variational autoencoders (VAEs) map data into latent variables through an encoder, and (ii) generative adversarial networks (GANs) employ a discriminator in an adversarial framework. Learning two DNNs simultaneously, however, introduces conceptual and practical difficulties. Conceptually, there is no guarantee that such an encoder/discriminator exists, especially in the form of a DNN. In practice, training encoders/discriminators on high-dimensional inputs can be more data-hungry and unstable than training a generator, which takes only low-dimensional latent data as input. Moreover, jointly training multiple DNNs is unstable, particularly in GANs, leading to convergence issues such as mode collapse. Here, we introduce NeuroSQL, a DGM that learns low-dimensional latent representations without an encoder. Specifically, NeuroSQL learns the latent variables implicitly by solving a linear assignment problem and then passes the latent information to a single generator. To demonstrate NeuroSQL's efficacy, we benchmark it against GANs, VAEs, and a budget-matched diffusion baseline on three independent datasets: human faces from the Large-Scale CelebFaces Attributes dataset (CelebA), animal faces from Animal Faces HQ (AFHQ), and brain images from the Open Access Series of Imaging Studies (OASIS). Compared to VAEs, GANs, and diffusion models within our experimental setup, (1) in terms of image quality, NeuroSQL achieves an overall lower mean pixel distance between synthetic and true images and stronger perceptual/structural fidelity under the same computational setting; (2) computationally, NeuroSQL requires the least training time; and (3) practically, NeuroSQL provides an effective solution for generating synthetic data when training data are limited (e.g., neuroimaging data whose feature dimension exceeds the sample size). Taken together, by embracing quantile assignment instead of an encoder, NeuroSQL offers a fast, stable, and robust way to generate synthetic data with minimal information loss.
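The abstract states the core mechanism only at a high level: latent codes are obtained by solving a linear assignment problem rather than by an encoder, and a single generator is then fit on the resulting (latent, data) pairs. The snippet below is a minimal sketch of that idea under stated assumptions, not the authors' implementation: the random-projection cost matrix, the Gaussian latent codes, the MLP generator, and all names (assign_latents, gen, etc.) are illustrative choices introduced here.

```python
# Minimal sketch of an encoder-free generative model via assignment.
# Latent codes are matched to training samples with a linear assignment
# problem; a single generator is then trained on the assigned pairs.
import numpy as np
import torch
import torch.nn as nn
from scipy.optimize import linear_sum_assignment


def assign_latents(data_flat, latents):
    """Assign one latent code to each data point via linear assignment.

    data_flat: (n, d) flattened training samples.
    latents:   (n, k) candidate latent codes (here: Gaussian draws -- an
               assumption; the paper's quantile construction may differ).
    The cost compares data projected to the latent dimension with the
    candidate codes; the projection is a fixed random map (assumption).
    """
    rng = np.random.default_rng(0)
    proj = rng.standard_normal((data_flat.shape[1], latents.shape[1]))
    proj /= np.linalg.norm(proj, axis=0, keepdims=True)
    data_proj = data_flat @ proj                      # (n, k)
    cost = np.linalg.norm(
        data_proj[:, None, :] - latents[None, :, :], axis=-1
    )                                                 # (n, n) cost matrix
    row, col = linear_sum_assignment(cost)            # exact assignment
    return latents[col]                               # latent per data point


# Toy usage: fit a small MLP generator on the assigned pairs.
n, d, k = 256, 64, 8
data = np.random.rand(n, d).astype(np.float32)
lat = np.random.randn(n, k).astype(np.float32)
z = torch.tensor(assign_latents(data, lat), dtype=torch.float32)
x = torch.tensor(data)

gen = nn.Sequential(nn.Linear(k, 128), nn.ReLU(), nn.Linear(128, d))
opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(gen(z), x)          # pixel-level loss
    loss.backward()
    opt.step()
```

Because the latent codes are fixed before generator training, only one DNN is optimized; there is no encoder or discriminator whose joint training could destabilize the procedure.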
Primary Area: generative models
Submission Number: 25359