Expressive Score-Based Priors for Distribution Matching with Geometry-Preserving Regularization

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · License: CC BY 4.0
TL;DR: We propose a new VAE-based distribution matching method that uses (1) an expressive score-based prior and (2) a geometry-preserving regularization inspired by Gromov-Wasserstein.
Abstract: Distribution matching (DM) is a versatile domain-invariant representation learning technique that has been applied to tasks such as fair classification, domain adaptation, and domain translation. Non-parametric DM methods struggle with scalability, and adversarial DM approaches suffer from instability and mode collapse. While likelihood-based methods are a promising alternative, they often impose unnecessary biases through fixed priors or require explicit density models (e.g., flows) that can be challenging to train. We address these limitations by introducing a novel approach to training likelihood-based DM using expressive score-based prior distributions. Our key insight is that gradient-based DM training only requires the prior's score function---not its density---allowing us to train the prior via denoising score matching. This approach eliminates biases from fixed priors (e.g., in VAEs), enabling more effective use of geometry-preserving regularization, while avoiding the challenge of learning an explicit prior density model (e.g., a flow-based prior). Our method also demonstrates better stability and computational efficiency than other diffusion-based priors (e.g., LSGM). Furthermore, experiments demonstrate superior performance across multiple tasks, establishing our score-based method as a stable and effective approach to distribution matching. Source code is available at https://github.com/inouye-lab/SAUB.
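The key mechanism can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch example (not the authors' released code; see the repository linked below for the actual implementation) of the two pieces the abstract describes: a prior represented only by its score network, trained with denoising score matching, and an encoder update that uses the score alone, never an explicit density. The names (`ScoreNet`, `dsm_loss`, `prior_term_for_encoder`) and the single fixed noise level are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ScoreNet(nn.Module):
    """Small MLP approximating the score of the (noise-smoothed) latent prior."""
    def __init__(self, dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, z):
        return self.net(z)

def dsm_loss(score_net, z, sigma=0.1):
    """Denoising score matching: perturb z with Gaussian noise and regress the
    network onto the score of the smoothing kernel,
    grad log q(z_tilde | z) = -(z_tilde - z) / sigma**2."""
    noise = torch.randn_like(z)
    z_tilde = z + sigma * noise
    target = -noise / sigma  # equals -(z_tilde - z) / sigma**2
    return ((score_net(z_tilde) - target) ** 2).sum(dim=1).mean()

def prior_term_for_encoder(score_net, z):
    """Surrogate whose gradient w.r.t. z is -score(z), i.e., the gradient of
    -log p(z); the encoder is trained without ever evaluating p(z) itself."""
    s = score_net(z).detach()  # the score is the only prior signal used
    return -(s * z).sum(dim=1).mean()
```

Because the density itself never appears in the encoder update, the prior can be arbitrarily expressive without the normalization constraints that make explicit density models (e.g., flows) hard to train, which is the point the abstract emphasizes.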
Lay Summary: Machine learning systems often need to work fairly across different groups of people or adapt to new environments, but current methods struggle with this. For example, when training AI to recognize images, we want it to work equally well for all demographic groups or to transfer knowledge from one type of data to another. Existing approaches either do not scale to large datasets, are unstable during training, or make overly restrictive assumptions that hurt performance. This paper develops a new method that learns flexible "score-based priors": rather than imposing rigid assumptions, the system learns the underlying patterns in the data through their "gradient," or directional, information. The paper also adds geometry-preserving constraints that help maintain meaningful relationships in the data, such as ensuring that similar images stay close together in the AI's internal representation. This approach makes AI systems more stable, computationally efficient, and better at tasks requiring fairness and adaptability. The method showed superior performance in fair classification (ensuring equal treatment across groups), domain adaptation (applying knowledge from one dataset to another), and image translation tasks. This could lead to more trustworthy AI systems that work fairly across diverse populations and adapt better to new environments.
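For the geometry-preserving idea, a hedged sketch of one simple Gromov-Wasserstein-inspired penalty (an illustrative assumption, not necessarily the paper's exact objective) compares pairwise distances among inputs with pairwise distances among their latent representations:

```python
import torch

def geometry_penalty(x, z, eps=1e-8):
    """Encourage the latent geometry to mirror the input geometry by
    matching scale-normalized pairwise distance matrices."""
    dx = torch.cdist(x.flatten(1), x.flatten(1))  # input-space distances
    dz = torch.cdist(z, z)                        # latent-space distances
    dx = dx / (dx.mean() + eps)                   # remove scale differences
    dz = dz / (dz.mean() + eps)
    return ((dx - dz) ** 2).mean()
```

Penalizing this distortion keeps similar inputs close together in the learned representation, matching the behavior described in the summary.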
Link To Code: https://github.com/inouye-lab/SAUB
Primary Area: General Machine Learning->Unsupervised and Semi-supervised Learning
Keywords: Distribution Matching, Latent Representation, Score-Based Models, Diffusion Models
Submission Number: 10162