Pareto Variational Autoencoder

ICLR 2026 Conference Submission19557 Authors

19 Sept 2025 (modified: 08 Oct 2025), CC BY 4.0
Keywords: Variational autoencoder, Pareto distribution, Information geometry, Heavy-tail learning, Heavy-tail modeling
Abstract: Incorporating robustness into generative modeling has attracted considerable research interest. To this end, we introduce a new class of multivariate power-law distributions, the symmetric Pareto (symPareto) distribution, which can be viewed as an $\ell_1$-norm-based counterpart of the multivariate $t$ distribution. The symPareto distribution possesses many attractive information-geometric properties with respect to the $\gamma$-power divergence, which naturally accommodates power-law families. Leveraging the joint-minimization view of variational inference, we propose the ParetoVAE, a probabilistic autoencoder that minimizes the $\gamma$-power divergence between two statistical manifolds. ParetoVAE employs the symPareto distribution for both the prior and the encoder, with flexible decoder options including Student's $t$ and symPareto distributions. Empirical evidence demonstrates ParetoVAE's effectiveness across multiple domains as the decoder type varies: the $t$ decoder achieves superior performance in sparse, heavy-tailed data reconstruction and word frequency analysis, while the symPareto decoder enables robust high-dimensional denoising.
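To make the stated analogy concrete, the following is a minimal sketch of what an $\ell_1$-norm-based power-law density could look like next to the familiar $\ell_2$-based multivariate $t$. The exact functional form, parameter names (`alpha`, `sigma`), and normalization are assumptions for illustration, not the authors' definition from the paper.

```python
import numpy as np

def sym_pareto_logpdf_unnorm(x, alpha=2.0, sigma=1.0):
    """Unnormalized log-density of an *assumed* symPareto form:
    p(x) proportional to (1 + ||x||_1 / sigma)^-(alpha + d).
    Hypothetical sketch: tails decay polynomially along every direction."""
    x = np.asarray(x, dtype=float)
    d = x.size
    return -(alpha + d) * np.log1p(np.abs(x).sum() / sigma)

def student_t_logpdf_unnorm(x, nu=2.0):
    """Unnormalized log-density of the multivariate t, the l2 analogue:
    p(x) proportional to (1 + ||x||_2^2 / nu)^-((nu + d) / 2)."""
    x = np.asarray(x, dtype=float)
    d = x.size
    return -0.5 * (nu + d) * np.log1p((x ** 2).sum() / nu)
```

Both log-densities grow only logarithmically in the norm of `x`, the defining heavy-tail property; they differ in which norm (and hence which level-set geometry) drives the decay.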
Supplementary Material: zip
Primary Area: probabilistic methods (Bayesian methods, variational inference, sampling, UQ, etc.)
Submission Number: 19557