Multi-Rate VAE: Train Once, Get the Full Rate-Distortion Curve

Juhan Bae; Michael R. Zhang; Michael Ruan; Eric Wang; So Hasegawa; Jimmy Ba; Roger Baker Grosse

Multi-Rate VAE: Train Once, Get the Full Rate-Distortion Curve

Juhan Bae, Michael R. Zhang, Michael Ruan, Eric Wang, So Hasegawa, Jimmy Ba, Roger Baker Grosse

Published: 01 Feb 2023, Last Modified: 22 Jun 2025ICLR 2023 notable top 5%Readers: Everyone

Keywords: Variational Autoencoders, VAEs, Hypernetworks, Response Functions, Hyperparameter Tuning

Abstract: Variational autoencoders (VAEs) are powerful tools for learning latent representations of data used in a wide range of applications. In practice, VAEs usually require multiple training rounds to choose the amount of information the latent variable should retain. This trade-off between the reconstruction error (distortion) and the KL divergence (rate) is typically parameterized by a hyperparameter $\beta$. In this paper, we introduce Multi-Rate VAE (MR-VAE), a computationally efficient framework for learning optimal parameters corresponding to various $\beta$ in a single training run. The key idea is to explicitly formulate a response function using hypernetworks that maps $\beta$ to the optimal parameters. MR-VAEs construct a compact response hypernetwork where the pre-activations are conditionally gated based on $\beta$. We justify the proposed architecture by analyzing linear VAEs and showing that it can represent response functions exactly for linear VAEs. With the learned hypernetwork, MR-VAEs can construct the rate-distortion curve without additional training and can be deployed with significantly less hyperparameter tuning. Empirically, our approach is competitive and often exceeds the performance of multiple $\beta$-VAEs training with minimal computation and memory overheads.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Submission Guidelines: Yes

Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning

TL;DR: MR-VAEs can construct the rate-distortion curve in a single training run.

Supplementary Material: zip

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 4 code implementations](https://www.catalyzex.com/paper/multi-rate-vae-train-once-get-the-full-rate/code)

13 Replies

Loading