VAE-CycleGAN: Variational Latent Representation for Unpaired Image-to-Image Translation

ICLR 2026 Conference Submission 13034 Authors

18 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: GAN, cycleGAN, autoencoder, variational autoencoder, VAE, unpaired data, cycle consistency, adversarial, translation, reconstruction, perception, distortion, satellite, maps, aerial, review
TL;DR: We review autoencoder and variational autoencoder (VAE) variants for unpaired image-to-image translation and introduce VAE-CycleGAN to sample the posterior distribution.
Abstract: Image-to-image translation plays a central role in computer vision, enabling applications such as style transfer, domain adaptation, and image enhancement. While recent advances have achieved strong results in the paired setting, learning mappings from unpaired data remains challenging. In this work, we present a systematic comparison of autoencoder and variational autoencoder (VAE) variants for unpaired image-to-image translation, using paired data solely as a reference baseline. To capture distributional uncertainty, we introduce VAE-CycleGAN, a unified probabilistic framework that integrates variational inference into the CycleGAN architecture. Our method combines adversarial training and cycle consistency with a VAE’s probabilistic latent space, allowing the model to approximate the true posterior distribution. Further, the architecture achieves 256× spatial compression, encoding the input into a compact latent representation. Empirical results on the satellite-to-map benchmark demonstrate that VAE-CycleGAN generates high-quality translated images (FID: 69.25, KID: 0.0378) and achieves reconstruction fidelity (MSE: 0.0011, PSNR: 29.67 dB, SSIM: 0.7804) comparable to state-of-the-art deterministic approaches, without hyperparameter tuning.
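For concreteness, a plausible form of the combined training objective is sketched below, assuming the standard CycleGAN adversarial and cycle-consistency terms plus KL regularization of the variational latents; the translators G: X→Y and F: Y→X, discriminators D_X and D_Y, encoders q_φ and q_ψ, and weights λ_cyc and λ_KL are illustrative notation, not terms taken from the submission itself:

\mathcal{L}(G, F, D_X, D_Y) = \mathcal{L}_{\mathrm{GAN}}(G, D_Y, X, Y) + \mathcal{L}_{\mathrm{GAN}}(F, D_X, Y, X) + \lambda_{\mathrm{cyc}} \, \mathcal{L}_{\mathrm{cyc}}(G, F) + \lambda_{\mathrm{KL}} \left[ D_{\mathrm{KL}}\!\big(q_{\phi}(z_x \mid x) \,\|\, p(z)\big) + D_{\mathrm{KL}}\!\big(q_{\psi}(z_y \mid y) \,\|\, p(z)\big) \right]

where z_x ~ q_φ(z|x) and z_y ~ q_ψ(z|y) are drawn via the reparameterization trick before decoding, and p(z) is a standard Gaussian prior over the compact latent representation.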
Primary Area: generative models
Submission Number: 13034