VAE-CycleGAN: Variational Latent Representation for Unpaired Image-to-Image Translation

18 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: GAN, cycleGAN, autoencoder, variational autoencoder, VAE, unpaired data, cycle consistency, adversarial, translation, reconstruction, perception, distortion, satellite, maps, aerial, review
TL;DR: We review autoencoder and variational autoencoder (VAE) variants for unpaired image-to-image translation and introduce VAE-CycleGAN to sample the posterior distribution.
Abstract: Image-to-image translation plays a central role in computer vision, enabling applications such as style transfer, domain adaptation, and image enhancement. While recent advances have achieved strong paired translation results, learning mappings in unpaired settings remains challenging. In this work, we present a systematic comparison of autoencoder and variational autoencoder (VAE) variants for unpaired image-to-image translation, using paired data solely as a reference baseline. To capture distributional uncertainty, we introduce VAE-CycleGAN, a unified probabilistic framework that integrates variational inference into the CycleGAN architecture. Our method combines adversarial training and cycle-consistency with a VAE’s probabilistic latent space, allowing the model to approximate the true posterior distribution. Further, the architecture achieves a 256$\times$ spatial compression, efficiently encoding the input into a compact latent representation. Empirical results on the satellite-to-map benchmark demonstrate that VAE-CycleGAN generates high-quality translated images (FID: 67.75) and achieves reconstruction fidelity (MSE: 0.0010, PSNR: 29.85 dB, SSIM: 0.7873) comparable to state-of-the-art deterministic approaches, without hyperparameter tuning. On the summer-to-winter and label-to-cityscape datasets, VAE-CycleGAN performs comparably to state-of-the-art UNSB at 1 step and far surpasses UNIT-DDPM at 1000 steps, while the deterministic AE-CycleGAN is comparable to the 5-step UNSB variant.
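The two mechanical ingredients named in the abstract — sampling the latent via the VAE reparameterization trick and the 256$\times$ spatial compression — can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: the latent channel count and tensor layout are assumptions, and a real model would produce `mu` and `logvar` with a convolutional encoder.

```python
import numpy as np

def reparameterize(mu, logvar, rng):
    # Sample z ~ N(mu, sigma^2) via the reparameterization trick; in a real
    # framework this keeps the sampling step differentiable w.r.t. mu, logvar.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

# A 256x256 input compressed 256x in spatial extent yields a 16x16 latent grid
# (16 * 16 positions = 256x fewer than 256 * 256). The channel count C is an
# assumption for illustration only.
H, W, C = 256, 256, 8
h = w = int((H * W / 256) ** 0.5)  # 16

rng = np.random.default_rng(0)
mu = np.zeros((h, w, C))      # encoder mean (placeholder)
logvar = np.zeros((h, w, C))  # encoder log-variance (placeholder)
z = reparameterize(mu, logvar, rng)
print(z.shape)  # (16, 16, 8)
```

With `mu = 0` and `logvar = 0` the sample reduces to a standard normal draw at each latent position; during training the encoder predicts both tensors and a KL term regularizes them toward this prior.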
Primary Area: generative models
Submission Number: 13034