Keywords: Unsupervised Generative Model, VAE, log hyperbolic cosine loss
TL;DR: We propose training VAEs with a new reconstruction loss, the log hyperbolic cosine (log-cosh) loss, which significantly improves the output quality of VAE and its variants, as measured by sharpness and FID score.
Abstract: In the Variational Auto-Encoder (VAE), the default choice of reconstruction loss between the decoded sample and the input is the squared $L_2$ loss. We propose to replace it with the log hyperbolic cosine (log-cosh) loss, which behaves like $L_2$ for small values and like $L_1$ for large values, and is differentiable everywhere. Compared with $L_2$, the log-cosh loss improves the reconstruction without damaging the latent space optimization, thus automatically keeping a balance between reconstruction and generation. Extensive experiments on the MNIST and CelebA datasets show that the log-cosh reconstruction loss significantly improves the performance of VAE and its variants in output quality, measured by sharpness and FID score. In addition, the gradient of the log-cosh loss is a simple tanh function, which makes implementing gradient descent as simple as adding one line of code.
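
As a rough illustration of the loss described in the abstract, the following NumPy sketch evaluates log-cosh on the reconstruction residual and returns its gradient as a tanh. The function names (`log_cosh`, `log_cosh_loss`, `log_cosh_grad`), the sum reduction, and the absence of any scale parameter are assumptions made for this example, not details taken from the abstract. The numerically stable rewrite $\log\cosh(t) = |t| + \log(1 + e^{-2|t|}) - \log 2$ is a standard identity and also makes the two regimes visible: roughly $t^2/2$ near zero and $|t| - \log 2$ for large $|t|$.

```python
import numpy as np

def log_cosh(t):
    """Elementwise log(cosh(t)), written to avoid overflow for large |t|."""
    a = np.abs(t)
    # Identity: log(cosh(t)) = |t| + log(1 + exp(-2|t|)) - log(2)
    return a + np.log1p(np.exp(-2.0 * a)) - np.log(2.0)

def log_cosh_loss(x, x_hat):
    """Assumed reduction: sum of log-cosh over all residual entries."""
    return np.sum(log_cosh(x_hat - x))

def log_cosh_grad(x, x_hat):
    """Gradient of log_cosh_loss with respect to x_hat: tanh of the residual."""
    return np.tanh(x_hat - x)

# Toy input and reconstruction (hypothetical values, for illustration only).
x = np.array([0.0, 0.5, 1.0])
x_hat = np.array([0.01, 0.4, 6.0])
loss = log_cosh_loss(x, x_hat)
grad = log_cosh_grad(x, x_hat)
```

Because tanh is bounded in (-1, 1), each residual entry contributes a gradient of magnitude below one, whereas the gradient of a squared $L_2$ term grows linearly with the residual.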