Towards Evaluating the Representation Learned by Variational AutoEncoders

SICE 2021
Abstract: At the heart of a deep neural network is representation learning with complex latent variables. Representation learning has been improved by disentangled representations and by regularization terms. However, adversarial examples show that tasks with DNNs can easily fail under slight perturbations or transformations of the input. A Variational AutoEncoder (VAE) learns $P(z \vert x)$, the distribution of the latent variable $z$, rather than $P(y \vert x)$, the distribution of the output $y$ given the input $x$. The VAE is therefore considered a good model for learning representations from input data: the input $x$ is mapped not directly to $y$ but to the latent variable $z$. In this paper, we propose an evaluation method to characterize the latent variables that a VAE learns. Specifically, latent variables extracted from VAEs trained on two well-known datasets are analyzed with the k-nearest neighbor (kNN) method. In doing so, we propose an interpretation of what kind of representation the VAE learns, and share clues about the high-dimensional space to which the latent variables are mapped.
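As a rough illustration of the pipeline the abstract describes, the sketch below encodes inputs with a toy VAE encoder and probes the resulting latent codes with a kNN classifier. This is a minimal sketch under assumed settings, not the paper's actual implementation: the encoder architecture, latent dimensionality, k=5, and the random stand-in data are all illustrative assumptions.

```python
# Sketch of the evaluation idea: map x to latent variables z with a
# (trained) VAE encoder, then test whether a kNN classifier can
# recover class structure in the latent space.
# All architecture/hyperparameter choices here are assumptions.
import torch
import torch.nn as nn
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

class VAEEncoder(nn.Module):
    """Encoder half of a VAE: maps x to the parameters of q(z|x)."""
    def __init__(self, in_dim=784, hidden=400, z_dim=20):
        super().__init__()
        self.fc = nn.Linear(in_dim, hidden)
        self.mu = nn.Linear(hidden, z_dim)      # mean of q(z|x)
        self.logvar = nn.Linear(hidden, z_dim)  # log-variance of q(z|x)

    def forward(self, x):
        h = torch.relu(self.fc(x))
        return self.mu(h), self.logvar(h)

encoder = VAEEncoder()  # in practice, load weights from a trained VAE

# Random stand-in data; replace with e.g. flattened images and labels
# from the datasets used for training.
x_train, y_train = torch.rand(1000, 784), torch.randint(0, 10, (1000,))
x_test, y_test = torch.rand(200, 784), torch.randint(0, 10, (200,))

with torch.no_grad():
    z_train, _ = encoder(x_train)  # use the posterior mean as the latent code
    z_test, _ = encoder(x_test)

# High kNN accuracy in z-space suggests the learned representation
# preserves class structure in the latent space.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(z_train.numpy(), y_train.numpy())
print("kNN accuracy in latent space:",
      accuracy_score(y_test.numpy(), knn.predict(z_test.numpy())))
```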