Interpretable Latent Distributions Using Space-Filling Curves

20 Sept 2023 (modified: 11 Feb 2024), submitted to ICLR 2024
Primary Area: visualization or interpretation of learned representations
Keywords: Interpretable Latent Space, Discovering Interpretable Directions, Generative Adversarial Networks, Image Editing, Space-Filling Curve, Vector Quantization
TL;DR: This paper explains how our newly proposed space-filling vector quantizer helps discover the underlying morphological structure of the latent space and its interpretable directions.
Abstract: Deep generative models are neural network-based architectures that learn a latent distribution whose samples can be mapped to sensible real-world data such as images, video, and speech. Such latent distributions are, however, often difficult to interpret. In generative adversarial networks (GANs), earlier supervised methods either build an interpretable (structured) latent distribution, which requires data labels during training, or discover interpretable directions for image editing, which requires annotated synthesized samples during training. In contrast, we propose an unsupervised structured distribution modeling technique that incorporates space-filling curves into vector quantization and makes the latent distribution interpretable by capturing its underlying morphological structure. We apply this technique to model the latent distributions of pretrained StyleGAN2 and BigGAN networks on various image datasets. Our experiments show that the proposed approach yields an interpretable model of the latent distribution: it identifies which parts of the latent distribution correspond to specific generative factors such as age, pose, hairstyle, background, and data class. Furthermore, the points on a space-filling line can be used for controllable data augmentation, and the direction of the line can be used to apply intelligible image transformations. The implementation of our proposed method is publicly available.
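To make the idea of combining a space-filling curve with vector quantization more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the curve is piecewise linear and passes through an ordered codebook, so that a latent vector is mapped to its nearest point on the line segments between consecutive codebook vectors. The function names `project_to_curve` and `point_at`, the toy codebook, and the scalar curve position are hypothetical illustrations; how the codebook is trained is not shown.

```python
import numpy as np


def project_to_curve(x, codebook):
    """Map x to its nearest point on the piecewise-linear curve through codebook.

    codebook: array of shape (K, D), ordered along the curve (assumed given).
    Returns the nearest point and a scalar position k + t, where k is the
    segment index and t in [0, 1] is the fraction along that segment.
    """
    starts = codebook[:-1]
    segments = codebook[1:] - starts                      # shape (K-1, D)
    seg_len_sq = np.sum(segments * segments, axis=1)      # shape (K-1,)
    # Orthogonal projection onto each segment, clipped to the segment ends.
    t = np.sum((x - starts) * segments, axis=1) / np.maximum(seg_len_sq, 1e-12)
    t = np.clip(t, 0.0, 1.0)
    candidates = starts + t[:, None] * segments
    sq_dists = np.sum((candidates - x) ** 2, axis=1)
    k = int(np.argmin(sq_dists))
    return candidates[k], k + float(t[k])


def point_at(codebook, position):
    """Return the curve point at a scalar position in [0, K-1]."""
    k = int(np.clip(np.floor(position), 0, len(codebook) - 2))
    t = position - k
    return (1.0 - t) * codebook[k] + t * codebook[k + 1]


# Toy 2-D codebook with three ordered vectors (illustrative values only).
toy_codebook = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
nearest, position = project_to_curve(np.array([0.5, 0.2]), toy_codebook)
midpoint_of_second_segment = point_at(toy_codebook, 1.5)
```

Under these assumptions, sweeping `position` and decoding `point_at(codebook, position)` with a pretrained generator would correspond to walking along the curve, and the difference between two nearby curve points would give a candidate editing direction.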
Submission Number: 2555