Latent Shattering: Turning Unconditional Pretrained Generators Into Conditional Models By Imposing Latent Structure

24 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: generative models, generative modeling, GANs, VAEs
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We develop an unsupervised method to convert pretrained unconditional generators into conditional models, without finetuning or classifier feedback
Abstract: Deep generative models, such as GANs and VAEs, have gained substantial attention for their ability to synthesize realistic data. Pretrained generative models are often unconditional and thus do not let the user specify the class of the output, yet conditional generation offers inherent benefits for many tasks. Because current models require huge datasets and often prohibitively expensive computational resources to train, a lightweight method that converts pretrained unconditional generators into conditional models without retraining is desirable. Previous research into this problem is limited, typically assuming access to classifiers that identify which regions of the generator’s latent space correspond to specific classes, access to labeled data, or even retraining of the generative model itself. These strict requirements pose a serious limitation. In this work, we propose LASH, a new approach that converts unconditional generators into conditional models in a completely unsupervised manner, requiring neither retraining nor access to any real data. The key principle of LASH is to identify points in the generator’s latent space that are mapped to low-density regions of the output space. The insight is that removing these points “shatters” the latent space into distinct clusters, each corresponding to a semantically meaningful mode in the output space; we demonstrate that these modes correspond to distinct real-world classes. Finally, LASH fits a simple Gaussian mixture model to sample adaptively from these clusters, enabling unsupervised conditional generation. Through experiments on MNIST, FashionMNIST, and CelebA, we show that LASH significantly outperforms existing methods in unsupervised conditional sampling.
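The pipeline the abstract describes (sample latents, prune those mapped to low-density outputs, cluster the survivors with a Gaussian mixture, then sample per cluster) can be sketched as follows. This is a minimal illustration only, not the paper's implementation: the one-dimensional toy `generator`, the k-NN density proxy, and the 20% pruning threshold are all assumptions made for the example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Hypothetical stand-in for a pretrained unconditional generator:
# a smooth 1-D map whose outputs pile up near two modes (+/-3),
# with a sparsely populated region in between.
def generator(z):
    return 3.0 * np.tanh(5.0 * z)

# 1) Sample latents from the prior and generate outputs.
z = rng.standard_normal((5000, 1))
x = generator(z)

# 2) Score output-space density with a k-NN proxy and discard latents
#    mapped to low-density regions ("shattering" the latent space).
dist, _ = NearestNeighbors(n_neighbors=10).fit(x).kneighbors(x)
density = 1.0 / (dist.mean(axis=1) + 1e-12)
z_kept = z[density > np.quantile(density, 0.2)]  # assumed 20% cut

# 3) Fit a Gaussian mixture over the surviving latents; each component
#    is treated as one class-like cluster.
gmm = GaussianMixture(n_components=2, random_state=0).fit(z_kept)

# 4) Conditional sampling: draw latents from one chosen component only.
def sample_class(k, n=200):
    zk = rng.multivariate_normal(gmm.means_[k], gmm.covariances_[k], size=n)
    return generator(zk)
```

In this toy setup, pruning removes the latents near zero (whose outputs fall between the two modes), so the surviving latent set splits into two well-separated clusters and each mixture component generates from one output mode.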
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8799