Unifying Categorical Models by Explicit Disentanglement of the Labels' Generative Factors

Published: 28 Jan 2022, Last Modified: 13 Feb 2023 · ICLR 2022 Submitted · Readers: Everyone
Keywords: Disentanglement, explainability, latent representation
Abstract: In most machine learning tasks, datasets are mainly annotated with categorical labels. For example, in emotion recognition, most datasets rely only on categorical labels such as ``happy'' and ``sad''. Different datasets often use different labelling systems (e.g., different numbers of categories and different category names), even when describing the same data attributes. As a consequence, only a small subset of all the available datasets can be used for any given supervised learning task, since the labelling systems used in the training data are not compatible with each other. In this paper, we propose a \emph{multi-type continuous disentanglement variational autoencoder} to address this problem by identifying and disentangling the true continuous generative factors that determine each categorical label. By doing so, it becomes possible to merge multiple datasets based on different categorical models by projecting the data points into a unified latent space. Experiments on synthetic datasets show a near-perfect correlation between the disentangled latent values and the true generative factors. Moreover, by observing the displacement of each label's explicit distribution, we find that the encoded space is a simple affine transformation of the generative factors' space. Since the latent structure can be learnt autonomously by the model, and each label can be explicitly decomposed into its generative factors, this framework is a promising tool for further exploring explainability in new and existing neural network architectures.
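The abstract's claim that the encoded space is an affine transformation of the generative factors' space can be checked post hoc by regressing the learned latents onto the true factors. The sketch below is a hypothetical illustration (not the paper's code): it simulates latents produced by an unknown affine map of the factors plus noise, recovers the map with ordinary least squares, and measures the per-dimension correlation, which should be near 1 if the affine hypothesis holds.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated setup (assumed, for illustration): two continuous generative
# factors and "encoded" latents that are an unknown affine map of them.
n, d = 500, 2
factors = rng.uniform(-1.0, 1.0, size=(n, d))      # true generative factors
A = np.array([[1.5, -0.3],
              [0.2,  0.8]])                        # unknown linear part
b = np.array([0.5, -1.0])                          # unknown offset
latents = factors @ A + b + rng.normal(0.0, 0.01, size=(n, d))

# Fit latents ≈ [factors | 1] @ W with least squares (affine regression).
X = np.hstack([factors, np.ones((n, 1))])
W, *_ = np.linalg.lstsq(X, latents, rcond=None)
pred = X @ W

# Per-dimension Pearson correlation between the fitted affine map and the
# observed latents; values near 1 support the affine-transformation claim.
corr = [float(np.corrcoef(pred[:, j], latents[:, j])[0, 1]) for j in range(d)]
print([round(c, 4) for c in corr])
```

In an actual evaluation, `latents` would come from the trained encoder and `factors` from the synthetic dataset's ground truth; the regression step is unchanged.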