Keywords: unsupervised learning, representation learning
Abstract: Noise that depends on the latent variables poses a significant identifiability challenge. The standard solution in the literature assumes that the observations are conditionally independent given the latent variables; however, this assumption may not hold in practice. This work relaxes this foundational constraint. Specifically, we consider a *generalized dependency structure* in which the observations may exhibit arbitrary dependencies conditional on the latents. To establish identifiability guarantees, we introduce a two-step theoretical framework. First, we formulate the problem as a factor analysis model and use perturbation theory to establish subspace identifiability of the latent variables. Second, assuming either structural sparsity of the mixing function or a sufficient-variability constraint in the latent space, we establish component-wise identifiability of each individual latent factor. Building on these identifiability results, we develop an unsupervised approach that reliably recovers the latent representations. Experiments on synthetic and real data verify our theoretical claims.
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 501