Keywords: Causal Representation Learning, Identifiability, Nonlinear ICA
Abstract: Causal representation learning, particularly in the context of nonlinear independent component analysis, aims to uncover the underlying latent variables from observed data, providing critical insights into the true generative processes. However, identifying these latent variables remains a major obstacle because infinitely many spurious solutions can exist. Prior works often rely on auxiliary-variable assumptions that enforce conditional independence among the latents, but they require that the auxiliary variables not be involved in the mixing function—a constraint that significantly limits applicability in real-world settings, where suitable label data that can serve as external side information is often difficult to obtain. In this work, we address this challenge by leveraging observable sources as auxiliary variables, a more practical scenario. We also propose a novel framework that selects proper auxiliary variables to improve the recoverability of the latents while ensuring that the identifiability conditions are satisfied. To the best of our knowledge, this is the first work to demonstrate identifiability under this setting, offering a more practical solution for causal representation learning. By exploiting the graphical structure of the latent variables, we enhance both identifiability and recoverability, extending the boundaries of current approaches to causal representation learning.
Submission Number: 4