Abstract: We study the problem of learning the causal structure of a latent tree model from observational data alone. Prior work often assumes either that a sufficient number of measured variables is available or that observed variables can only be children of latent variables (known as the measurement assumption). However, such methods may yield incorrect or uninformative results when some observed variables also cause latent variables, or when the number of measured variables is less than two. In this paper, we focus on the linear non-Gaussian latent polytree model, in which observed and latent variables can exhibit arbitrary causal dependence and each latent variable may have only one child variable. By leveraging the non-Gaussianity of the causal model, we introduce rank constraints on high-order cumulants. These constraints align with trek separation in the causal graph and enable the identification of exogenous variables for the relative set. Such properties open up the possibility of identifying the entire latent polytree structure, including both the number of latent variables and the causal directions. Consequently, we develop an identification algorithm that learns the latent polytree using only the rank constraints on high-order cumulants, and we verify its effectiveness in simulation experiments.