Causal Discovery for Linear Mixed Data

Published: 09 Feb 2022, Last Modified: 05 May 2023, CLeaR 2022 Poster, Readers: Everyone
Keywords: causal discovery, structural causal models, mixed data, identifiability
TL;DR: We provide sufficient identifiability conditions for causal discovery with linear mixed data in bivariate as well as multivariate cases.
Abstract: Discovering causal relationships from observational data, especially from mixed data that consist of both continuous and discrete variables, is a fundamental yet challenging problem. Traditional methods focus on refining how the data types are preprocessed, which may lose information. In contrast, constraint-based and score-based methods for mixed data derive conditional independence tests or score functions from the characteristics of the data. However, lacking identifiability guarantees, they may return only a Markov equivalence class, which can limit their applicability or hinder the interpretability of the resulting causal graphs. In this paper, building on structural causal models of continuous and discrete variables, we provide sufficient identifiability conditions in the bivariate as well as multivariate cases. We show that if the data follow our proposed restricted Linear Mixed causal model (LiM), the model is identifiable. In addition, we propose a two-step hybrid method to discover the causal structure of mixed data. Experiments on both synthetic and real-world data empirically demonstrate the identifiability and efficacy of the proposed LiM model.
