Optimal estimation of linear non-Gaussian structural equation models
TL;DR: This study establishes a LiNGAM learning algorithm using distance covariance that achieves the optimal sample complexity, $n = \Theta(d_{in} \log \frac{p}{d_{in}})$, without assuming faithfulness or a known indegree.
Abstract: Much of science involves discovering and modeling causal relationships in nature. Significant progress has been made in developing statistical methods for representing and identifying causal knowledge from data using Linear Non-Gaussian Acyclic Models (LiNGAMs). Despite successes in learning LiNGAMs across various sample settings, the optimal sample complexity for high-dimensional LiNGAMs remains unexplored. This study establishes the optimal sample complexity for learning the structure of LiNGAMs under a sub-Gaussianity assumption. Specifically, it introduces a structure recovery algorithm using distance covariance that achieves the optimal sample complexity, $n = \Theta(d_{in} \log \frac{p}{d_{in}})$, without assuming faithfulness or a known indegree. The theoretical findings and superiority of the proposed algorithm compared to existing algorithms are validated through numerical experiments and real data analysis.
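The abstract's algorithm relies on distance covariance, a dependence measure that is zero in population if and only if two variables are independent, which makes it suitable for detecting the residual dependencies that reveal causal ordering in non-Gaussian models. As an illustration only (not the paper's actual recovery procedure), here is a minimal NumPy sketch of the squared empirical distance covariance; the function name `dcov2` and the Laplace-noise example are our own choices:

```python
import numpy as np

def dcov2(x, y):
    """Squared empirical distance covariance between 1-D samples x and y.

    In population, this quantity is zero iff x and y are independent,
    which is what distance-covariance-based structure learning exploits.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    a = np.abs(x[:, None] - x[None, :])  # pairwise distance matrix of x
    b = np.abs(y[:, None] - y[None, :])  # pairwise distance matrix of y
    # Double-center each distance matrix (subtract row/column means, add grand mean).
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    # The V-statistic estimator of squared distance covariance.
    return (A * B).mean()

rng = np.random.default_rng(0)
n = 2000
x = rng.laplace(size=n)               # non-Gaussian exogenous noise
y = 2.0 * x + rng.laplace(size=n)     # y is causally downstream of x
w = rng.laplace(size=n)               # independent of x
print(dcov2(x, y), dcov2(x, w))       # dependent pair scores much higher
```

In a LiNGAM setting, statistics like this can be used to test whether regression residuals are independent of candidate parents, without a faithfulness assumption.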
Submission Number: 268