Keywords: Causal Order, Large Language Models, Causal Inference, Causal Discovery
Abstract: At the core of causal inference lies the challenge of recovering the underlying causal graph from observational data. Recent work uses large language models (LLMs) to infer the full graph structure by aggregating the results of edge-wise prompts. However, the graph structure cannot be identified uniquely from such localized prompts, since the existence of an edge between two nodes depends on which other nodes are included in the node set. We therefore target a simpler property of the causal graph, its topological order, which can be estimated reliably from localized LLM prompts and, for downstream tasks such as effect inference, is sufficient to identify the causal effect. Treating LLMs as virtual domain experts, we propose a novel localized prompt based on triplets of variables to infer the causal order. Acknowledging LLMs' limitations, we also study techniques for integrating LLMs with established causal discovery algorithms, including constraint-based and score-based methods. Extensive experiments demonstrate that our approach significantly improves causal ordering accuracy compared to discovery algorithms alone, which in turn reduces the error of downstream effect estimation.
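To make the triplet-based idea concrete, below is a minimal sketch of how triplet-level LLM answers might be aggregated into a topological order. The prompt wording, the `llm_vote` stub (replaced here by a fixed expert ordering so the example runs), and the greedy vote-count aggregation are all illustrative assumptions, not the paper's actual prompts or algorithm.

```python
from itertools import combinations, permutations
from collections import defaultdict

def llm_vote(a, b, c):
    """Placeholder for one triplet prompt to an LLM.

    In practice this would send a prompt such as
    "Among the variables {a}, {b}, {c}, does {a} causally precede {b}?"
    and parse the model's answer. Here a hard-coded expert ordering
    stands in for the LLM purely so the sketch runs end to end.
    """
    expert_order = ["smoking", "tar", "cancer"]
    return expert_order.index(a) < expert_order.index(b)

def causal_order(variables):
    """Aggregate triplet-level votes into pairwise precedence scores,
    then greedily emit a topological order (an illustrative stand-in
    for the aggregation step)."""
    score = defaultdict(int)
    for triple in combinations(variables, 3):
        for a, b in permutations(triple, 2):
            c = next(v for v in triple if v not in (a, b))
            if llm_vote(a, b, c):
                score[(a, b)] += 1  # vote: a precedes b, in the context of c
    order, remaining = [], set(variables)
    while remaining:
        # pick the variable least often voted to be an effect of the rest
        nxt = min(remaining, key=lambda v: sum(score[(u, v)] for u in remaining))
        order.append(nxt)
        remaining.remove(nxt)
    return order

print(causal_order(["tar", "cancer", "smoking"]))  # -> ['smoking', 'tar', 'cancer']
```

Under these assumptions, the estimated order (or the pairwise precedence scores) could then be handed to a constraint-based or score-based discovery algorithm as a prior, which is the kind of integration the abstract alludes to.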
Submission Number: 4