LLM-Driven Causal Discovery via Harmonized Prior

Published: 01 Jan 2025, Last Modified: 06 Aug 2025 · IEEE Trans. Knowl. Data Eng. 2025 · CC BY-SA 4.0
Abstract: Traditional domain-specific causal discovery relies on expert knowledge to guide the data-based structure learning process, thereby improving the reliability of the recovered causality. Recent studies have shown promise in using Large Language Models (LLMs) as causal experts to build autonomous expert-guided causal discovery systems through causal reasoning over pairwise variables. However, their performance is hampered by inaccuracies in aligning LLM-derived causal knowledge with the actual causal structure. To address this issue, this paper proposes a novel LLM-driven causal discovery framework that limits the LLM's prior to a reliable range. Instead of pairwise causal reasoning, which requires output that is both precise and comprehensive, the LLM is directed to focus on one aspect of the causal structure at a time. By combining these distinct causal insights, a unified set of structural constraints is created, termed a harmonized prior, which draws on their respective strengths to ensure prior accuracy. On this basis, we introduce plug-and-play integrations of the harmonized prior into mainstream categories of structure learning methods, thereby enhancing their applicability in practical scenarios. Evaluations on real-world data demonstrate the effectiveness of our approach.
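The harmonization idea described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual interface: the constraint format, labels, and function name are all assumptions. Each narrow LLM query yields a partial set of structural constraints for one aspect, and only constraints on which the aspects agree are retained, filtering out the conflicting outputs that make raw pairwise reasoning unreliable.

```python
# Hypothetical sketch of combining per-aspect LLM outputs into a
# "harmonized prior" of structural constraints. The pair->label
# representation ("required" / "forbidden" / "unknown") is an
# illustrative assumption, not the paper's actual data structure.

def harmonize_priors(aspect_constraints):
    """Combine per-aspect constraint sets into one harmonized prior.

    Each element of `aspect_constraints` maps an ordered variable pair
    (cause, effect) to a label: "required", "forbidden", or "unknown".
    A pair enters the harmonized prior only when every aspect that
    expresses an opinion agrees, so conflicting LLM answers are dropped
    rather than passed on as (possibly wrong) prior knowledge.
    """
    harmonized = {}
    pairs = {p for c in aspect_constraints for p in c}
    for pair in pairs:
        # Collect the non-"unknown" opinions on this pair.
        opinions = {c[pair] for c in aspect_constraints
                    if c.get(pair, "unknown") != "unknown"}
        if len(opinions) == 1:  # unanimous among opinionated aspects
            harmonized[pair] = opinions.pop()
    return harmonized

# Two aspects agree that X -> Y is required but disagree on Z -> Y,
# so only the X -> Y constraint survives harmonization.
aspect_a = {("X", "Y"): "required", ("Z", "Y"): "forbidden"}
aspect_b = {("X", "Y"): "required", ("Z", "Y"): "required"}
prior = harmonize_priors([aspect_a, aspect_b])
```

The resulting `prior` would then be handed to a constraint-aware structure learning method as required/forbidden edges; the plug-and-play integrations mentioned in the abstract presumably adapt this step to each category of learner.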