Large Language Models for Constrained-Based Causal Discovery

AAAI 2024 Workshop LLM-CP Submission 9 Authors

Published: 14 Dec 2023, Last Modified: 12 Feb 2024

Venue: LLM-CP @ AAAI 2024 Oral

License: CC BY 4.0

Keywords: causality, large language models, causal discovery, conditional independence testing, constraint-based causal discovery

TL;DR: We investigate LLMs as an oracle for conditional independence testing, with application to constraint-based causal discovery.

Abstract: Causality is essential for understanding complex systems, such as the economy, the brain, and the climate. Constructing causal graphs often relies on either data-driven or expert-driven approaches, both fraught with challenges. Data-driven methods, like the celebrated PC algorithm, face issues with data requirements and assumptions of causal sufficiency, while expert-driven methods demand substantial time and expertise. This work explores the capabilities of Large Language Models (LLMs) as an alternative to domain experts for causal graph generation. We frame conditional independence queries as prompts to LLMs and run the PC algorithm on the answers. The performance of the LLM-based conditional independence oracle on systems with known causal graphs shows a high degree of variability. We improve the performance through a proposed statistically inspired voting scheme that allows control over the false-positive and false-negative rates. Finally, we apply the LLM-based PC algorithm to a complex set of variables around food insecurity in the Horn of Africa and find a plausible graph. Inspecting the chain-of-thought argumentation, we occasionally find that the LLM uses causal reasoning to justify its answer to a probabilistic query.
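The pipeline described in the abstract (conditional independence queries answered by an oracle, aggregated by voting, then fed to the PC algorithm) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the voting scheme, so a simple vote-fraction threshold is assumed here, and only the skeleton phase of PC is shown. All names (`ask`, `voted_oracle`, `pc_skeleton`, `noisy_ask`) are hypothetical, and the LLM call is replaced by a deterministic stand-in that answers wrongly on every third query.

```python
from itertools import combinations
from typing import Callable, FrozenSet, List, Sequence, Set

# Hypothetical oracle type: True means "x is independent of y given cond".
Oracle = Callable[[str, str, FrozenSet[str]], bool]


def voted_oracle(ask: Oracle, n_votes: int = 5, threshold: float = 0.5) -> Oracle:
    """Wrap a noisy oracle: query it n_votes times and declare independence
    only if the fraction of 'independent' answers exceeds the threshold.
    Assumption: a plain threshold vote, not the scheme from the paper.
    Raising the threshold makes independence verdicts (edge removals) rarer."""
    def oracle(x: str, y: str, cond: FrozenSet[str]) -> bool:
        answers: List[bool] = [ask(x, y, cond) for _ in range(n_votes)]
        return sum(answers) / n_votes > threshold
    return oracle


def pc_skeleton(variables: Sequence[str], oracle: Oracle) -> Set[FrozenSet[str]]:
    """Skeleton phase of the PC algorithm driven by an independence oracle."""
    adj = {v: set(variables) - {v} for v in variables}
    size = 0  # size of the conditioning sets tried in this round
    while any(len(adj[x]) - 1 >= size for x in variables):
        for x in variables:
            for y in sorted(adj[x]):
                # Condition on subsets of x's other current neighbours.
                for cond in combinations(sorted(adj[x] - {y}), size):
                    if oracle(x, y, frozenset(cond)):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        break
        size += 1
    return {frozenset((x, y)) for x in variables for y in adj[x]}


# Stand-in for an LLM on the chain A -> B -> C, where the only true
# independence is A _||_ C given B. Every third call returns a wrong answer.
calls = {"n": 0}


def noisy_ask(x: str, y: str, cond: FrozenSet[str]) -> bool:
    truth = {x, y} == {"A", "C"} and "B" in cond
    calls["n"] += 1
    return (not truth) if calls["n"] % 3 == 0 else truth


skeleton = pc_skeleton(["A", "B", "C"], voted_oracle(noisy_ask, n_votes=5))
print(sorted(tuple(sorted(edge)) for edge in skeleton))
```

In this sketch the threshold is the single knob that trades one error type against the other: a higher value demands more agreement before an edge is removed, which lowers the rate of spurious independence verdicts at the cost of keeping more edges.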

Submission Number: 9
