Causal AI Assistant: Facilitating Causal Data Science with Large Language Models

Sawal Acharya; Vishal Verma; Samuel Simko; Anahita Haghighat; Devansh Bhardwaj; Dominik Janzing; Mrinmaya Sachan; Bernhard Schölkopf; Zhijing Jin

Published: 31 Jul 2025, Last Modified: 15 Aug 2025

Venue: LM4Sci

License: CC BY 4.0

Keywords: automated causal data analysis, empirical research, quantitative social sciences, LLMs and causality

Abstract: Causal effect estimation is an integral component of empirical research and evidence-based decision-making. Estimating causal effects requires knowledge of a wide range of inference methods, their underlying statistical assumptions, and a technical understanding of the domain. This reliance on expert knowledge can be a barrier for non-experts. Recent advances in LLMs can broaden access to causal analysis by automating the estimation pipeline. However, current LLM-powered approaches either require users to specify the method and variables, or are evaluated on examples that cover only a small set of estimation techniques. To address these limitations, we introduce Causal AI Assistant (CAIA), an end-to-end LLM-powered causal estimation tool that automatically selects an appropriate inference method, implements it, and validates the results. We evaluate our approach on diverse cases drawn from textbooks, published empirical studies, and synthetic examples, demonstrating its ability to handle a wide variety of causal estimation scenarios.
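
Illustration: the select, implement, validate loop described in the abstract can be pictured with a toy sketch. This is not CAIA's implementation. The function names, the rule-based selection (standing in for LLM reasoning), the two estimators, and the overlap check are all assumptions chosen to keep the example small and runnable.

```python
from collections import defaultdict
from statistics import mean


def select_method(randomized, confounder):
    """Hypothetical rule standing in for LLM-driven method selection."""
    if randomized or confounder is None:
        return "difference_in_means"
    return "stratified_adjustment"


def difference_in_means(rows, treatment, outcome):
    """Mean outcome of treated units minus mean outcome of control units."""
    treated = [r[outcome] for r in rows if r[treatment] == 1]
    control = [r[outcome] for r in rows if r[treatment] == 0]
    return mean(treated) - mean(control)


def stratified_adjustment(rows, treatment, outcome, confounder):
    """Average within-stratum differences, weighted by stratum size."""
    strata = defaultdict(list)
    for r in rows:
        strata[r[confounder]].append(r)
    total = len(rows)
    return sum(
        len(group) / total * difference_in_means(group, treatment, outcome)
        for group in strata.values()
    )


def check_overlap(rows, treatment, confounder):
    """Validation step: every stratum must contain treated and control units."""
    seen = defaultdict(set)
    for r in rows:
        seen[r[confounder]].add(r[treatment])
    return all(values == {0, 1} for values in seen.values())


def run_pipeline(rows, treatment, outcome, confounder=None, randomized=False):
    """Select a method, validate its assumption where applicable, then estimate."""
    method = select_method(randomized, confounder)
    if method == "stratified_adjustment":
        if not check_overlap(rows, treatment, confounder):
            raise ValueError("overlap violated: a stratum lacks treated or control units")
        effect = stratified_adjustment(rows, treatment, outcome, confounder)
    else:
        effect = difference_in_means(rows, treatment, outcome)
    return {"method": method, "effect": effect}


# Synthetic data: the within-stratum effect is 2, but z drives both t and y.
rows = (
    [{"z": 0, "t": 1, "y": 3}]
    + [{"z": 0, "t": 0, "y": 1} for _ in range(3)]
    + [{"z": 1, "t": 1, "y": 7} for _ in range(3)]
    + [{"z": 1, "t": 0, "y": 5}]
)

adjusted = run_pipeline(rows, "t", "y", confounder="z")
naive = difference_in_means(rows, "t", "y")
```

On this synthetic data the adjusted estimate is 2.0, while the unadjusted difference in means is 4, which shows why choosing the estimator to match the data-generating setting matters.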

Submission Number: 23
