Abstract: Causal effect estimation is a core task in empirical research and evidence-based decision-making. Successfully performing this task typically requires familiarity with a range of inference methods, their statistical assumptions, and domain-specific considerations. Recent advances in large language models (LLMs) offer the potential to automate the end-to-end causal inference pipeline and broaden access to causality-driven analysis. However, existing LLM-based approaches often require users to specify the estimation method and relevant variables, which presupposes prior knowledge of causal inference on the user's end. Similarly, existing end-to-end tools support only a limited set of causal effect measures, omitting many methods commonly used in applied research.
To address these limitations, we introduce Causal AI Scientist (\model), an end-to-end causal estimation tool that takes a natural language query, maps it to a formal causal estimation problem, selects and implements a suitable method, and interprets the result to answer the original query. \model supports a broad range of causal inference methods, enabling estimation across diverse scenarios. We evaluate \model using examples drawn from popular benchmark datasets, academic publications, and synthetic datasets, covering a wide spectrum of causal effect measures and estimation tasks.
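To make the described pipeline concrete, the sketch below shows one way a query-to-estimate loop could be wired together in Python. It is a minimal illustration, not the paper's implementation: all function names are hypothetical, the LLM formalization step is stubbed out, and a single covariate-adjusted OLS estimator stands in for the method-selection stage.

    # Hypothetical sketch (names not from the paper) of the
    # query -> formal problem -> method -> estimate -> answer flow.
    import pandas as pd
    import statsmodels.api as sm

    def formalize(query: str, df: pd.DataFrame) -> dict:
        # Stub for the LLM step that maps a natural language query to
        # treatment, outcome, and covariates; a real system would call
        # a chat-completions API and parse a structured response.
        return {"treatment": "T", "outcome": "Y",
                "covariates": [c for c in df.columns if c not in ("T", "Y")]}

    def estimate(problem: dict, df: pd.DataFrame) -> float:
        # Naive covariate-adjusted OLS estimate of the average treatment
        # effect; a full tool would instead select among methods such as
        # instrumental variables, difference-in-differences, or matching.
        X = sm.add_constant(df[[problem["treatment"]] + problem["covariates"]])
        return sm.OLS(df[problem["outcome"]], X).fit().params[problem["treatment"]]

    def answer(query: str, df: pd.DataFrame) -> str:
        # Interpret the point estimate in terms of the original query.
        problem = formalize(query, df)
        ate = estimate(problem, df)
        return (f"Estimated effect of {problem['treatment']} "
                f"on {problem['outcome']}: {ate:.3f}")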
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: applications, chain-of-thought, code models, LLM/AI agents, neurosymbolic approaches, prompting
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models, Data resources
Languages Studied: English
Previous URL: https://openreview.net/forum?id=I7CSkGcvBw
Explanation Of Revisions PDF: pdf
Reassignment Request Area Chair: No, I want the same area chair from our previous submission (subject to their availability).
Reassignment Request Reviewers: No, I want the same set of reviewers from our previous submission (subject to their availability).
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: N/A
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: Section 1
B2 Discuss The License For Artifacts: No
B2 Elaboration: We will release the freely licensed data upon acceptance.
B3 Artifact Use Consistent With Intended Use: N/A
B4 Data Contains Personally Identifying Info Or Offensive Content: N/A
B5 Documentation Of Artifacts: N/A
B6 Statistics For Data: Yes
B6 Elaboration: Appendix A
C Computational Experiments: Yes
C1 Model Size And Budget: No
C1 Elaboration: Closed-source or well-known models were used.
C2 Experimental Setup And Hyperparameters: N/A
C3 Descriptive Statistics: Yes
C3 Elaboration: Discussed in section 7
C4 Parameters For Packages: Yes
C4 Elaboration: Section 5.2
D Human Subjects Including Annotators: No
D1 Instructions Given To Participants: N/A
D2 Recruitment And Payment: N/A
D3 Data Consent: Yes
D3 Elaboration: Freely licensed data.
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: N/A
E Ai Assistants In Research Or Writing: Yes
E1 Information About Use Of Ai Assistants: No
E1 Elaboration: Grammar checking with AI assistants.
Author Submission Checklist: Yes
Submission Number: 1279