Keywords: Actual Causation, Causal Vignettes, Structural Causal Models, Large Language Models
TL;DR: We provide a structured collection of vignettes for actual causation, implement some theories of causation to compute their verdicts on all vignettes, and compare this approach with LLMs.
Abstract: Theories of actual causation provide answers to the question: “Is C a cause of E?” in a specific
scenario. The performance of a new theory is measured by how well its verdicts agree with the
intuitive verdicts of the researcher on particular examples, commonly referred to as vignettes.
This practice has two drawbacks. First, since there is no commonly agreed-upon collection of vignettes, each theory is usually evaluated on only a handful of them, which makes it difficult to compare theories against each other. Second, the evaluation is mostly done by hand, which is tedious both for the researcher proposing a new theory and for the reader trying to assess its merits. To address these drawbacks, we provide a comprehensive collection of vignettes in a well-organized
data format, together with code to load the vignettes and their accompanying queries. We also provide
implementations of two popular theories of causation to demonstrate the advantages of this approach.
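To give a flavor of what machine-checkable vignettes make possible, here is a minimal sketch of a vignette encoded as structural equations together with a simple but-for (counterfactual dependence) test. The schema, field names, and the but-for test are illustrative assumptions, not the dataset's actual format or the paper's implemented theories.

```python
# Minimal sketch: a vignette as structural equations plus a causal query.
# Schema and but-for test are illustrative, not the paper's actual format.

# Classic overdetermination vignette: two shooters, one victim.
vignette = {
    "exogenous": {"suzy_shoots": True, "billy_shoots": True},
    "equations": {  # endogenous variables, listed in topological order
        "victim_dies": lambda v: v["suzy_shoots"] or v["billy_shoots"],
    },
    # Query: is suzy_shoots = True an actual cause of victim_dies = True?
    "query": ("suzy_shoots", True, "victim_dies", True),
}

def solve(vignette, interventions=None):
    """Evaluate the (acyclic) structural equations under optional interventions."""
    values = dict(vignette["exogenous"], **(interventions or {}))
    for var, eq in vignette["equations"].items():
        if var not in (interventions or {}):
            values[var] = eq(values)
    return values

def but_for(vignette):
    """Counterfactual-dependence test: does flipping C change the value of E?"""
    c, c_val, e, e_val = vignette["query"]
    actual = solve(vignette)
    if actual[c] != c_val or actual[e] != e_val:
        return False  # C = c and E = e must hold in the actual world
    return solve(vignette, {c: not c_val})[e] != e_val

print(but_for(vignette))  # False: overdetermination defeats the but-for test
```

Encoding vignettes in such a format means a theory's verdicts on the whole collection can be recomputed with a single script rather than derived by hand.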
In addition, we address the suggestion that LLMs might be better suited to determining causality
than formal modeling of these vignettes. To test this claim on current LLMs, we add natural-language
formulations of the vignettes and queries, which makes it possible to prompt LLMs for their
verdicts and to compare them both with intuitions and with the verdicts of particular theories of
actual causation. We find that none of the tested LLMs achieves higher performance than either of
the two implemented theories of causation.
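As a rough illustration of the LLM comparison, the sketch below turns a natural-language vignette into a yes/no prompt and scores verdicts against intuitions. The prompt wording, the `ask_llm` callable, and the answer parsing are illustrative placeholders, not the paper's exact setup.

```python
# Sketch of the LLM comparison loop; `ask_llm`, the prompt wording, and the
# yes/no parsing are illustrative assumptions, not the paper's exact setup.
from typing import Callable

def llm_verdict(ask_llm: Callable[[str], str], story: str,
                cause: str, effect: str) -> bool:
    """Prompt an LLM with a natural-language vignette and parse a yes/no answer."""
    prompt = (
        f"{story}\n\n"
        f"Question: In this scenario, is \"{cause}\" an actual cause of "
        f"\"{effect}\"? Answer with exactly 'yes' or 'no'."
    )
    return ask_llm(prompt).strip().lower().startswith("yes")

def agreement(verdicts: list[bool], intuitions: list[bool]) -> float:
    """Fraction of vignettes on which a set of verdicts matches intuition."""
    return sum(v == i for v, i in zip(verdicts, intuitions)) / len(intuitions)
```

The same `agreement` score could then be computed for each implemented theory, making the LLM-versus-theory comparison a like-for-like measurement.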
Submission Number: 59