CausalSim: Counterfactual Implication Inversion as a Logical Consistency Stress Test for Large Language Models

Published: 08 Mar 2026, Last Modified: 08 Mar 2026 · ICLR 2026 Workshop LLM Reasoning · CC BY 4.0
Track: long paper (up to 10 pages)
Keywords: logical reasoning, large language models, causal reasoning, logical consistency, benchmark, evaluation
TL;DR: We introduce CausalSim, a benchmark that evaluates logical reasoning robustness in large language models by testing implication direction consistency under counterfactual inversion.
Abstract: Large language models (LLMs) achieve strong performance on reasoning benchmarks, yet their structural logical consistency remains insufficiently understood. In particular, it is unclear whether models preserve valid implication direction when logical structure is minimally inverted while surface semantics remain nearly identical. We introduce CausalSim, a benchmark for evaluating counterfactual directional consistency as a stress test of logical reasoning in LLMs. The benchmark consists of paired implication hypotheses (A → B vs. B → A) that isolate sensitivity to implication reversal as a minimal structural perturbation. We propose two evaluation metrics: the Causal Advantage Index (CAI), measuring performance asymmetry under inversion, and Balanced-CAI, capturing cross-prompt logical consistency beyond raw accuracy. Across six instruction-tuned LLMs, we observe systematic implication-direction asymmetries, demonstrating that high forward-direction accuracy does not guarantee structural logical robustness. Our findings position implication inversion as a minimal yet diagnostic probe of logical reasoning reliability in modern LLMs.
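Illustrative metric sketch: the page above does not give closed-form definitions of CAI or Balanced-CAI, so the following Python sketch only illustrates one plausible reading under stated assumptions. It assumes CAI is the signed accuracy gap between the forward (A → B) and reversed (B → A) variants of the same item set, and that Balanced-CAI is the fraction of item pairs answered correctly in both directions; the function names and the example data are hypothetical, not the authors' exact formulation.

    # Assumption: CAI = forward accuracy minus reversed accuracy
    # (a signed asymmetry score; the paper's exact definition may differ).
    def causal_advantage_index(acc_forward: float, acc_reversed: float) -> float:
        return acc_forward - acc_reversed

    # Assumption: Balanced-CAI = fraction of paired items judged correct in
    # BOTH directions, so it rewards cross-prompt consistency rather than
    # raw per-direction accuracy.
    def balanced_cai(pairwise_correct: list[tuple[bool, bool]]) -> float:
        if not pairwise_correct:
            return 0.0
        consistent = sum(1 for fwd, rev in pairwise_correct if fwd and rev)
        return consistent / len(pairwise_correct)

    # Hypothetical example: 100 item pairs, 0.90 forward vs. 0.55 reversed accuracy.
    pairs = ([(True, True)] * 50 + [(True, False)] * 40
             + [(False, True)] * 5 + [(False, False)] * 5)
    fwd_acc = sum(f for f, _ in pairs) / len(pairs)    # 0.90
    rev_acc = sum(r for _, r in pairs) / len(pairs)    # 0.55
    print(causal_advantage_index(fwd_acc, rev_acc))    # 0.35 directional asymmetry
    print(balanced_cai(pairs))                         # 0.50 consistent pairs

Note how the example separates the two failure signals: a model can score 0.90 on forward items yet answer only half of the pairs consistently, which is the distinction between raw accuracy and structural consistency that the abstract draws.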
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 157