Track: long paper (up to 10 pages)
Keywords: causal reasoning, order sensitivity, order invariance, counterfactual
TL;DR: Language models give inconsistent answers when causal facts are reordered; we diagnose one such failure with a targeted benchmark and correct it with lightweight fine-tuning.
Abstract: If we accept the statements *(A causes B, B causes C)*, then conclusions we draw from these relations should not depend on the order of presentation. The reordered sequence *(B causes C, A causes B)* describes the same causal graph and should therefore yield identical downstream judgments. We refer to this requirement as *order-invariant causal consistency*. Prior work has shown that language models violate this requirement in a variety of contexts, particularly when asked to reason about hypothetical outcomes.
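To make the requirement concrete, the following minimal sketch (an illustration only, not part of the submission) represents each statement as a hypothetical `(cause, effect)` pair and shows that both presentation orders yield the same edge set and the same downstream conclusions. The helper names `causal_graph` and `descendants` are assumptions introduced for this example.

```python
def causal_graph(statements):
    """Build an edge set from (cause, effect) pairs; a set discards order."""
    return frozenset(statements)

def descendants(graph, node):
    """Return every variable reachable from `node` along causal edges."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for cause, effect in graph:
            if cause == current and effect not in seen:
                seen.add(effect)
                stack.append(effect)
    return seen

# The two orderings from the abstract (illustrative encoding).
original = [("A", "B"), ("B", "C")]
reordered = [("B", "C"), ("A", "B")]

same_graph = causal_graph(original) == causal_graph(reordered)
effects_original = descendants(causal_graph(original), "A")
effects_reordered = descendants(causal_graph(reordered), "A")
```

Because the graph is a set of edges, any judgment that depends only on the graph (here, which variables A influences) is order-invariant by construction. The abstract's claim is that language models do not reliably behave this way.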
We introduce a methodology for selective enforcement of causal constraints in language models, and apply it to this problem. We first construct a narrowly targeted diagnostic -- the Textual Causal Invariance Test (TCIT) -- to isolate failures of order-invariant consistency. We then apply a lightweight training procedure that penalizes order-dependent preferences and reinforces order-invariant reasoning.
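As a rough illustration of what an order-invariance diagnostic checks, the sketch below queries an answer function on every permutation of the same statements and tests whether the answers agree. This is not the TCIT itself, whose construction is not described here; the two toy answer functions are hypothetical stand-ins for a model, chosen so that one is order-sensitive and one is not.

```python
from itertools import permutations

def single_pass_answer(statements, question):
    """Toy order-sensitive reasoner: reads the statements once, in order."""
    source, target = question
    reached = {source}
    for cause, effect in statements:
        if cause in reached:
            reached.add(effect)
    return target in reached

def fixed_point_answer(statements, question):
    """Toy order-invariant reasoner: repeats passes until nothing changes."""
    source, target = question
    reached = {source}
    changed = True
    while changed:
        changed = False
        for cause, effect in statements:
            if cause in reached and effect not in reached:
                reached.add(effect)
                changed = True
    return target in reached

def is_order_invariant(answer_fn, statements, question):
    """True if every ordering of the statements yields the same answer."""
    answers = {answer_fn(list(order), question)
               for order in permutations(statements)}
    return len(answers) == 1

statements = [("A", "B"), ("B", "C")]
question = ("A", "C")  # "Does A influence C?"

single_pass_ok = is_order_invariant(single_pass_answer, statements, question)
fixed_point_ok = is_order_invariant(fixed_point_answer, statements, question)
```

The single-pass reasoner concludes that A influences C only when *A causes B* is presented first, so it fails the check, while the fixed-point reasoner passes. A training signal that penalizes order-dependent preferences could, under these assumptions, be driven by disagreements of this kind.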
Applied to the open-weight Phi-3 model, this intervention raises TCIT accuracy from 59% (modestly above chance) to 98%, without degrading performance on a suite of regression tests. We further demonstrate zero-shot transfer to the natural-language CLadder benchmark, with statistically significant improvements specifically on Rung-3 (counterfactual) causal reasoning tasks and no degradation on the lower causal rungs.
These results demonstrate that violations of order-invariant causal consistency can be isolated and corrected through targeted enforcement of a single structural constraint. More broadly, they suggest that selectively enforcing well-defined causal principles may provide a practical path toward improving causal reasoning in language models.
Presenter: ~Devon_Copley1
Format: Maybe: the presenting author will attend in person, contingent on other factors that still need to be determined (e.g., visa, funding).
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Funding: No, the presenting author of this submission does *not* fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 110