Interventional Grounding Audits: Black-Box Premise-Dependency Tests for LLM Chain-of-Thought via Predicate Substitution
Track: tiny / short paper (up to 4 pages)
Keywords: chain-of-thought, premise dependency, interventional audit, causal intervention, faithfulness, ProntoQA, logical reasoning, evaluation, reproducibility
TL;DR: A black-box, step-level premise-dependency audit for chain-of-thought using predicate substitution and canonicalized conclusions, revealing “right answer, wrong reasoning” on ProntoQA.
Abstract: Large language models produce chain-of-thought (CoT) reasoning that appears logically sound yet may not genuinely depend on its stated premises. We introduce interventional grounding audits, a black-box, step-level test of premise dependency: we intervene on a single premise by substituting its target predicate with a fresh symbol, re-run the model, and check whether each reasoning step's normalized conclusion (its canonical predicate form) changes. We evaluate on ProntoQA, a synthetic multi-hop deductive reasoning benchmark whose gold proof trees make step-level premise dependencies known. Applied to 50 ProntoQA problems with GPT-4o, our method achieves F1 = 0.783 on detecting proof-tree dependencies (F1 = 0.835 on predicate-determining dependencies; Recall = 97.4%), significantly outperforming a self-consistency baseline (F1 = 0.346; 95% bootstrap CIs non-overlapping). We further find that 28% of correctly solved problems contain at least one step that is insensitive to proof-tree premises: a "right answer, wrong reasoning" phenomenon invisible to passive methods. All audit certificates, raw outputs, and reproduction scripts are included as supplementary material, and we discuss scope limits beyond formal, parsable benchmarks.
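The audit loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`canonicalize`, `substitute_predicate`, `audit_premise`, `f1_score`), the fresh symbol `zorple`, the regex-based normalizer, and the alignment of steps by index are all assumptions made for illustration. The model is passed in as a plain callable, and `toy_model` is a deterministic stand-in for an LLM so the sketch runs offline.

```python
import re
from typing import Callable, List, Set

Model = Callable[[List[str]], List[str]]  # premises -> reasoning steps


def canonicalize(step: str) -> str:
    """Hypothetical normalizer: 'So, Max is a wumpus.' -> 'wumpus(max)'."""
    m = re.search(r"(\w+) is (not )?(?:an? )?(\w+)\.?\s*$", step.strip(),
                  flags=re.IGNORECASE)
    if m is None:
        return step.strip().lower()  # fall back to the raw text
    subject, neg, pred = m.group(1).lower(), m.group(2), m.group(3).lower()
    return f"{'not ' if neg else ''}{pred}({subject})"


def substitute_predicate(premise: str, target: str, fresh: str) -> str:
    """Replace whole-word occurrences of the target predicate with a fresh symbol."""
    return re.sub(rf"\b{re.escape(target)}\b", fresh, premise)


def audit_premise(premises: List[str], run_model: Model, premise_idx: int,
                  target: str, fresh: str = "zorple") -> Set[int]:
    """Return indices of baseline steps whose canonical conclusion changes
    (or disappears) after intervening on a single premise.
    Assumption: steps are aligned by position across the two runs."""
    baseline = [canonicalize(s) for s in run_model(premises)]
    intervened_premises = list(premises)
    intervened_premises[premise_idx] = substitute_predicate(
        premises[premise_idx], target, fresh)
    intervened = [canonicalize(s) for s in run_model(intervened_premises)]
    return {i for i, base in enumerate(baseline)
            if i >= len(intervened) or intervened[i] != base}


def f1_score(predicted: Set[int], gold: Set[int]) -> float:
    """F1 between detected and gold dependency sets."""
    tp = len(predicted & gold)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(predicted), tp / len(gold)
    return 2 * precision * recall / (precision + recall)


def toy_model(premises: List[str]) -> List[str]:
    """Deterministic stand-in for an LLM: forward-chains 'Every X is a Y.' rules."""
    rules, subject, current = {}, None, None
    for p in premises:
        rule = re.match(r"Every (\w+) is an? (\w+)\.", p)
        if rule:
            rules[rule.group(1)] = rule.group(2)
            continue
        fact = re.match(r"(\w+) is an? (\w+)\.", p)
        if fact:
            subject, current = fact.group(1), fact.group(2)
    steps = []
    while current in rules:
        current = rules[current]
        steps.append(f"{subject} is a {current}.")
    return steps


premises = ["Max is a wumpus.",
            "Every wumpus is a tumpus.",
            "Every tumpus is a dumpus."]
flagged = audit_premise(premises, toy_model, premise_idx=2, target="dumpus")
```

In this toy run, `flagged` holds the steps that are sensitive to the third premise. A step that a gold proof tree marks as dependent but that never appears in `flagged` would be a candidate for the "right answer, wrong reasoning" case. A real audit would replace `toy_model` with a call to the model under test and would need a more robust step alignment than positional matching.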
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Funding: Yes, the presenting author of this submission falls under ICLR’s funding aims, and funding would significantly impact their ability to attend the workshop in person.
Submission Number: 72