Neuro-Symbolic Active Causal Hypothesis Testing for NAD+-Centered Alzheimer's Disease Reversal
Track: long paper (up to 10 pages)
Keywords: active causal hypothesis testing, neuro-symbolic causal discovery, LLM structural priors, NOTEARS, structural causal models, symbolic mechanistic constraints, constraint verification, Bayesian optimal experimental design, active intervention selection, NAD+ homeostasis, Alzheimer’s disease reversal, P7C3-A20 intervention, causal graph recovery, edge F1, structural Hamming distance, ODE simulation, falsifiable hypotheses, hypothesis generation agents, mechanistic biology, audit trail, active learning
TL;DR: ACHT combines LLM hypothesis agents, causal discovery, and symbolic constraints for active causal graph learning in NAD+-centered AD reversal. Achieves 0.90 edge-F1, satisfies 6/6 constraints, and predicts P7C3-A20 intervention directions.
Abstract: Large language models (LLMs) generate fluent scientific narratives but frequently produce unfalsifiable, mechanistically inconsistent causal claims—a critical failure mode in biomedical reasoning. We introduce Active Causal Hypothesis Testing (ACHT), a neuro-symbolic framework that integrates LLM agents for hypothesis generation, differentiable causal discovery for structure learning, and symbolic verification for mechanistic constraint enforcement. We evaluate ACHT on the biologically grounded task of NAD+-centered Alzheimer’s disease (AD) reversal, leveraging recent demonstrations of pharmacologic reversal of advanced AD phenotypes via NAD+ homeostasis restoration. In retrospective evaluation against a 12-node, 16-edge ground-truth causal graph encoding established NAD+/AD biology, ACHT achieves an edge F1 of 0.90, satisfies 6/6 mechanistic constraints, and correctly predicts all 8 directional outcomes of P7C3-A20 intervention. In prospective ODE simulation, ACHT’s Bayesian active selection converges to lower structural Hamming distance than random or entropy-based baselines. Ablation reveals that removing LLM priors degrades F1 by 0.29 (from 0.84 to 0.55), while removing symbolic verification reduces constraint satisfaction by 16% (relative). Our results demonstrate that verifiable, constraint-aware reasoning—not narrative plausibility—should be the standard for AI-driven scientific hypothesis generation.
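The abstract reports edge F1 and structural Hamming distance (SHD) as graph-recovery metrics. As an illustrative sketch (not code from the paper), here is how these metrics are conventionally computed for directed graphs given 0/1 adjacency matrices; the function names and the toy 3-node example are assumptions for illustration only.

```python
import numpy as np

def edge_f1(pred, true):
    """F1 over directed edges; pred/true are 0-1 adjacency matrices."""
    tp = int(np.sum((pred == 1) & (true == 1)))
    fp = int(np.sum((pred == 1) & (true == 0)))
    fn = int(np.sum((pred == 0) & (true == 1)))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def shd(pred, true):
    """Structural Hamming distance: edge insertions, deletions,
    and reversals needed to turn pred into true (a reversal counts once)."""
    diff = np.abs(pred - true)
    lower = np.tril(np.ones_like(diff)) == 1
    flips = int(np.sum((diff + diff.T == 2) & lower))  # reversed edges
    return int(np.sum(diff)) - flips

# Toy 3-node example: true edges 0->1, 1->2, 2->0;
# prediction keeps 0->1, reverses 2->0 into 0->2, misses 1->2.
true = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]])
pred = np.array([[0, 1, 1], [0, 0, 0], [0, 0, 0]])
print(edge_f1(pred, true))  # 0.4 (tp=1, fp=1, fn=2)
print(shd(pred, true))      # 2 (one reversal + one deletion)
```

Under these conventions, the paper's reported edge F1 of 0.90 on a 12-node, 16-edge graph implies near-complete recovery with few spurious or missing edges.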
Presenter: ~David_Scott_Lewis1
Format: Yes, the presenting author will definitely attend in person because they are attending ICLR for other complementary reasons.
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Funding: No, the presenting author of this submission does *not* fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 143