Auditing Black-Box Trends: Structural Inductive Bias Facilitates Causal Interpretability in Clinical Time Series
Presentation Attendance: Yes, we will present in person
Keywords: Time Series, Foundation Models, Causality, Interpretability, Safety Audit, Clinical AI
TL;DR: We introduce the Causal Hallucination Score (CHS) to audit foundation models for inverse causal semantics, showing that propensity-regularized models recover valid therapeutic signals in confounded clinical data.
Abstract: The deployment of predictive Transformer architectures in high-stakes healthcare presents a critical safety challenge: the divergence between forecasting accuracy and interventional validity. We term this the "Alignment Gap." In observational data, standard training objectives incentivize models to exploit "confounding by indication," often leading to inverted causal semantics. In this work, we present a simple audit protocol for quantifying this gap. We introduce the Causal Hallucination Score (CHS), a metric measuring the divergence between a foundation model's zero-shot counterfactuals and a structural reference instrument. Applying this protocol to Lag-Llama and Chronos-T5, we reveal a severe safety failure: despite high predictive likelihood, naive counterfactual prompting of these models reproduces the dataset's observational bias (associating life-saving vasopressors with increased mortality). We demonstrate that a Propensity-Regularized GRU-D serves as an effective audit instrument, recovering a directionally consistent therapeutic signal (CATE: +0.005) validated by doubly robust estimation and placebo falsification. We release the code, dataset split, and evaluation protocol as a public benchmark to facilitate future safety audits of clinical foundation models.
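To make the audit concrete, the sketch below shows one plausible operationalization of the CHS as described in the abstract. The exact definition is given in the paper, not here; the function name, the sign-flip/magnitude decomposition, and the synthetic numbers are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def causal_hallucination_score(fm_cf_treated, fm_cf_control, ref_cate):
    """Illustrative Causal Hallucination Score (CHS) sketch.

    Compares a foundation model's zero-shot counterfactual contrast
    against per-patient CATEs from a structural reference instrument
    (e.g., a propensity-regularized GRU-D). This is an assumed
    operationalization, not the paper's exact formula.
    """
    fm_effect = np.asarray(fm_cf_treated) - np.asarray(fm_cf_control)
    ref = np.asarray(ref_cate)
    # Directional divergence: fraction of patients for whom the
    # foundation model's implied effect sign contradicts the
    # reference signal (i.e., inverted causal semantics).
    sign_flip = np.mean(np.sign(fm_effect) != np.sign(ref))
    # Magnitude divergence between the two effect estimates.
    mag_gap = np.mean(np.abs(fm_effect - ref))
    return sign_flip, mag_gap

# Hypothetical usage: a model that inverts the therapeutic signal
# (vasopressor prompt -> higher predicted mortality) scores high
# on the directional component.
rng = np.random.default_rng(0)
ref = np.full(100, 0.005)                # reference CATE near +0.005
fm_t = rng.normal(0.30, 0.02, 100)       # mortality forecast, treated prompt
fm_c = rng.normal(0.25, 0.02, 100)       # mortality forecast, control prompt
print(causal_hallucination_score(fm_t, fm_c, ref))
```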
Track: Research Track (max 4 pages)
Submission Number: 84