Exploratory Causal Inference in SAEnce

ICLR 2026 Conference Submission22028 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0

Keywords: Randomized Controlled Trials, Sparse Auto Encoder, Interpretability, Causal Inference

TL;DR: A new method to uncover causal treatment effects directly from trial data using foundation models, a Sparse Auto Encoder (SAE), and recursive stratification, without any prior or supervision.

Abstract: Randomized Controlled Trials are one of the pillars of science; nevertheless, they rely on hand-crafted hypotheses and expensive analysis. Such constraints prevent causal effect estimation at scale and risk anchoring research on popular yet incomplete hypotheses. We propose to discover the unknown effects of a treatment directly from data. To this end, we turn unstructured data from a trial into meaningful representations via pretrained foundation models and interpret them via a Sparse Auto Encoder. However, discovering significant causal effects at the neural level is not trivial due to multiple-testing issues and effect entanglement. To address these challenges, we introduce _Neural Effect Search_, a novel recursive procedure that solves both issues by progressive stratification. After assessing the robustness of our algorithm on semi-synthetic experiments, we showcase, in the context of experimental ecology, the first successful unsupervised causal effect identification on a real-world scientific trial.
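
To make the multiple-testing issue mentioned in the abstract concrete, the sketch below tests every neuron of a hypothetical SAE code matrix for a treatment effect and applies a Bonferroni correction. It is only an illustrative baseline on synthetic data, not the Neural Effect Search procedure, whose details are not given here. The function name `per_neuron_effects`, the variable names, the synthetic data, the normal approximation for p-values, and the choice of Bonferroni are all assumptions made for this example.

```python
import math
import numpy as np

def per_neuron_effects(codes, treatment, alpha=0.05):
    """Difference-in-means test per neuron, Bonferroni-corrected.

    codes:     (n_samples, n_neurons) array of SAE activations (hypothetical).
    treatment: (n_samples,) boolean array from a randomized assignment.
    """
    codes = np.asarray(codes, dtype=float)
    treatment = np.asarray(treatment, dtype=bool)
    treated, control = codes[treatment], codes[~treatment]

    # Under randomization, the difference in means estimates the average effect.
    effect = treated.mean(axis=0) - control.mean(axis=0)
    se = np.sqrt(treated.var(axis=0, ddof=1) / len(treated)
                 + control.var(axis=0, ddof=1) / len(control))
    z = effect / np.maximum(se, 1e-12)

    # Two-sided p-values under a normal approximation (assumes large samples).
    p = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])

    # Bonferroni: divide the significance level by the number of neurons tested.
    significant = p < alpha / codes.shape[1]
    return effect, p, significant

# Synthetic example: 50 neurons, only neuron 3 is shifted by the treatment.
rng = np.random.default_rng(0)
n, d = 2000, 50
treatment = rng.integers(0, 2, size=n).astype(bool)
codes = rng.normal(size=(n, d))
codes[:, 3] += 1.0 * treatment  # hypothetical true effect on neuron 3
effect, p, significant = per_neuron_effects(codes, treatment)
print(np.flatnonzero(significant))
```

A correction of this kind controls false positives across neurons but does nothing about entangled effects, which is the second difficulty the abstract attributes to neuron-level discovery.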

Supplementary Material: zip

Primary Area: interpretability and explainable AI

Submission Number: 22028
