Keywords: Causal inference, Foundation Models
Abstract: Synthetic control (SC) is a widely used method for estimating causal effects from observational panel data, with classical approaches expressing a target trajectory as a linear combination of donor trajectories. Recent advances in foundation models pretrained on synthetic data have shown strong performance in structured, sample-limited regimes such as tabular and time-series prediction, raising the question of whether such models are also effective for synthetic control. We conduct a large-scale empirical study of traditional and foundation model approaches on over 300K simulated panels, spanning fully linear to fully nonlinear state-space dynamics, varying noise regimes, and a range of panel sizes and ranks. Comparing three foundation models (TabPFN, TabPFN-TS, and Chronos) against standard SC baselines (Robust SC, Lasso, and Simplex), we identify regimes where foundation models outperform classical baselines, including when latent dynamics are nonlinear and panels are high rank. Linear methods remain competitive or superior in low-rank and near-linear settings, with Simplex providing a reliable baseline across our testbed. These results suggest that foundation models pretrained on synthetic data are a promising direction for synthetic control in challenging regimes, and we release our benchmarks and analysis to support future work.
Submission Number: 151
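
Illustrative sketch (not taken from the paper): the abstract describes classical synthetic control as expressing a target trajectory as a linear combination of donor trajectories, and names a Simplex baseline. One common reading of a simplex-constrained fit is least squares on the pre-treatment period with donor weights that are nonnegative and sum to one. The code below assumes that reading. The function names (`project_to_simplex`, `fit_simplex_weights`), the projected-gradient solver, and the toy Gaussian panel are assumptions made for illustration and are not claimed to match the authors' implementation or benchmark.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u) - 1.0                  # shifted cumulative sums
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index kept positive
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def fit_simplex_weights(donors_pre, target_pre, n_iter=2000):
    """Fit donor weights on the pre-treatment period.

    donors_pre: array of shape (T0, J), one column per donor unit.
    target_pre: array of shape (T0,), the treated unit before treatment.
    Solver: projected gradient descent (an assumed, simple choice).
    """
    n_donors = donors_pre.shape[1]
    w = np.full(n_donors, 1.0 / n_donors)          # start at uniform weights
    lipschitz = np.linalg.norm(donors_pre, 2) ** 2  # squared spectral norm
    for _ in range(n_iter):
        grad = donors_pre.T @ (donors_pre @ w - target_pre)
        w = project_to_simplex(w - grad / lipschitz)
    return w


# Toy panel (hypothetical data): 5 donors, 40 pre and 10 post periods.
rng = np.random.default_rng(0)
donors = rng.normal(size=(50, 5))
true_w = np.array([0.5, 0.3, 0.2, 0.0, 0.0])
true_effect = 2.0

target = donors @ true_w   # noise-free target for clarity
target[40:] += true_effect  # treatment starts at period 40

w_hat = fit_simplex_weights(donors[:40], target[:40])
counterfactual = donors[40:] @ w_hat
effect_hat = float(np.mean(target[40:] - counterfactual))
```

In this noise-free toy setting the fitted weights should recover `true_w` and `effect_hat` should be close to `true_effect`. With noisy or nonlinear panels, which are the regimes the abstract studies, that recovery is not guaranteed.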