Position: The Identification Crisis in LLM Social Simulation is a Rank-Constrained Mechanism Decomposition Problem

09 May 2026 (modified: 09 May 2026) | ICML 2026 Workshop CoLoRAI Submission | Readers: Everyone | CC BY 4.0
Keywords: LLM social simulation, identification, rank decomposition, observational equivalence, low-rank representations, mechanism decomposition, position paper
TL;DR: LLM social simulation outputs are a rank-3 mechanism superposition (training prior + prompt compliance + emergence); reported predictive fidelity is rank-1; the field needs rank-restoring design contrasts to identify mechanisms.
Abstract: This position paper argues that the current evidentiary crisis in LLM-based social simulation is most precisely understood as a rank-constrained decomposition problem: the observed outcome of a simulation is a superposition of contributions from training-prior retrieval, prompt-induced role compliance, and genuine interactional emergence, and the published evidence is rank-deficient with respect to separating these components. Recent work reports high predictive fidelity (r = 0.85 across 476 effects), while a parallel critical literature shows that synthetic respondents fail regression, prompt-sensitivity, and temporal-stability tests. We frame this disagreement as a non-uniqueness phenomenon: without structural rank constraints on the joint mechanism representation, multiple decompositions of the same observation matrix are equally consistent with the data, just as an unconstrained matrix factorization admits infinitely many factor pairs. The fix is conceptual before it is technical: simulations become evidence only when the design imposes enough rank structure (placebo conditions, ablation grids, cross-model factors) that the mechanism factors become uniquely separable. We audit the principal LLM social-simulation literature through this rank-constrained decomposition lens, formalize a six-item identification checklist as a set of rank-restoring design moves, and argue that the low-rank-representations community is uniquely positioned to give this evidentiary problem the formal apparatus it currently lacks.
Submission Number: 144
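
The non-uniqueness claim in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's method: the matrix shapes, the names `A`, `B`, `G`, and the choice of a fixed invertible mixing matrix are all assumptions made for illustration. The sketch builds a rank-3 observation matrix from three hypothetical mechanism factors and shows that a second, different factor pair reproduces exactly the same observations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 20 simulated conditions, 3 mechanisms, 12 measured outcomes.
n_conditions, n_mechanisms, n_outcomes = 20, 3, 12

# A: how strongly each mechanism (prior, compliance, emergence) loads on each condition.
# B: how each mechanism maps to each measured outcome.
A = rng.normal(size=(n_conditions, n_mechanisms))
B = rng.normal(size=(n_mechanisms, n_outcomes))
X = A @ B  # the only thing an observer sees

# Any invertible G yields another factor pair with the same product.
# This G is upper triangular with unit diagonal, so its determinant is 1.
G = np.array([[1.0, 0.5, 0.0],
              [0.0, 1.0, 0.5],
              [0.0, 0.0, 1.0]])
A_alt = A @ G
B_alt = np.linalg.solve(G, B)  # equals inv(G) @ B
X_alt = A_alt @ B_alt

same_data = bool(np.allclose(X, X_alt))          # observations are indistinguishable
different_factors = not np.allclose(A, A_alt)    # but the "mechanisms" differ
observed_rank = int(np.linalg.matrix_rank(X))    # rank of the observation matrix

print(same_data, different_factors, observed_rank)
```

In this toy setting the data fix only the three-dimensional subspace spanned by the factors, not the factors themselves. Extra structure, such as design conditions that are known to switch a mechanism off, would be needed to restrict the admissible `G` and pin down a single decomposition.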