The Prompt Is the Analytic Choice: Specification Curve Analysis for LLM-Based Social Science

Published: 25 May 2026, Last Modified: 25 May 2026 | CTB@ICML 2026 | Readers: Everyone | License: CC BY 4.0
Keywords: specification curve analysis, prompt sensitivity, LLM evaluation, variance decomposition, reproducibility, silicon sampling, permutation inference, computational social science, researcher degrees of freedom, ANES
TL;DR: We introduce P-SCA, a specification-curve framework that maps 2,592 LLM prompt variants; partisan signals survive on every ANES item, but question framing dominates gun control variance and a single model×framing combination inverts the result.
Abstract: Large language models are widely used as synthetic survey respondents, yet the prompts that elicit their responses rest on choices of model, persona, framing, system register, temperature, and few-shot count that go undisclosed. This carries the analytic-flexibility problem of the credibility revolution into the elicitation stage. We develop Prompt Specification Curve Analysis (P-SCA), which enumerates defensible prompts, decomposes response variance with $\eta^2$, and tests dimension dominance via the Fisher $r$-to-$z$ transformation. Applied to a 2,592-cell multiverse across six LLMs with 600 specifications on three 2024 ANES items, P-SCA shows that the partisan signal survives on every item ($p < 0.0001$; 95%, 95%, 83% directional consistency), though sensitivity is topic-contingent. Question framing accounts for 2.5 times more variance than any other dimension on gun control ($\eta^2 = 0.160$ versus $0.065$ for model; $z = +2.65$, $p = 0.008$), while model choice dominates on the other two items. Observed coverage exceeds a permutation-derived threshold near 49% by 34 to 46 percentage points, and LLM partisan gaps exceed those in ANES 2024 by a factor of 1.71 to 2.17 on well-posed items (jackknife CIs exclude unity). We propose a Prompt Specification Reporting Standard for LLM-based research.
Paper Type: Long (8 pages)
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 14
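
The abstract's variance decomposition can be illustrated with a minimal sketch, assuming the classical one-way definition $\eta^2 = SS_{\text{between}} / SS_{\text{total}}$ computed separately for each prompt dimension. The paper's actual estimator may differ (for example, a multi-factor or partial $\eta^2$). The function name `eta_squared` and the toy values below are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def eta_squared(values, labels):
    """One-way eta-squared: share of total variance explained by `labels`.

    Assumption: classical SS_between / SS_total; not necessarily the
    estimator used in the paper.
    """
    n = len(values)
    grand_mean = sum(values) / n
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0
    # Group the outcomes by the level of the prompt dimension.
    groups = defaultdict(list)
    for v, g in zip(values, labels):
        groups[g].append(v)
    # Between-group sum of squares, weighted by group size.
    ss_between = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2
        for vs in groups.values()
    )
    return ss_between / ss_total

# Hypothetical toy multiverse: one partisan-gap estimate per specification.
gaps = [0.9, 1.1, 0.2, 0.4, 1.0, 1.2, 0.1, 0.3]
framing = ["a", "a", "b", "b", "a", "a", "b", "b"]
model = ["m1", "m1", "m1", "m1", "m2", "m2", "m2", "m2"]

eta_framing = eta_squared(gaps, framing)
eta_model = eta_squared(gaps, model)
print(f"framing: {eta_framing:.3f}, model: {eta_model:.3f}")
```

In this toy grid the framing dimension explains most of the variance and the model dimension almost none, which is the kind of contrast the abstract reports for the gun control item.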