Keywords: silicon samples, synthetic survey responses, large language models (LLMs), fidelity, utility, externality, statistical conclusion validity, reproducibility, transparency, public opinion research
TL;DR: Develop consistent standards and practices to evaluate LLM-generated synthetic survey data
Submission Type: Non-Archival
Abstract: Large language models (LLMs) have spurred interest in using synthetic data for surveys. A growing body of empirical applications suggests a need to apply public opinion research (POR) best practices and standards to the evaluation of such data. To do so, we delineate synthetic data use cases by drawing parallels to survey practices. Next, we advocate an argument-based approach to efficacy, in which a data generation process is evaluated against specific arguments about fidelity, utility, and externality. Finally, we stress the need to critically review methodology, especially statistical conclusion validity (SCV), transparency, and reproducibility. This work in progress aims to facilitate conversations between computer scientists and survey practitioners by creating an evaluation framework. We intend the project outputs to be a collection of open-access, living artifacts and invite others to collaborate.
Submission Number: 15