Keywords: silicon samples, synthetic survey responses, large language models (LLMs), fidelity, utility, externality, statistical conclusion validity, reproducibility, transparency, public opinion research
TL;DR: Develop consistent standards and practices to evaluate LLM-generated synthetic survey data
Submission Type: Non-Archival
Abstract: Large language models (LLMs) have spurred interest in using synthetic data for surveys. A growing body of empirical applications suggests a need to apply public opinion research (POR) best practices and standards to the evaluation of such data. To do so, we delineate synthetic data use cases by drawing parallels to survey practices. Next, we advocate an argument-based approach to efficacy, in which a data generation process is evaluated against specific arguments about fidelity, utility, and externality. Finally, we stress the need to critically review methodology, especially statistical conclusion validity (SCV), transparency, and reproducibility. This work in progress aims to facilitate conversations between computer scientists and survey practitioners by creating an evaluation framework. We intend the project outputs to be a collection of open-access, living artifacts and invite others to collaborate.
Submission Number: 15