More Parameters Than Populations: A Systematic Review of Large Language Models in Survey Research

Published: 26 Jul 2025, Last Modified: 06 Oct 2025
Venue: NLPOR 2025
License: CC BY 4.0
Keywords: Large Language Models, Generative AI, Survey Science, Survey Research, LLMs
TL;DR: We describe a systematic literature review of the use of LLMs in survey research, including 189 papers published between January 2019 and January 2025.
Submission Type: Non-Archival
Abstract: Large language models (LLMs) are rapidly transforming many professional domains, including survey research. Eloundou et al. (2024) rank survey research among the occupations most highly exposed to LLM-driven automation, raising both opportunities and challenges for practitioners. While survey research has a rich tradition of adopting technological tools for tasks like data collection, analysis, and instrument design, the unique affordances and risks associated with LLMs call for a structured examination. Jansen and colleagues (2023) provide a largely conceptual overview of the potential uses of, and considerations for, incorporating LLMs within the survey research context. Since their work was published, the field has been transitioning from an ideation phase to an implementation phase. This paper expands our understanding of the potential of, and concerns about, this technology within the survey research context by presenting findings from a systematic literature review of empirical and theoretical work at the intersection of LLMs and survey research. Specifically, we synthesize examples of how LLMs are being applied across three broad phases of the survey research pipeline: pre-data collection, data collection, and post-data collection. The pre-data collection phase covers tasks such as questionnaire writing, item generation, translation, sampling design, and recruitment materials and efforts. The data collection phase encompasses AI-assisted interviews, silicon sampling, and similar tasks that produce data. Finally, the post-data collection phase includes data processing tasks, weighting, imputation, summarization, report generation, and dataset curation.
Submission Number: 20