Keywords: LLMs, survey research, data quality, representation, measurement
TL;DR: A detailed overview of potential applications of LLMs in survey research, with a discussion of how such applications and LLM idiosyncrasies can improve as well as harm survey data quality in the context of established error types.
Submission Type: Archival
Abstract: The integration of large language models (LLMs) into surveys presents opportunities for mitigating ongoing challenges regarding coverage, sampling, measurement, and nonresponse while making survey research more efficient. However, LLMs can also introduce new challenges. Because LLMs have only recently emerged as a potential tool in the survey methodologist's toolbox, how their use can improve or worsen survey data quality has not been systematically investigated. In this paper, I present an overview of the potential applications of LLMs in survey research and highlight their possible pitfalls. I identify three main roles LLMs can play: they can act as research assistants, interviewers, and respondents, with potential applications in all stages of the survey research process. I also discuss how LLM training, alignment, and model architectures, as well as research design choices, can impair survey data quality. I conclude that LLM-induced errors need to be investigated both methodologically and empirically, and that, unless the ensuing biases are mitigated, humans need to remain in the loop.
Submission Number: 36