Position: LLM Social Simulations Are a Promising Research Method

Published: 01 May 2025 · Last Modified: 18 Jun 2025
ICML 2025 Position Paper Track poster · License: CC BY 4.0
TL;DR: LLM social simulations face five key challenges (diversity, bias, sycophancy, alienness, and generalization), and there are promising directions to address each of them.
Abstract: Accurate and verifiable large language model (LLM) simulations of human research subjects promise an accessible data source for understanding human behavior and training new AI systems. However, results to date have been limited, and few social scientists have adopted this method. In this position paper, we argue that the promise of LLM social simulations can be achieved by addressing five tractable challenges. We ground our argument in a review of empirical comparisons between LLMs and human research subjects, commentaries on the topic, and related work. We identify promising directions, including context-rich prompting and fine-tuning with social science datasets. We believe that LLM social simulations can already be used for pilot and exploratory studies, and more widespread use may soon be possible with rapidly advancing LLM capabilities. Researchers should prioritize developing conceptual models and iterative evaluations to make the best use of new AI systems.
Lay Summary: In recent years, artificial intelligence (AI) systems have become much more powerful and humanlike. This has led many researchers to test using AI systems, particularly large language models (LLMs) such as ChatGPT and Gemini, to simulate human research subjects in studies of human behavior. However, many researchers remain skeptical of this approach, and there have been few applications of LLM social simulations beyond initial testing and proof-of-concept work. In this paper, we argue that five challenges (diversity, bias, sycophancy, alienness, and generalization) stand in the way of widespread use of LLM social simulations. These are significant challenges, but we see exciting opportunities for progress on each. Our argument builds on a literature review of studies run to date and related work. We identify promising directions, including context-rich prompting and fine-tuning LLMs with social science datasets. We believe that LLM social simulations can already be used for exploratory research and building new scientific theories. More widespread use across more applications may soon be possible. Researchers should prioritize developing conceptual models—better ways to make sense of these “digital minds”—and evaluations of simulations so that we can track AI capabilities over time. Accurate and verifiable LLM social simulations can help humanity navigate technological and social change, and they can provide data to train safe and beneficial AI systems.
Primary Area: Research Priorities, Methodology, and Evaluation
Keywords: LLM social simulations, sims, agents, machine learning, artificial intelligence, large language models, evaluation, fairness, economics, psychology, sociology
Submission Number: 70