Keywords: wisdom of crowds, LLM, multiagent systems
Abstract: In human society, collective decision making has often outperformed the judgment of individuals. Classic examples range from estimating livestock weights to predicting elections and financial markets, where averaging many independent guesses often yields results more accurate than those of experts. These successes arise because groups bring together diverse perspectives, independent voices, and distributed knowledge, so that idiosyncratic errors average out across independent estimates rather than compound. This principle, known as the Wisdom of Crowds, underpins forecasting in domains from finance to politics. Large Language Models (LLMs), however, typically produce a single definitive answer. While effective in many settings, this uniformity overlooks the diversity of human judgments that shapes how people respond to ads, videos, and webpages. Inspired by how societies benefit from diverse opinions, we ask whether LLM predictions can be improved by simulating many diverse answers rather than one. We introduce Social Agents, a multi-agent framework that instantiates a synthetic society of human-like personas with diverse demographic (e.g., age, gender) and psychographic (e.g., values, interests) attributes. Each persona independently appraises a stimulus such as an advertisement, video, or webpage, offering both a quantitative score (e.g., click-through likelihood, recall score, likability) and a qualitative rationale. The set of persona opinions mirrors a real human crowd, and aggregating them yields a single estimate closer to the crowd mean than any individual estimate. Across eleven behavioral prediction tasks, Social Agents outperforms single-LLM baselines by up to 164\% on simple judgments (e.g., webpage likability) and up to 24\% on complex interpretive reasoning (e.g., video memorability), both with GPT-4o as the backbone. Averaged across models, gains reach 30.5\% on low-level and 9.9\% on high-level tasks. 
The individual persona predictions generated by Social Agents also strongly align with human judgments, reaching Pearson correlations up to 0.71. These results position computational crowd simulation as a scalable, interpretable tool for improving behavioral prediction and supporting behavioral and marketing decisions.
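The aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `PersonaOpinion` structure, the example personas, their scores, and the human reference value are all hypothetical, and the arithmetic mean is assumed as the aggregation rule. The sketch shows one property that holds for any set of scores: the absolute error of the averaged estimate never exceeds the average absolute error of the individual estimates. It also includes a plain Pearson correlation, the alignment measure the abstract reports.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import mean


@dataclass(frozen=True)
class PersonaOpinion:
    """One simulated persona's appraisal of a stimulus (hypothetical structure)."""
    persona: str    # short demographic/psychographic description
    score: float    # quantitative judgment, e.g. likability on a 1-10 scale
    rationale: str  # qualitative explanation of the score


def aggregate(opinions):
    """Collapse persona scores into one crowd estimate (arithmetic mean assumed)."""
    if not opinions:
        raise ValueError("need at least one opinion")
    return mean([o.score for o in opinions])


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Made-up opinions about a single webpage; in a real system each score
# would come from an LLM prompted with the persona description.
opinions = [
    PersonaOpinion("student, 22, interested in gaming", 6.0, "Busy but engaging."),
    PersonaOpinion("designer, 35, values aesthetics", 8.0, "Clean typography."),
    PersonaOpinion("retiree, 68, values clarity", 7.0, "Easy to read."),
    PersonaOpinion("engineer, 41, values speed", 3.0, "Too many animations."),
]

human_mean = 6.5  # hypothetical human crowd rating for the same webpage

crowd_estimate = aggregate(opinions)
crowd_error = abs(crowd_estimate - human_mean)
mean_individual_error = mean([abs(o.score - human_mean) for o in opinions])

print(f"crowd estimate: {crowd_estimate}")
print(f"crowd error: {crowd_error}")
print(f"mean individual error: {mean_individual_error}")
```

Note that an individual persona can still land closer to the reference than the aggregate does (the first and third personas above tie with it); the guarantee illustrated here concerns the average individual error only.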
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Submission Number: 16986