Keywords: Social simulation, Large language models, Reliability
TL;DR: This paper aims to unveil and improve the reliability of LLMs in social simulation scenarios.
Abstract: Large Language Models (LLMs) are increasingly used for social character simulations, enabling applications in role-playing agents and Computational Social Science (CSS). However, their inherent flaws—such as inconsistencies in simulated roles—raise concerns about their reliability and trustworthiness. In this paper, we systematically investigate these flaws and explore potential solutions. To assess the reliability of LLM-based simulations, we introduce TrustSim, a benchmark dataset covering 10 CSS-related topics. Through experiments on 14 LLMs, we uncover persistent inconsistencies in simulated roles and find that higher general model performance does not necessarily correlate with greater simulation reliability. To mitigate these flaws, we propose Adaptive Learning Rate Based ORPO (AdaORPO), a reinforcement learning-based algorithm that improves simulation consistency across seven LLMs. Our study not only exposes critical weaknesses in LLM-driven social character simulations but also offers a pathway toward more robust and trustworthy simulations, laying the foundation for future advancements in this field.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the COLM Code of Ethics on https://colmweb.org/CoE.html
Author Guide: I certify that this submission complies with the submission instructions as described on https://colmweb.org/AuthorGuide.html
Submission Number: 314