RITUAL: REALISTIC INTERACTIVE TESTS FOR UNCOVERING ALTRUISM IN LLMS

20 Sept 2025 (modified: 28 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: altruism, large language models, prosocial behavior, alignment, benchmark, game theory, cooperation, social decision-making, prompt engineering, fine-tuning
TL;DR: RITUAL is the first benchmark to test and improve altruism in LLMs, revealing that prosocial behavior is context-dependent but steerable with prompts and fine-tuning
Abstract: Current methods for evaluating altruism in large language models (LLMs) are insufficient, often relying on single game-theoretic scenarios that fail to capture the complex, context-dependent nature of prosocial behavior. As LLMs are increasingly deployed in personal and corporate settings, their tendency toward self-serving actions poses a significant problem for alignment with human values. Yet, no comprehensive benchmark currently exists to quantitatively measure altruism in LLMs. We introduce RITUAL (Realistic Interactive Tests for Uncovering Altruism in LLMs), a novel benchmark that evaluates altruistic behavior across a diverse set of game-theoretic scenarios, including the Prisoner’s Dilemma, congestion games, and the Dictator game. Unlike prior approaches, RITUAL employs one or more mathematical indices per game (such as cooperation frequency, sacrifice ratio, and social welfare weighting), enabling a multidimensional assessment of altruism. Beyond evaluation, we explore two methods to enhance altruistic behavior: prompt engineering and supervised fine-tuning. Our findings show that LLMs do not exhibit a uniform form of altruism; instead, their prosocial tendencies are highly scenario-dependent. No single model consistently outperforms others across all tasks, but targeted interventions significantly improve altruistic behavior in most cases. These results underscore the need for multi-index evaluation to capture the richness of LLMs’ social decision-making and offer a practical path toward developing more reliably altruistic AI systems.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 23125
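
Illustrative sketch: the abstract names per-game indices such as cooperation frequency and sacrifice ratio but does not define them on this page. The Python below shows one plausible reading of those two indices, not the authors' definitions. The function names, the "C"/"D" action encoding, and the Dictator-game formula (amount given divided by endowment) are all assumptions made for illustration.

```python
from typing import Sequence


def cooperation_frequency(actions: Sequence[str]) -> float:
    """Fraction of rounds in which the agent cooperated.

    Assumption: actions are encoded as "C" (cooperate) or "D" (defect),
    as in a repeated Prisoner's Dilemma. This encoding is hypothetical.
    """
    if not actions:
        raise ValueError("need at least one round")
    return sum(a == "C" for a in actions) / len(actions)


def sacrifice_ratio(endowment: float, given: float) -> float:
    """Share of the endowment the dictator gives away in a Dictator game.

    Assumption: the ratio is given / endowment; the paper may define
    its sacrifice ratio differently.
    """
    if endowment <= 0:
        raise ValueError("endowment must be positive")
    if not 0 <= given <= endowment:
        raise ValueError("given must lie between 0 and the endowment")
    return given / endowment


if __name__ == "__main__":
    # Four rounds with three cooperative moves.
    print(cooperation_frequency(["C", "D", "C", "C"]))
    # The dictator gives 3 out of an endowment of 10.
    print(sacrifice_ratio(10, 3))
```

Both indices are normalized to the range 0 to 1, so under these assumed definitions scores from different games could be reported side by side, in the spirit of the multi-index assessment the abstract describes.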