Litmus Tests—Quantifying How LLMs Trade Off Competing Objectives: Economic Decisionmaking as a Case Study
Keywords: large language models, economics, behavioral economics, tradeoff, collusion, litmus tests
TL;DR: We introduce *litmus tests*, evaluations for LLMs that quantify differences in character, values, and tendencies in economic decisionmaking settings.
Abstract: Many key real-world decisions involve tradeoffs with no objectively correct answer. As LLMs are increasingly used in decision-making tasks, it becomes more important to evaluate their behavioral tendencies when faced with such tradeoffs. To this end, we introduce *litmus tests*, a new kind of quantitative measure for LLMs. Litmus tests quantify differences in the character, values, and tendencies of LLMs by considering their behavior when faced with tradeoffs. We construct litmus tests to measure LLM tendencies on each of three fundamental economic tradeoffs (efficiency versus equality, patience versus impatience, and collusiveness versus competitiveness) and find that our litmus tests differentiate LLMs along meaningful dimensions beyond raw capability.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 21349