LLMs and Personalities: Inconsistencies Across Scales

Tosato Tommaso; Mahmood Hegazy; David Lemay; Mohammed Abukalam; Irina Rish; Guillaume Dumas

Published: 10 Oct 2024, Last Modified: 01 Nov 2024
Venue: NeurIPS 2024 Workshop on Behavioral ML
License: CC BY 4.0

Keywords: Personality Assessment, Model Scaling, Persona Prompting, Psychometric Testing, LLMs, BFI, EPQ-R

Abstract: This study investigates the application of human psychometric assessments to large language models (LLMs) to examine their consistency and malleability in exhibiting personality traits. We administered the Big Five Inventory (BFI) and the Eysenck Personality Questionnaire-Revised (EPQ-R) to various LLMs across different model sizes and persona prompts. Our results reveal substantial variability in responses when the question order is shuffled, challenging the notion of a stable LLM "personality." We find that larger models demonstrate more consistent responses across most personas, though this scaling behavior varies significantly by trait and persona type. The assistant persona showed the most predictable scaling patterns, while clinical personas exhibited more variable and sometimes extreme trait expressions. Including conversation history unexpectedly increased response variability. These findings have important implications for understanding how LLM behavior changes across conditions and with model scale.
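The order-shuffling check described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the item list, the `respond` callable (standing in for a model call), the 1-5 Likert range, and the use of population standard deviation as the variability measure are all assumptions made for illustration.

```python
import random
import statistics
from typing import Callable, Dict, List, Tuple

# Hypothetical placeholder items, not real BFI or EPQ-R content:
# (item_id, trait, reverse_keyed)
Item = Tuple[str, str, bool]
ITEMS: List[Item] = [
    ("q1", "extraversion", False),
    ("q2", "extraversion", True),
    ("q3", "neuroticism", False),
    ("q4", "neuroticism", True),
]

LIKERT_MIN, LIKERT_MAX = 1, 5  # assumed response scale

# A responder maps (item_id, position in the questionnaire) to a Likert answer.
# In a real experiment this would wrap a call to the model under test.
Responder = Callable[[str, int], int]


def score_run(order: List[Item], respond: Responder) -> Dict[str, float]:
    """Administer the items in the given order and return mean score per trait."""
    per_trait: Dict[str, List[int]] = {}
    for position, (item_id, trait, reverse) in enumerate(order):
        answer = respond(item_id, position)
        if reverse:
            # Flip reverse-keyed items so higher always means more of the trait.
            answer = LIKERT_MIN + LIKERT_MAX - answer
        per_trait.setdefault(trait, []).append(answer)
    return {trait: statistics.mean(vals) for trait, vals in per_trait.items()}


def order_variability(
    respond: Responder, n_shuffles: int = 20, seed: int = 0
) -> Dict[str, float]:
    """Standard deviation of each trait score across shuffled question orders."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_shuffles):
        order = ITEMS[:]
        rng.shuffle(order)
        runs.append(score_run(order, respond))
    return {t: statistics.pstdev([run[t] for run in runs]) for t in runs[0]}


def stable_responder(item_id: str, position: int) -> int:
    """Toy responder whose answers ignore question order."""
    return 4


def position_sensitive_responder(item_id: str, position: int) -> int:
    """Toy responder whose answers depend only on where the item appears."""
    return LIKERT_MIN + position % LIKERT_MAX


if __name__ == "__main__":
    print("stable:", order_variability(stable_responder))
    print("order-sensitive:", order_variability(position_sensitive_responder))
```

Under these assumptions, a responder that ignores item position yields zero spread for every trait, while a position-dependent responder yields non-zero spread; the abstract's finding corresponds to the second pattern appearing in actual model responses.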

Submission Number: 75
