Abstract: Personalized Large Language Models (LLMs) are increasingly used in diverse applications, where they are assigned a specific persona, such as a happy high school teacher, to guide their responses. While prior research has examined how well LLMs adhere to predefined personas in writing style, a comprehensive analysis of consistency across different personas and task types is lacking. In this paper, we introduce a new standardized framework to analyze consistency in persona-assigned LLMs. We define consistency as the extent to which a model maintains coherent responses when assigned the same persona across different tasks and runs. Our framework evaluates personas across four different categories (happiness, occupation, personality, and political stance) spanning multiple task dimensions (survey writing, essay generation, social media post generation, and single-turn and multi-turn conversations). Our findings reveal that consistency is influenced by multiple factors, including the assigned persona, stereotypes, and model design choices. Consistency also varies across models and tasks, increasing with model size within a model family. Finally, we demonstrate the usability of our framework in realistic scenarios where assigned personas do not fit predefined categories. All code is available on GitHub.
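To make the consistency definition above concrete, here is a minimal sketch, not the paper's actual implementation: it assumes a hypothetical `query_model` helper and a `label_fn` that maps a free-text response to a discrete label (e.g., a survey option or stance), and it measures consistency as mean pairwise agreement of that label across repeated runs with the same persona and task.

```python
from itertools import combinations

def pairwise_consistency(labels):
    """Mean pairwise agreement across runs: 1.0 means every run
    produced the same label for the persona-conditioned task."""
    pairs = list(combinations(labels, 2))
    if not pairs:
        return 1.0
    return sum(a == b for a, b in pairs) / len(pairs)

def evaluate_persona(query_model, persona, task_prompt, label_fn, runs=5):
    """Query the same persona/task pair several times and score agreement.

    `query_model`, `label_fn`, and the prompt format are illustrative
    assumptions, not the paper's framework:
      query_model(system_prompt, user_prompt) -> str
      label_fn(response_text) -> hashable label
    """
    system_prompt = f"You are {persona}. Answer in character."
    responses = [query_model(system_prompt, task_prompt) for _ in range(runs)]
    labels = [label_fn(r) for r in responses]
    return pairwise_consistency(labels)
```

Under these assumptions, per-run scores could then be averaged over tasks and personas to compare models, mirroring the cross-task and cross-model analysis the abstract describes.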
Paper Type: Long
Research Area: Computational Social Science and Cultural Analytics
Research Area Keywords: language/cultural bias analysis, sociolinguistics
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 2208