Manipulating the Perceived Personality Traits of Language Models

Published: 07 Oct 2023, Last Modified: 01 Dec 2023EMNLP 2023 FindingsEveryoneRevisionsBibTeX
Submission Type: Regular Long Paper
Submission Track: NLP Applications
Keywords: nlp
Abstract: Psychology research has long explored aspects of human personality like extroversion, agreeableness and emotional stability, three of the personality traits that make up the 'Big Five'. Categorizations like the 'Big Five' are commonly used to assess and diagnose personality types. In this work, we explore whether text generated from large language models exhibits consistency in it's perceived 'Big Five' personality traits. For example, is a language model such as GPT2 likely to respond in a consistent way if asked to go out to a party? We also show that when exposed to different types of contexts (such as personality descriptions, or answers to diagnostic questions about personality traits), language models such as BERT and GPT2 consistently identify and mirror personality markers in those contexts. This behavior illustrates an ability to be manipulated in a predictable way (with correlations up to 0.84 between intended and realized changes in personality traits), and frames them as tools for controlling personas in applications such as dialog systems. We contribute two data-sets of personality descriptions of humans subjects.
Submission Number: 3756
Loading