Abstract: Questionnaires are a common method for detecting the personality of Large Language Models (LLMs). However, their reliability is often compromised by two main issues: hallucinations, in which LLMs produce inaccurate or irrelevant responses, and the sensitivity of responses to the order in which options are presented. As a result, personality profiles detected by these methods may be biased. To obtain more reliable results, we propose combining psychological feature analysis with questionnaires. By extracting psychological features from the LLMs' responses, this method mitigates the impact of hallucinations, and by normalizing and merging the scores from both methods, it yields more reliable personality assessments. We conduct experiments on pre-trained language models (PLMs), such as BERT and GPT, and on chat models (ChatLLMs), such as ChatGPT. The results show that LLMs exhibit certain personality traits; for example, ChatGPT and ChatGLM score highly on the 'Conscientiousness' trait. The results further indicate that the personalities of LLMs derive from their pre-training data, and that human preference alignment helps adjust LLM personalities toward the average traits of human personalities. Comparing the results against average human personality scores, we find that GPT-4's personality is the most similar to humans', with a score difference of only 0.05.
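The abstract's combination step (normalizing the questionnaire scores and the feature-analysis scores before merging them) could look roughly like the sketch below. This is a minimal illustration, not the authors' actual procedure: the min-max normalization, the equal weighting, the trait names, and all numeric scores are assumptions added here for clarity.

```python
# Hypothetical sketch of combining questionnaire-based and feature-based
# personality scores by normalizing each onto a common [0, 1] scale and
# averaging per trait. All values and choices here are illustrative.

def min_max_normalize(scores):
    """Rescale a dict of trait scores to the [0, 1] range."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0  # guard against identical scores
    return {trait: (v - lo) / span for trait, v in scores.items()}

def combine(questionnaire, feature_based, weight=0.5):
    """Merge two normalized score sets with a simple weighted average."""
    q = min_max_normalize(questionnaire)
    f = min_max_normalize(feature_based)
    return {trait: weight * q[trait] + (1 - weight) * f[trait] for trait in q}

if __name__ == "__main__":
    # Placeholder inputs: a questionnaire on a 1-5 Likert scale and a
    # feature analysis on an arbitrary model-derived scale.
    questionnaire = {"Openness": 3.8, "Conscientiousness": 4.5,
                     "Extraversion": 3.1, "Agreeableness": 4.0,
                     "Neuroticism": 2.2}
    feature_based = {"Openness": 0.62, "Conscientiousness": 0.88,
                     "Extraversion": 0.41, "Agreeableness": 0.70,
                     "Neuroticism": 0.25}
    print(combine(questionnaire, feature_based))
```

Equal weighting is just one plausible choice; the paper may weight the two sources differently or use another normalization scheme.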
Paper Type: Long
Research Area: Linguistic theories, Cognitive Modeling and Psycholinguistics
Research Area Keywords: cognitive modeling, computational psycholinguistics
Contribution Types: Data analysis, Position papers, Theory
Languages Studied: English, Chinese
Submission Number: 3702