Abstract: Large Language Models (LLMs) are increasingly used in human-centered applications, yet their ability to model diverse psychological constructs is not well understood. In this study, we systematically evaluate the ability of a range of Transformer-based LLMs to predict psychological variables across five major dimensions: affect, substance use, mental health, sociodemographics, and personality. Analyses span three temporal levels (daily text responses, two-week aggregates, and user-level text collected over two years), allowing us to examine how each model’s strengths align with the underlying stability of different constructs. The findings show that mental health signals emerge as the most reliably captured dimension, possibly because people often use detailed, specific language when describing their emotional experiences, which makes these cues easier for models to detect. At the daily scale, context-rich embeddings from DeBERTa and HaRT excel at capturing short-term emotional fluctuations, whereas few-shot Llama3-8B is particularly adept at modeling nuanced substance use behaviors at the two-week interval. Aggregating text over the entire study period yields stronger correlations for sociodemographic factors (e.g., age and income). These results offer actionable insights into the design of LLM-based approaches for psychological assessment, emphasizing the importance of selecting model architectures and temporal aggregation techniques suited to the stability and nature of the target construct.
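To make the embedding-based setup described in the abstract concrete, below is a minimal sketch (not the authors' actual pipeline) of how daily text could be encoded with DeBERTa, aggregated to the user level, and regressed onto a psychological outcome. The input file, column names ("user_id", "text", "score"), and the choice of ridge regression are assumptions made purely for illustration.

```python
# Illustrative sketch only: predict a psychological outcome from mean-pooled
# DeBERTa embeddings aggregated per user. The CSV path and column names
# ("user_id", "text", "score") are hypothetical placeholders.
import pandas as pd
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from scipy.stats import pearsonr

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-base")
model = AutoModel.from_pretrained("microsoft/deberta-v3-base")
model.eval()

def embed(text: str) -> torch.Tensor:
    """Mean-pool the last hidden state into a single text embedding."""
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state   # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)             # (dim,)

df = pd.read_csv("daily_responses.csv")              # hypothetical data file

# Temporal aggregation: average the embeddings of all of a user's texts.
user_vecs = (
    df.groupby("user_id")["text"]
      .apply(lambda texts: torch.stack([embed(t) for t in texts]).mean(dim=0))
)
X = torch.stack(user_vecs.tolist()).numpy()
y = df.groupby("user_id")["score"].mean().loc[user_vecs.index].values

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
reg = Ridge(alpha=1.0).fit(X_tr, y_tr)
r, _ = pearsonr(y_te, reg.predict(X_te))
print(f"Pearson r on held-out users: {r:.3f}")
```

The same skeleton applies at the daily or two-week level by changing the grouping key; the held-out Pearson correlation mirrors the evaluation metric implied in the abstract.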
Paper Type: Long
Research Area: Human-Centered NLP
Research Area Keywords: Psychological States, Psychological Dispositions, Psychological Traits, Human Behavior, Human-Centered NLP, Computational Social Science
Contribution Types: Model analysis & interpretability, Data analysis
Languages Studied: English
Submission Number: 4870