Abstract: Large Language Models (LLMs) exhibit human-like cognitive patterns when assessed under four established frameworks from psychology: the Thematic Apperception Test (TAT), Framing Bias, Moral Foundations Theory (MFT), and Cognitive Dissonance. We evaluated several proprietary and open-source models using structured prompts and automated scoring. Our findings reveal that these models often produce coherent narratives, show susceptibility to positive framing, exhibit moral judgments aligned with Liberty/Oppression concerns, and display self-contradictions tempered by extensive rationalization. Such behaviors mirror human cognitive tendencies yet are shaped by the models' training data and alignment methods. We discuss the implications for AI transparency, ethical deployment, and future work bridging cognitive psychology and AI safety.
Paper Type: Long
Research Area: Linguistic theories, Cognitive Modeling and Psycholinguistics
Research Area Keywords: Cognitive Science, Machine Psychology, LLM Evaluations
Contribution Types: Model analysis & interpretability, Data analysis
Languages Studied: English
Reassignment Request Area Chair: This is not a resubmission
Reassignment Request Reviewers: This is not a resubmission
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: No
A2 Elaboration: While our work explores ethical and cognitive aspects of LLM behavior, we did not explicitly frame these observations as risks. Future iterations could foreground potential risks explicitly, such as the misinterpretation of psychological alignment or the misuse of cognitive profiling in real-world applications.
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: Sections 2 and 4
B2 Discuss The License For Artifacts: No
B2 Elaboration: We did not explicitly discuss licenses or terms of use for the artifacts used. However, all models (e.g., GPT-4o, LLaMA, Mixtral, DeepSeek) and datasets referenced are publicly available and were used under their respective open-access or research licenses, in compliance with their terms of use.
B3 Artifact Use Consistent With Intended Use: Yes
B3 Elaboration: All artifacts used in this study, including GPT-4o, LLaMA 3, Mixtral, and DeepSeek V3, were accessed through their official channels and evaluated strictly for research purposes, consistent with their intended use as specified in their respective documentation. No artifacts were repurposed outside academic or non-commercial research contexts. Our usage complies with the access terms and intended use policies of each provider.
B4 Data Contains Personally Identifying Info Or Offensive Content: No
B4 Elaboration: We collected a subset of human responses through an anonymized questionnaire to establish a baseline for moral judgments. No names, contact details, IP addresses, or identifying metadata were recorded. All data was gathered and analyzed in accordance with standard ethical practices to ensure respondent anonymity and prevent the inclusion of personally identifiable or offensive content.
B5 Documentation Of Artifacts: N/A
B6 Statistics For Data: Yes
B6 Elaboration: Section 5
C Computational Experiments: Yes
C1 Model Size And Budget: N/A
C2 Experimental Setup And Hyperparameters: N/A
C3 Descriptive Statistics: Yes
C3 Elaboration: Section 5
C4 Parameters For Packages: N/A
D Human Subjects Including Annotators: Yes
D1 Instructions Given To Participants: No
D1 Elaboration: We did not include the full text of the instructions provided to participants, as the human data collected was minimal, anonymized, and used only to establish a basic moral judgment baseline (Section 4.4). The task involved straightforward Likert-scale responses to moral questions adapted from the established Moral Foundations Questionnaire (MFQ). No sensitive or high-risk prompts were included, and no disclosures or disclaimers were necessary beyond general academic context-setting.
D2 Recruitment And Payment: No
D2 Elaboration: We did not provide compensation to participants, as recruitment was conducted on a voluntary basis within our academic environment for low-risk, non-commercial research (Section 4.4). The task involved brief questionnaire responses without sensitive content, and participants were aware that their input would be used anonymously for academic purposes only.
D3 Data Consent: Yes
D3 Elaboration: Section 4.4 (Moral Foundations Theory) — Participants were informed that their responses would be used solely for academic research and that no personally identifying information would be collected. Consent was obtained implicitly through voluntary participation in the anonymized questionnaire. No sensitive or high-risk data was involved.
D4 Ethics Review Board Approval: No
D4 Elaboration: The data collection involved brief, anonymized responses to low-risk moral judgment questions and did not include personal, sensitive, or identifying information. As such, it did not fall under protocols requiring formal ethics review or IRB approval within our academic context.
D5 Characteristics Of Annotators: No
D5 Elaboration: We did not report demographic or geographic characteristics of the annotators. The participants were voluntarily recruited from an academic environment and their responses were collected anonymously, without recording age, gender, or location, to ensure privacy and maintain minimal data collection in line with low-risk research practices.
E Ai Assistants In Research Or Writing: Yes
E1 Information About Use Of Ai Assistants: No
E1 Elaboration: While AI assistance (e.g., ChatGPT) was used in the writing and editing process—for help with phrasing, organization, and LaTeX troubleshooting—we did not include this information in the paper itself. The use was limited to surface-level drafting support and did not influence the research methodology, results, or analysis.
Author Submission Checklist: Yes
Submission Number: 778