How well do simulated populations with GPT-4 align with real ones in clinical trials? The case of the EPQR-A personality test
Keywords: LLM, GPT, Eysenck Personality Questionnaire-Revised (EPQR-A)
TL;DR: We test whether GPT-4o can simulate populations for clinical trials in two experiments using the Eysenck Personality Questionnaire-Revised (EPQR-A) in three languages (Spanish, English, and Slovak).
Abstract: In this paper, we test whether GPT-4o can simulate populations for clinical trials. We performed two experiments using the Eysenck Personality Questionnaire-Revised (EPQR-A) in three languages (Spanish, English, and Slovak). Our results show that GPT-4o displays specific personality traits, which may vary with parameter settings and questionnaire language. Furthermore, the question of whether simulated populations that mimic real ones can be created and used for testing questionnaires remains open.
While we find encouraging results for some personality traits and for differences between genders and fields of study, we also observe that the results for the virtual population answering the questionnaire differ from those found in real populations. Accordingly, further research is needed to determine how to reduce the differences between virtual and real populations.
Submission Number: 35