Pancasila-Dilemmas: Evaluating Large Language Models on Indonesian Human Value Dilemmas Grounded in Pancasila

ACL ARR 2026 January Submission3161 Authors

04 Jan 2026 (modified: 20 Mar 2026) | ACL ARR 2026 January Submission | CC BY 4.0
Keywords: LLMs, Indonesian, Human Value, Evaluation, Dataset
Abstract: Value alignment of large language models (LLMs) is crucial for ensuring that responses match human intentions and value preferences. However, most evaluations of value alignment focus on Western or universal values, while assessments grounded in the value systems of specific countries remain scarce. In this paper, we introduce Pancasila-Dilemmas, an evaluation dataset of 1,834 questions derived from Indonesian news and classified by the five values of Pancasila: Religion, Humanity, Unity, Democracy, and Social Justice. The dataset reflects daily life in Indonesia, making it suitable for measuring the value alignment of LLMs deployed for use in Indonesia. To make the evaluation more rigorous, we select scenarios that contain value dilemmas. The dataset is generated with LLMs in a multiple-choice format, with each item consisting of a scenario, a question, and four options, none of which is designated right or wrong, and is proofread by native speakers. Furthermore, we propose Hard-Label and Soft-Label evaluations to capture the uncertainty of the LLMs. We evaluate 40 closed- and open-source LLMs on our dataset. Results reveal that all evaluated LLMs achieve less than 70% agreement with human answers. Further analysis shows that the Religion value is particularly challenging. We also observe instances where LLMs consistently agree with one another yet fail to match human answers. These findings highlight a significant gap in capturing Indonesian values. The data will be publicly released.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: corpus creation, benchmarking, language resources, safety and alignment, values and culture, human-centered evaluation, language/cultural bias analysis
Contribution Types: Model analysis & interpretability, Data resources, Data analysis
Languages Studied: Indonesian
Submission Number: 3161