PERC: Mitigating Ethical Biases in LLMs Through Confucian Golden Rule-Based Reflection

ACL ARR 2025 May Submission 7807 Authors

20 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · CC BY 4.0
Abstract: This study investigated ethical biases in large language models (LLMs) through a systematic evaluation of seven LLMs across four ethical dilemmas and seven protected attributes (“Age”, “Gender”, “Dressing”, “Color”, “Race”, “Look”, “Disability”). Our analysis revealed pervasive deficiencies in ethical sensitivity and high levels of discrimination, particularly for attributes such as “Age” and “Dressing”, exposing systematic biases in LLM decision-making. To address these issues without fine-tuning, we proposed PERC (Perspective-Enhanced Reflection Contemplation), a novel prompt-engineering framework grounded in the Confucian golden rule. PERC employed a dual-phase mechanism, affective perspective-taking followed by reflective deliberation, which significantly improved ethical sensitivity and reduced discrimination in large-scale LLMs. Small-scale models, however, exhibited limited benefits: PERC either failed to improve fairness (Qwen-2.5-14b, GPT-4o-mini) or exposed latent biases (Mistral-Small-3). Our results demonstrated that ethical alignment in LLMs is scale-dependent, requiring sufficient model capacity for effective perspective-taking.
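To make the dual-phase mechanism concrete, the sketch below shows one plausible way such a pipeline could be wired up. This is a minimal illustration, not the paper's implementation: the function name `perc_respond`, the `llm` callable, and the prompt wording are all hypothetical stand-ins for the paper's actual templates, which are not reproduced here.

```python
from typing import Callable

def perc_respond(dilemma: str, llm: Callable[[str], str]) -> str:
    """Answer an ethical dilemma via a PERC-style dual-phase prompt.

    `llm` is any text-in/text-out model call (e.g., a thin wrapper
    around a chat-completions API). The prompt texts are illustrative
    only; the paper's exact templates may differ.
    """
    # Phase 1: affective perspective-taking. Ask the model to imagine
    # being the affected party, in the spirit of the Confucian golden
    # rule ("do not impose on others what you do not wish for yourself").
    perspective_prompt = (
        "Imagine you are the person affected by the following situation. "
        "Describe how you would feel and what treatment you would wish "
        "for yourself:\n\n" + dilemma
    )
    perspective = llm(perspective_prompt)

    # Phase 2: reflective deliberation. Condition the final decision on
    # the phase-1 perspective before answering the dilemma itself.
    reflection_prompt = (
        "You previously took the affected person's perspective:\n"
        f"{perspective}\n\n"
        "Reflecting on that perspective, and without relying on protected "
        "attributes such as age, gender, dressing, color, race, look, or "
        "disability, give your final, fair decision for the dilemma:\n\n"
        + dilemma
    )
    return llm(reflection_prompt)
```

Because the framework operates purely at the prompt level, it requires no gradient updates or access to model weights, which is consistent with the abstract's claim that PERC avoids fine-tuning.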
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: model bias/fairness evaluation, model bias/unfairness mitigation, ethical considerations in NLP applications, reflections and critiques
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Reproduction study
Languages Studied: English
Submission Number: 7807