Minimal Agents, Maximum Bias Insight

Published: 01 Jan 2025, Last Modified: 09 Nov 2025 · HICSS 2025 · CC BY-SA 4.0
Abstract: Language models often struggle to accurately evaluate and mitigate biases across different domains. This limitation stems from their reliance on static, context-agnostic evaluation methods that fail to capture the nuanced, context-dependent nature of bias. Our research introduces a multi-agent framework that uses causal abductive reasoning to address these shortcomings. The framework coordinates specialized agents for contextual coherence, stereotype detection, semantic evaluation, and causal plausibility, which refine their assessments through an adaptive multi-round negotiation and confidence adjustment mechanism. Experimental results show that our framework significantly outperforms existing models in detecting and mitigating biases.
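The abstract's multi-round negotiation with confidence adjustment can be pictured with a minimal sketch. This is an illustrative assumption, not the paper's implementation: agent judgments are stubbed with fixed bias scores, and the agent names, update rule, and parameters (`rounds`, `rate`) are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    name: str          # e.g. "stereotype_detection" (illustrative)
    score: float       # bias score in [0, 1] for the text under review
    confidence: float  # self-reported confidence in [0, 1]

def negotiate(agents, rounds=3, rate=0.5):
    """Each round, agents move toward the confidence-weighted consensus;
    an agent's confidence rises when it already agrees with the group
    and falls when it diverges (a simple stand-in for the paper's
    adaptive confidence adjustment)."""
    for _ in range(rounds):
        total = sum(a.confidence for a in agents)
        consensus = sum(a.score * a.confidence for a in agents) / total
        for a in agents:
            gap = consensus - a.score
            a.score += rate * gap  # negotiate toward the consensus
            # agreement (small |gap|) nudges confidence up, disagreement down
            a.confidence = max(0.05, min(1.0, a.confidence + 0.1 * (0.5 - abs(gap))))
    total = sum(a.confidence for a in agents)
    return sum(a.score * a.confidence for a in agents) / total

# Four agents mirroring the roles named in the abstract, with made-up scores.
agents = [
    Agent("contextual_coherence", 0.30, 0.6),
    Agent("stereotype_detection", 0.80, 0.9),
    Agent("semantic_evaluation", 0.55, 0.7),
    Agent("causal_plausibility", 0.60, 0.5),
]
final_bias = negotiate(agents)
print(round(final_bias, 3))
```

After a few rounds the individual scores contract toward a shared estimate, with confident agents pulling harder on the consensus than uncertain ones.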