AREG: Adversarial Resource Extraction Game for Evaluating Persuasion and Resistance in Large Language Models

Adib Sakhawat; Fardeen Sadab; Tamjid Hasan Fahim

Published: 14 Jun 2026, Last Modified: 17 Jun 2026, Venue: ICML 2026 Workshop MusIML Poster, License: CC BY 4.0

Keywords: large language models, persuasion, resistance to persuasion, adversarial dialogue, social influence, interactive evaluation, negotiation game, llm safety, dialogue systems, social engineering, behavioral evaluation, multi-turn interaction

TL;DR: AREG benchmarks persuasion and resistance in LLMs via adversarial financial negotiation. Across 280 games, offensive and defensive abilities were weakly correlated, with incremental persuasion and verification-based defense proving most effective.

Abstract: Evaluating LLM social intelligence requires moving beyond static text toward dynamic interactions. We introduce the Adversarial Resource Extraction Game (AREG), a benchmark operationalizing persuasion and resistance as a multi-turn, zero-sum financial negotiation. A tournament across frontier models reveals that offensive and defensive capabilities are empirically dissociated and weakly correlated ($\rho = 0.33$). While models show a systematic defensive advantage, effectiveness depends heavily on dialogue structure: incremental persuasion outperforms single asks, and verification-seeking defends better than explicit refusal. These findings demonstrate that social influence is not a monolithic capability, highlighting the need for dual-sided evaluation to uncover asymmetric behavioral vulnerabilities.
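The reported $\rho = 0.33$ summarizes how closely per-model offensive and defensive performance track each other. The abstract does not say which coefficient is used, so the following is only a minimal sketch that assumes a Spearman rank correlation. The function names and the scores below are hypothetical placeholders, not the authors' code or data.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-model scores, invented purely for illustration:
# offense = share of resources extracted when persuading,
# defense = share of resources retained when resisting.
offense = [0.10, 0.40, 0.25, 0.60, 0.35]
defense = [0.80, 0.70, 0.90, 0.85, 0.60]

rho = spearman_rho(offense, defense)
print(f"Spearman rho on placeholder data: {rho:.2f}")
```

A value near 0 on real tournament scores would indicate that a model's rank as a persuader says little about its rank as a defender, which is the kind of dissociation the abstract describes.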

Track: Track 2: ML Research by Muslim Authors

Email Sharing: We authorize the sharing of all author emails with Program Chairs.

Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.

Non Archival Confirmation: I understand that submissions to MusIML are non-archival and can be submitted to other venues.

Submission Number: 17
