Keywords: game-theoretic learning, learning dynamics, mechanism design, algorithmic monoculture, multi-agent learning, strategic AI
TL;DR: Low-regret AI agents can still form a brittle strategic monoculture; we define strategic collision and propose Eco-MD to reduce transferable exploitability in agent populations.
Abstract: Modern AI agents increasingly act in strategic populations rather than as isolated predictors. When many agents share prompts, objectives, training recipes, or tool-use controllers, low individual regret can coexist with deployment-level monoculture: one transferable behavioral surface exposed across the population. We study this Clone Game. A simple separation shows that maximum welfare, zero Nash gap, and zero external regret can coexist with maximal collision and single-probe attack success under a surface-specific exploit model. We formalize this risk through strategic collision, propose ecological stability as a population-level refinement of regret and equilibrium gap, and introduce Ecological Mirror Descent (Eco-MD), a multiplicative-weights update with a population-rarity bonus and an anonymous private niche. We prove that decaying ecological pressure preserves no-regret learning up to an explicit additive term, and that collision is exactly the expected transfer rate for action-surface exploits. Across six stationary matrix-game benchmarks, Eco-MD reduces average attack success from .613 to .472 and collision from .569 to .430 relative to Hedge while retaining 94% of Hedge's welfare. A transparent non-stationary stress test further illustrates lower post-shift probeability. Together, the separation, metric, and learner show that strategic AI deployments should be evaluated as populations, not only as isolated learners.
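Illustrative sketch (not from the paper): The abstract describes collision as the expected transfer rate for action-surface exploits and Eco-MD as a multiplicative-weights update with a population-rarity bonus. The Python sketch below is one possible reading of those two ideas, not the authors' definitions. It assumes that collision is the probability that two distinct agents choose the same action, that the rarity bonus is `lam * (1 - population frequency)`, and that pressure decays as `lam0 / sqrt(t)`. The names `collision`, `eco_md_step`, and `decayed_pressure` are hypothetical, and the anonymous private niche is omitted.

```python
import math


def collision(strategies):
    """Assumed metric: average probability that two distinct agents
    pick the same action, given each agent's mixed strategy."""
    n = len(strategies)
    k = len(strategies[0])
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += sum(strategies[i][a] * strategies[j][a] for a in range(k))
    return total / (n * (n - 1))


def decayed_pressure(lam0, t):
    """Assumed decay schedule for the ecological pressure (t starts at 1)."""
    return lam0 / math.sqrt(t)


def eco_md_step(strategies, payoffs, eta, lam):
    """One multiplicative-weights step per agent. Each action's payoff is
    augmented by an assumed rarity bonus lam * (1 - population frequency).
    With lam = 0 this reduces to a plain Hedge step."""
    n = len(strategies)
    k = len(strategies[0])
    # Population frequency of each action: mean of the agents' mixed strategies.
    freq = [sum(s[a] for s in strategies) / n for a in range(k)]
    updated = []
    for s in strategies:
        weights = [
            s[a] * math.exp(eta * (payoffs[a] + lam * (1.0 - freq[a])))
            for a in range(k)
        ]
        z = sum(weights)
        updated.append([w / z for w in weights])
    return updated


if __name__ == "__main__":
    # Three cloned agents that all favor action 0; payoffs are equal, so a
    # plain Hedge step leaves them unchanged while the bonus shifts mass
    # toward rarer actions.
    clones = [[0.6, 0.3, 0.1] for _ in range(3)]
    payoffs = [1.0, 1.0, 1.0]
    hedge = eco_md_step(clones, payoffs, eta=0.5, lam=0.0)
    eco = eco_md_step(clones, payoffs, eta=0.5, lam=decayed_pressure(1.0, 1))
    print("collision after Hedge step: ", collision(hedge))
    print("collision after Eco-MD step:", collision(eco))
```

In this toy setting a single step with a positive bonus lowers the assumed collision measure relative to the plain Hedge step. It says nothing about the welfare or regret guarantees claimed in the abstract.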
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Paper Type: Standard paper
Submission Number: 18