MERMADE: $K$-shot Robust Adaptive Mechanism Design via Model-Based Meta-Learning

Anonymous

22 Sept 2022, 12:36 (modified: 19 Nov 2022, 09:16) · ICLR 2023 Conference Blind Submission · Readers: Everyone
Keywords: Mechanism design, Robustness, Meta-learning, Adaptive agents, Simulation based learning
TL;DR: We propose MERMADE, a deep RL approach to mechanism design that learns a world model together with a meta-learned mechanism that can be quickly adapted to perform well against unseen learning agents at test time.
Abstract: Mechanism design (MD) studies how rules and rewards shape the behavior of intelligent agents, e.g., in auctions or the economy. Simulations with AI agents are powerful tools for MD, but real-world agents may behave and learn differently than simulated agents under a given mechanism. Also, the mechanism designer may not fully observe an agent's learning strategy or rewards, and executing a mechanism may be costly, e.g., enforcing a tax might require extra labor. Hence, it is key to design robust adaptive mechanisms that generalize well to agents with unseen (learning) behavior, are few-shot adaptable, and are cost-efficient. Here, we introduce MERMADE, a model-based meta-learning framework to learn mechanisms that can quickly adapt when facing out-of-distribution agents with different learning strategies and reward functions. First, we show that meta-learning allows adapting to the theoretically known and appropriate Stackelberg equilibrium in a simple matrix game at meta-test time, with few interactions with the agent. Second, with bandit agents, we show empirically that our approach yields strong meta-test time performance against agents with various unseen explore-exploit behaviors. Finally, we outperform baselines that separately use either meta-learning or agent behavior modeling to learn a cost-effective mechanism that is $K$-shot adaptable with only partial information about the agents.
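The abstract's core idea — meta-train a mechanism so that a few inner-loop updates suffice against an unseen agent — can be illustrated with a toy first-order MAML loop. This is a minimal sketch, not MERMADE's actual model-based implementation: the quadratic `agent_loss`, the scalar mechanism parameter `theta`, and the agent "type" drawn from a uniform distribution are all illustrative assumptions standing in for the paper's world model and RL agents.

```python
import numpy as np

rng = np.random.default_rng(0)

def agent_loss(theta, agent_optimum):
    # Designer's cost when deploying mechanism parameter `theta`
    # against an agent whose best response sits at `agent_optimum`.
    return (theta - agent_optimum) ** 2

def grad(theta, agent_optimum):
    # Gradient of the quadratic designer cost w.r.t. theta.
    return 2.0 * (theta - agent_optimum)

def adapt(theta, agent_optimum, K, alpha=0.1):
    # K-shot inner-loop adaptation against a single agent.
    for _ in range(K):
        theta = theta - alpha * grad(theta, agent_optimum)
    return theta

# Meta-training: first-order MAML over a distribution of agent types.
theta_meta, beta, K = 0.0, 0.05, 3
for _ in range(2000):
    task = rng.uniform(-1.0, 1.0)             # sample an agent "type"
    adapted = adapt(theta_meta, task, K)      # inner loop (K shots)
    theta_meta -= beta * grad(adapted, task)  # first-order outer update

# Meta-test: K-shot adaptation to an out-of-distribution agent type.
test_agent = 2.0
before = agent_loss(theta_meta, test_agent)
after = agent_loss(adapt(theta_meta, test_agent, K), test_agent)
print(before, after)  # K-shot adaptation reduces the designer's cost
```

The outer update uses the first-order (gradient-through-adapted-parameters-ignored) approximation for simplicity; the point is only that the meta-trained initialization makes the `K`-step inner loop effective even for an agent type outside the training range.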
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Supplementary Material: zip
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Reinforcement Learning (eg, decision and control, planning, hierarchical RL, robotics)