Non-Cooperative Inverse Reinforcement Learning

Xiangyuan Zhang, Kaiqing Zhang, Erik Miehling, Tamer Basar

06 Sept 2019 (modified: 05 May 2023) · NeurIPS 2019
Abstract: Making decisions in the presence of a strategic opponent requires one to take into account the opponent’s ability to actively mask its intended objective. To describe such strategic situations, we introduce the non-cooperative inverse reinforcement learning (N-CIRL) formalism. The N-CIRL problem consists of two agents with completely misaligned objectives, where only one of the agents knows the true reward function. Formally, we model the N-CIRL problem as a zero-sum Markov game with one-sided incomplete information. Through interacting with the more informed player, the less informed player attempts to infer the true reward function. As a result of the one-sided incomplete information, the multi-stage game can be decomposed into a sequence of single-stage games. This decomposition result serves as a basis for the design of efficient algorithms for computing equilibrium strategies. The N-CIRL formalism has natural applications in cyber-security, where a defender attempts to defend a system without perfect knowledge of the attacker’s intent.
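
For orientation, a minimal sketch of the kind of objective such a game induces is given below; the notation (state $s_t$, informed player's action $a_t$, uninformed player's action $d_t$, reward parameter $\theta$, prior $p_0$, horizon $T$, discount $\gamma$) is illustrative and not taken from the paper.

```latex
% Hedged sketch of a zero-sum Markov game with one-sided incomplete
% information; all symbols are illustrative, not the paper's own notation.
% The reward parameter \theta is drawn from a prior p_0 and revealed only
% to the informed (maximizing) player; the uninformed (minimizing) player
% must act on the public history of play alone.
\[
  V^{\ast}(p_0, s_0) \;=\;
  \max_{\sigma}\;\min_{\tau}\;
  \mathbb{E}_{\theta \sim p_0}\!\left[
    \sum_{t=0}^{T-1} \gamma^{t}\, r_{\theta}(s_t, a_t, d_t)
  \right],
  \qquad
  a_t \sim \sigma(\cdot \mid h_t, \theta), \quad
  d_t \sim \tau(\cdot \mid h_t),
\]
% where h_t = (s_0, a_0, d_0, \ldots, s_t) denotes the public history.
```

Because only one side observes $\theta$, the uninformed player's information can be summarized by a belief over $\theta$ updated from the observed play, which is what permits the decomposition of the multi-stage game into a sequence of single-stage games described in the abstract.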
