Towards Defining Deception in Structural Causal Games

Francis Rhys Ward

Published: 05 Dec 2022 | Last Modified: 05 May 2023 | MLSW2022 | Readers: Everyone

Abstract: Deceptive agents are a challenge for the safety, trustworthiness, and cooperation of AI systems. We focus on the problem that agents might deceive in order to achieve their goals. There are a number of existing definitions of deception in the literature on game theory and symbolic AI, but there is no overarching theory of deception for learning agents in games. We introduce a functional definition of deception in structural causal games, grounded in the philosophical literature. We present several examples to establish that our formal definition captures philosophical and commonsense desiderata for deception.
